Oracle Bug? RPAD of Japanese (kanji) character in Oracle 10gR2 UTF8 database
Date: Fri, 16 Jan 2009 10:23:41 -0800 (PST)
Message-ID: <e626ed27-81de-44a9-8d4a-f3e8d017c093_at_z27g2000prd.googlegroups.com>
I've searched Metalink and not seen a mention, but I want to make sure I'm not missing anything obvious before calling this a bug. 15121570 (0xE6BCA2) is the decimal value of the UTF-8 byte sequence for the kanji character '漢' (U+6F22), so CHR(15121570) returns that character in a UTF8 database. RPAD does not appear to pad it out to a full 4 characters in the example below:
SELECT RPAD(CHR(15121570),4,'*') FROM DUAL;
RESULT:
漢**
A LENGTH() function reveals that the RPAD is only creating a string with 3 characters:
SELECT LENGTH(RPAD(CHR(15121570),4,'*')) FROM DUAL;
RESULT:
3
The same logic against a different multi-byte UTF8 character, the ellipsis '…' (U+2026), shows the expected behavior:
SELECT RPAD(CHR(14844070),4,'*') FROM DUAL;
RESULT:
…***
SELECT LENGTH(RPAD(CHR(14844070),4,'*')) FROM DUAL;
RESULT:
4
(Note: I'm using '*' to make the RPAD functionality more visible; the
same behavior occurs with the default blank-space padding, e.g.
RPAD(CHR(15121570),4).)
Verifying that the database character set is UTF8:
select * from nls_database_parameters where parameter = 'NLS_CHARACTERSET';
RESULT:
NLS_CHARACTERSET UTF8
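As a cross-check outside the database, here is a minimal Python sketch (not part of my Oracle session) that decodes the two CHR arguments. It assumes that CHR in a UTF8 database treats its argument as the big-endian byte sequence of the character; the helper names decode_chr_argument and describe are my own. It also reports each character's Unicode East Asian Width class, which may or may not be relevant to how RPAD counts length:

```python
import unicodedata


def decode_chr_argument(value: int) -> str:
    """Interpret an integer as big-endian UTF-8 bytes and decode it.

    Assumption: this mirrors how CHR(n) builds a multi-byte character
    in a UTF8 database.
    """
    raw = value.to_bytes((value.bit_length() + 7) // 8, "big")
    return raw.decode("utf-8")


def describe(value: int) -> tuple[str, str, int, str]:
    """Return (character, code point, UTF-8 byte count, East Asian Width)."""
    ch = decode_chr_argument(value)
    return (
        ch,
        f"U+{ord(ch):04X}",
        len(ch.encode("utf-8")),
        unicodedata.east_asian_width(ch),
    )


if __name__ == "__main__":
    # The two CHR arguments used in the queries above.
    for value in (15121570, 14844070):
        print(value, describe(value))
```

Both characters are three bytes long in UTF-8, so byte length alone does not distinguish them. Their East Asian Width classes do differ (W, wide, for the kanji; A, ambiguous, for the ellipsis). Whether RPAD takes display width into account is something I would still need to confirm against the RPAD documentation.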
So, the question - does anyone see any obvious oversight on my part,
or should I consider this a "probable bug" in need of a TAR?
Received on Fri Jan 16 2009 - 19:23:41 CET