Oracle FAQ Your Portal to the Oracle Knowledge Grid

Re: UTF-8 question

From: Frank van Bortel <frank.van.bortel_at_gmail.com>
Date: Sun, 11 Dec 2005 15:58:35 +0100
Message-ID: <dnhe7d$cvs$1@news2.zwoll1.ov.home.nl>


lsllcm wrote:
> I have an Oracle test server 10.2.0.1 on Windows 2000, and created a db with
> the UTF-8 charset.
>
> CREATE TABLE TEST (C1 VARCHAR2(20);
That's not how you created the table...
It is not a valid statement: the closing parenthesis is missing!

>
> INSERT INTO TEST VALUES ('豐'); --BIG5 traditional Chinese font
>
> INSERT INTO TEST VALUES ('榮');
>
> INSERT INTO TEST VALUES ('张三'); --simplified Chinese font
>
> SQL> select dump(c1,1016) from test;
>
> DUMP(C1,1016)
> -------------------------------------------------------------------
>
> Typ=1 Len=2 CharacterSet=AL32UTF8: d8,53
> Typ=1 Len=2 CharacterSet=AL32UTF8: 98,73
> Typ=1 Len=4 CharacterSet=AL32UTF8: d5,c5,c8,fd
>
> SQL> select length(c1) from test;
>
> LENGTH(C1)
> -----------
> 1
> 2
> 2
> 2
>
> SQL> select lengthb(c1) from test;
>
> LENGTHB(C1)
> -----------
> 2
> 2
> 4
> 2
>
> 1. lengthb and length return different results.

Yes - so? They are different functions.

> 2. The first and second rows each take two bytes, but the length
> function returns 1 and 2 for them, which is strange. Can anyone help
> with this?
>
> Thanks in advance!
>

AL32UTF8 is a variable-width character set, not a fixed-width one: a single character can be represented by 1, 2, 3 or even 4 bytes. Lengthb returns the length in bytes rather than in characters, so it shows you exactly that, as documented:
http://download-uk.oracle.com/docs/cd/B19306_01/server.102/b14200/functions076.htm#sthref1446
Also, when using a UTF character set, use lengthc, not length.
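To illustrate the character-versus-byte distinction outside the database, here is a minimal Python sketch. It is not Oracle code: the helper name char_and_byte_lengths and the sample strings are made up for illustration, and Python's len() on a string versus on its UTF-8 encoding is only a rough analogue of a character count versus lengthb.

```python
# Rough analogue of character length vs. byte length, using UTF-8.
# char_and_byte_lengths is a hypothetical helper, not an Oracle function.

def char_and_byte_lengths(text, encoding="utf-8"):
    """Return (number of code points, number of encoded bytes) for text."""
    return len(text), len(text.encode(encoding))

# One sample per UTF-8 width: 1, 2, 3 and 4 bytes per character,
# plus the two-character string from the original post.
samples = ["A", "\u00e9", "豐", "张三", "\U0001F600"]

for s in samples:
    chars, nbytes = char_and_byte_lengths(s)
    print(f"{s!r}: {chars} character(s), {nbytes} byte(s) in UTF-8")
```

Each of the Chinese characters here takes 3 bytes in UTF-8. That is longer than the 2 bytes per character in the dump quoted above, which may mean the data was stored without conversion to UTF-8. That is an inference from the byte counts, not something established in this thread.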

-- 
Regards,
Frank van Bortel

Top-posting is one way to shut me up...
Received on Sun Dec 11 2005 - 08:58:35 CST
