Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: question on multibyte character storage

RE: question on multibyte character storage

From: Faan DeSwardt <Faan_D1_at_VERIFONE.com>
Date: Tue, 14 Oct 2003 13:04:24 -0800
Message-ID: <F001.005D3158.20031014130424@fatcity.com>


Elain,

If you are using Oracle9i then I would recommend that you change to character semantics vs. byte semantics so that Oracle will take care of this and make your life, and especially the life of your developers, much easier. It will also ensure that many "logical bugs" that are only discovered at runtime are avoided to a great extent, which are not easy to detect in early application testing. Check the details at URL http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96529/ch2.htm#104327.

Basically, when using character semantics the variable defined of type VARCHAR2(1) will store any one character, irrespective of how many bytes it takes, even if it takes 4 bytes in the extreme case. I will also strongly advise not using UTF8 but recommend using the more recent Oracle implementation of the UTF 3.1 standard called AL32UTF8. This character set can store more characters than UTF8 in 4 bytes that would in some cases require UTF8 to use 6 bytes i.e. use the space of 2 characters in the storage structure to represent one character. Note that this only happens in very rare cases and the reason why it was not initially implemented in the UTF standard 3.0.

One exception will be when you perform extensive string comparisons and/or are primarily storing text of Asian languages then definitely consider using the character set AL16UTF16.

As usual I can highly recommend the friendly manual at URL http://download-west.oracle.com/docs/cd/B10501_01/server.920/a96529/toc.htm which gives a good overview of the concepts and features available for globalization support in Oracle9i.

Good Luck!

-f

-----Original Message-----

Sent: Tuesday, October 14, 2003 12:29 PM To: Multiple recipients of list ORACLE-L

Hi,
We are planning on storing 2-byte characterset in our UTF8 database. If I store a single two-byte character in a varchar2 column, do I need to declare the column as VARCHAR2(1) or VARCHAR2(2)? As far as I know since it's a UTF8 database, a VARCHAR2(1) should be sufficient, right?

Thanks.

elain



See when your friends are online with MSN Messenger 6.0. Download it now FREE! http://msnmessenger-download.com

--

Please see the official ORACLE-L FAQ: http://www.orafaq.net
--

Author: elain he
  INET: elainhe_at_hotmail.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services

---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).

--

Please see the official ORACLE-L FAQ: http://www.orafaq.net
--

Author: Faan DeSwardt
  INET: Faan_D1_at_VERIFONE.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services

---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing). Received on Tue Oct 14 2003 - 16:04:24 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US