Re: Difference between VARCHAR2(xx CHAR) and NVARCHAR2(xx)

From: Álvaro G. Vicario <alvaro.NOSPAMTHANX_at_demogracia.com.invalid>
Date: Mon, 27 Dec 2010 16:58:25 +0100
Message-ID: <ifad33$o7b$1_at_news.eternal-september.org>



El 22/12/2010 14:18, ddf escribió/wrote:
> On Dec 22, 6:54 am, "Álvaro G. Vicario"
> <alvaro.NOSPAMTH..._at_demogracia.com.invalid> wrote:
>> I've just installed Oracle XE (Oracle Database 10g Express Edition
>> Universal). I've changed nothing so it's the using the character sets
>> specified in the default settings:
>>
>> SELECT * FROM NLS_DATABASE_PARAMETERS;
>>
>> NLS_CHARACTERSET AL32UTF8
>> NLS_NCHAR_CHARACTERSET AL16UTF16
>>
>> Given that both CHAR and NCHAR data types seem to accept multi-byte
>> strings (UTF-8 and UTF-16), what is the exact difference between these
>> two column definitions?
>>
>> VARCHAR2(10 CHAR)
>> NVARCHAR2(10)
>>
>> I've found similar questions in many forums but they always explain CHAR
>> vs VARCHAR or mention that NVARCHAR2 accepts multi-byte strings.
>>
>> My educated guess is VARCHAR2 is a legacy type that did not accept
>> multi-byte, NVARCHAR2 was added, then VARCHAR2 was enhanced to support
>> multi-byte... And also VARCHAR2 depends on the server version to be able
>> to store multi-byte while NVARCHAR2 is always available.
>>
>> Am I right?

> I can't say whether you're 'right' or not, but I do know this:
>
> UTF-16: Each character is either 2 or 4 bytes long.
>
> UTF-8: Each character takes 1 to 4 bytes to store.
>
> Thus any character in a UTF-16 implementation (your NVARCHAR2 type)
> will occupy no less than 2 bytes of storage, where your VARCHAR2
> characters (AL32UTF8) can be stored in as little as 1 byte which may
> not be important to you now but could become so in the future. Both
> can support multiple languages without character loss however the
> database character set is also used to identify SQL and PL/SQL source
> code. In order to do this, it must have either EBCDIC or 7-bit ASCII
> as a subset, whichever is native to the platform; it is not possible
> to use a fixed-width, multibyte character set as the database
> character set such as AL16UTF16. Thus your AL32UTF8 character set
> allows for 7-bit ASCII characters and doesn't interfere with object
> names, user names, etc.

I'm not sure I understand this. I assume you are talking about using object names that contain international chars, such as:

        CREATE TABLE "año" (...)

:-?

> Also the N-types (NVARCHAR2, NCHAR, NCLOB)
> will accept and stored multi-byte characters regardless of the
> database character set in use making it possible to support single-
> byte and multi-byte languages.

It makes sense.

Thank you!

-- 
-- http://alvaro.es - Álvaro G. Vicario - Burgos, Spain
-- Mi sitio sobre programación web: http://borrame.com
-- Mi web de humor satinado: http://www.demogracia.com
--
Received on Mon Dec 27 2010 - 09:58:25 CST

Original text of this message