Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Storage requirements of NCHAR columns

Re: Storage requirements of NCHAR columns

From: Ross <rossfreemantle_at_yahoo.co.uk>
Date: 11 Jul 2006 02:48:51 -0700
Message-ID: <1152611331.144641.108020@35g2000cwc.googlegroups.com>

Anurag Varma wrote:
> Kenneth wrote:
> > On 10 Jul 2006 09:59:49 -0700, "Ross" <rossfreemantle_at_yahoo.co.uk>
> > wrote:
> >
> > >
> > >"When you use NCHAR and NVARCHAR2 datatypes for storing multilingual
> > >data, the
> > >column size specified for a column is defined in number of characters.
> > >(The number of
> > >characters means the number of Unicode code units.)"
> > >
> > >This sould seem to suggest that the a NCHAR(30) column would actually
> > >require 30 bytes of storage (as UTF8 has a single-byte code unit).
> > >However, an example in Chapter 7 explicitly states that 90 bytes are
> > >required. I don't think the NLS_LENGTH_SEMANTICS parameter affects
> > >NCHAR, so which is correct?
> > >
> >
> > UTF8 is not a single-byte charset. It is a varying-width charset.
> >
> > If you defines a NCHAR(30) column, it will contain 30 characters
> > (unless NULL). Always. Period.
> >
> > How many *bytes* of storage does this column require ?
> >
> > That depends on which kind of characters you store in it. If you only
> > store ASCII-characters, each of them will require one byte, because
> > UTF8 is a superset of ASCII, ASCII being in the range 0-255.
> >
> > If you store characters beyond ASCII, between 2 and 4 bytes of storage
> > is needed, depending on what position these characters have in the
> > UNICODE range, which is 0 - (2^32-1).
> >
> > So a minimum of 30 bytes is required. And a maximum of 30*4 = 120
> > bytes, that if all your 30 characters are in the upper 2^8 end of the
> > UNICODE range.
> >
> > Conclusion : Between 30 and 120 bytes are required for a NCHAR(30)
> > column with UTF8 as NCHARSET.
> >
> > - Kenneth Koenraadt

>

> UTF8 would require between 1 to 3 bytes. surrogate pairs would use 6
> bytes in UTF8.
> AL32UTF8 which is the "real" UTF8 would use between 1 to 4 bytes.
>

> When a nchar column of length N is defined (in al16UTF16). The maximum
> bytes you can
> enter in that column is 2 * N.
> If NCHAR is of UTF8 type, then maximum bytes you can enter in that
> column
> is 3 * N.
>

> The hard limit of 4000 bytes still applies. So a 4000 CHAR column does
> not necessarily
> mean it will hold 4000 characters (in utf8 or al16utf16).
>

> Metalink Note: 144808.1 provides some info on this ...
>
> Anurag

This would seem to tally with what is suggested in the docs. If it's correct, then an NCHAR(30) column could only contain 30 characters if they were all single-byte. This, however, contradicts the earlier posts that suggest it could contain 30 characters regardless of their individual encoding. Received on Tue Jul 11 2006 - 04:48:51 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US