Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
![]() |
![]() |
Home -> Community -> Usenet -> c.d.o.server -> Re: Storage requirements of NCHAR columns
Ross wrote:
> Anurag Varma wrote:
> > Kenneth wrote:
> > > On 10 Jul 2006 09:59:49 -0700, "Ross" <rossfreemantle_at_yahoo.co.uk>
> > > wrote:
> > >
> > > >
> > > >"When you use NCHAR and NVARCHAR2 datatypes for storing multilingual
> > > >data, the
> > > >column size specified for a column is defined in number of characters.
> > > >(The number of
> > > >characters means the number of Unicode code units.)"
> > > >
> > > >This sould seem to suggest that the a NCHAR(30) column would actually
> > > >require 30 bytes of storage (as UTF8 has a single-byte code unit).
> > > >However, an example in Chapter 7 explicitly states that 90 bytes are
> > > >required. I don't think the NLS_LENGTH_SEMANTICS parameter affects
> > > >NCHAR, so which is correct?
> > > >
> > >
> > > UTF8 is not a single-byte charset. It is a varying-width charset.
> > >
> > > If you defines a NCHAR(30) column, it will contain 30 characters
> > > (unless NULL). Always. Period.
> > >
> > > How many *bytes* of storage does this column require ?
> > >
> > > That depends on which kind of characters you store in it. If you only
> > > store ASCII-characters, each of them will require one byte, because
> > > UTF8 is a superset of ASCII, ASCII being in the range 0-255.
> > >
> > > If you store characters beyond ASCII, between 2 and 4 bytes of storage
> > > is needed, depending on what position these characters have in the
> > > UNICODE range, which is 0 - (2^32-1).
> > >
> > > So a minimum of 30 bytes is required. And a maximum of 30*4 = 120
> > > bytes, that if all your 30 characters are in the upper 2^8 end of the
> > > UNICODE range.
> > >
> > > Conclusion : Between 30 and 120 bytes are required for a NCHAR(30)
> > > column with UTF8 as NCHARSET.
> > >
> > > - Kenneth Koenraadt
> >
> > UTF8 would require between 1 to 3 bytes. surrogate pairs would use 6
> > bytes in UTF8.
> > AL32UTF8 which is the "real" UTF8 would use between 1 to 4 bytes.
> >
> > When a nchar column of length N is defined (in al16UTF16). The maximum
> > bytes you can
> > enter in that column is 2 * N.
> > If NCHAR is of UTF8 type, then maximum bytes you can enter in that
> > column
> > is 3 * N.
> >
> > The hard limit of 4000 bytes still applies. So a 4000 CHAR column does
> > not necessarily
> > mean it will hold 4000 characters (in utf8 or al16utf16).
> >
> > Metalink Note: 144808.1 provides some info on this ...
> >
> > Anurag
>
? Are you interpreting it differently? nchar(30) WOULD contain 30
characters
even if they were multibyte. The actual byte length is adjusted to
ensure that.
Anurag Received on Tue Jul 11 2006 - 07:11:14 CDT
![]() |
![]() |