Re: bytes vs chars

From: Hans Forbrich <fuzzy.graybeard_at_gmail.com>
Date: Fri, 11 Mar 2016 10:14:27 -0700
Message-ID: <56E2FCF3.6070100_at_gmail.com>



No. You are interpreting that parameter incorrectly.

http://docs.oracle.com/database/121/REFRN/GUID-221B0A5E-A17A-4CBC-8309-3A79508466F9.htm#REFRN10124

http://docs.oracle.com/database/121/NLSPG/ch2charset.htm#NLSPG170

Length semantics = BYTE says: when I define a column as "col-name VARCHAR2(x)", the x is taken in bytes, regardless of character set. So "MyCOL VARCHAR2(20)" in byte semantics allocates 20 bytes of storage, which holds anywhere from 5 to 20 characters depending on how many bytes each character needs. If every character were a Unicode character that needed 4 bytes, all you could store is 5 characters.
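
To make the arithmetic concrete, here is a minimal sketch in Python rather than SQL. It assumes the database character set is AL32UTF8, so that a character's stored size matches its standard UTF-8 length. The helper names utf8_len and fits_in_bytes are made up for illustration and are not Oracle functions; the 20-byte limit stands in for VARCHAR2(20) under byte semantics.

```python
# Illustration only: UTF-8 byte lengths as a stand-in for AL32UTF8 storage.
# utf8_len and fits_in_bytes are hypothetical helpers, not Oracle APIs.

def utf8_len(text: str) -> int:
    """Return the number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def fits_in_bytes(text: str, limit: int = 20) -> bool:
    """Mimic a VARCHAR2(limit) byte-semantics check: count bytes, not characters."""
    return utf8_len(text) <= limit


samples = {
    "ascii": "A" * 20,           # 1 byte per character
    "latin": "\u00e9" * 10,      # e-acute, 2 bytes per character
    "euro": "\u20ac" * 6,        # euro sign, 3 bytes per character
    "emoji": "\U0001F600" * 5,   # supplementary character, 4 bytes each
}

for name, text in samples.items():
    print(name, len(text), "chars,", utf8_len(text), "bytes,",
          "fits" if fits_in_bytes(text) else "too long")
```

Each sample fits the 20-byte limit, but the character counts range from 20 down to 5. One more emoji (6 characters, 24 bytes) would no longer fit.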

You still want to find out the actual character set: http://docs.oracle.com/database/121/REFRN/GUID-3BCC0324-8FEC-409F-8472-74A72FDE310F.htm#REFRN30159

/Hans

On 11/03/2016 9:55 AM, Zelli, Brian wrote:
> So if nls_length_semantics is set to byte, can I assume the 1 char = 1 byte rule?
>
>
> Brian
>
>
>
> -----Original Message-----
> From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Hans Forbrich
> Sent: Friday, March 11, 2016 11:48 AM
> To: oracle-l_at_freelists.org
> Subject: Re: bytes vs chars
>
> I think the sentiment is correct, but there is a minor correction to the
> wording:
>
> Unicode is an attempt to get all different character sets into one
> superset, and is multi-byte in nature. The AL32UTF8 encoding for
> Unicode represents each character in the fewest bytes required (1, 2, 3 or 4); see the Quick Link 'Code Charts' at http://unicode.org/
>
> Character sets in which 1 character = 1 byte are often known as 'single byte character sets' or 'single byte encodings'. These include ASCII and the various ISO
> 8859 sets. A handy reference is at
> http://docs.oracle.com/database/121/NLSPG/ch2charset.htm#NLSPG166
>
> Therefore, I think the statement should be corrected to
>
> "If you're using a single-byte character set then 1 character = 1 byte.
> But if you're using a multibyte Unicode character set then a character can be coded on several bytes."
>
> /Hans
>
> On 11/03/2016 8:50 AM, Ahmed Aangour wrote:
>> Hi,
>>
>> If you're using a unicode characterset then 1character = 1 byte. But
>> if you're using a multibyte characterset then a character can be coded
>> on several bytes.
>> You can check the character set of the database by querying
>> nls_database_parameters.
>>
>>
> --
> http://www.freelists.org/webpage/oracle-l

Received on Fri Mar 11 2016 - 18:14:27 CET
