Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: what characterset to use?

Re: what characterset to use?

From: Laurenz Albe <invite_at_spam.to.invalid>
Date: 23 Aug 2007 10:21:56 GMT
Message-ID: <1187864513.609873@proxy.dienste.wien.at>


Ben <balvey_at_comcast.net> wrote:
> 10.2.0.2 Ent Ed, AIX5L
>
> We are planning a new database and I'm having a little confusion on
> what character set to use.
> I don't want to use multibyte data unless
> it is necessary but I do want to have the option available.
>
> 1. Am I correct in thinking that if we use NLS_CHARACTERSET =
> WE8MSWIN1252 and NLS_NCHAR_CHARACTERSET = AL16UTF16 we shouldn't have
> any issues with losing multibyte characters?

Most of your questions have already been answered correctly, but I would recommend that you create the database with AL32UTF8, just like Oracle recommends.

This way you don't have to think which columns should be NCHAR and which should be CHAR, because you can store any character in any text column.

You also lose nothing if you use AL32UTF8, because ASCII characters (Latin letters, numbers, the common special characters) will be stored as single bytes (in AL16UTF16 they would use up two bytes).

> 2. On my windows client I go to a command prompt and issue the
> following c:\> echo %NLS_LANG%,
> I get %NLS_LANG% returned back to me. I'm assuming that this means I
> don't have the variable set. How does my client then treat the data
> that I am getting from the database?

As has been mentioned, NLS_LANG is probably set to WE8MSWIN1252 in the Windows registry. You can override this setting with the environment variable. If neither is set, US7ASCII is assumed.

The data will be converted to the character set specified in NLS_LANG, so that value must always be set to the character set that your application or operating system expect.

There are two things that can happen when you have a bad NLS_LANG setting:

  1. Some or all characters are displayed as either garbage or question marks. This is rather harmless because you can find and correct it easily. Unfortunately Oracle has made the design choice to return garbage instead of throwing an error message when characters cannot be converted.
  2. If you (as misguided admins too often do) mistakenly set NLS_LANG to the database character set, no character conversion AND NO INTEGRITY CHECKS are performed and you can store all sort of garbage in your database without even noticing. It will cause problems later on, though. This is an Oracle bug in my opinion, although Oracle will probably disagree with me on this.

Again, when you use AL32UTF8, the danger of running into problem 2) is smaller because on no platform this is the "default" client character set.

Yours,
Laurenz Albe Received on Thu Aug 23 2007 - 05:21:56 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US