Re: NLS_LANG

From: Laurenz Albe <invite_at_spam.to.invalid>
Date: 25 Jul 2008 07:00:30 GMT
Message-ID: <1216969226.549594@proxy.dienste.wien.at>


"Álvaro G. Vicario" <alvaroNOSPAMTHANKS_at_demogracia.com> wrote:

>>> Is there any way to get an unmodified output from a certain table field 
>>> that I know contains (or it's supposed to contain) an Euro symbol so I 
>>> can check with a binary editor what numeric code it's actually using?

You could use the DUMP() function to see what is actually stored in the database.

>> Earlier in this thread you specified the characterset of the database
>> is WE8ISO8859P1.
>> This characterset doesn't have the euro. 

>
> Despite that, it seems they did insert euros. In my web pages it
> displays as "¿" but through the context it's obvious it's supposed to be
> the euro symbol. So, no matter the actual charset, it seems their apps
> treat the data as MSWIN1252 or WE8ISO8859P15.
>
> I believe I could connect as WE8ISO8859P15 and let Oracle do the
> conversion but if actual data does not use the charset it's supposed to,
> the conversion will be meaningless. I've done further testing and I'd
> dare say data is MSWIN1252: the only way I can see the euro symbol is
> connecting as WE8ISO8859P1 and then doing a client-side conversion from
> cp1252 to iso-8859-15 with Iconv.

If you can get Euro signs from a LATIN-1 database, it is an indication that you are a victim of the problem I mentioned upthread, namely that Oracle doesn't check your input if the client and server character sets are identical.

You probably have the Euros stored as hex 0x80 if the client is a Windows machine (make sure with DUMP).
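
To illustrate why 0x80 is the likely suspect, here is a minimal Python sketch of what that byte means in the encodings discussed in this thread. It relies on Python's own codec tables (cp1252, latin-1, iso8859_15) as stand-ins for WE8MSWIN1252, WE8ISO8859P1 and WE8ISO8859P15; that correspondence is an assumption, and Oracle's conversion tables may differ in detail.

```python
# Sketch only: Python's codecs stand in for the Oracle character sets.
EURO = "\u20ac"   # the Euro sign as a Unicode code point
raw = b"\x80"     # the byte a Windows client most likely stored

as_cp1252 = raw.decode("cp1252")         # Windows-1252: this byte is the Euro sign
as_latin1 = raw.decode("latin-1")        # ISO-8859-1: a C1 control character (U+0080)
euro_latin9 = EURO.encode("iso8859_15")  # ISO-8859-15 puts the Euro at 0xA4 instead


def latin1_has_euro() -> bool:
    """Return True if the Euro sign can be encoded in ISO-8859-1."""
    try:
        EURO.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


print(repr(as_cp1252), repr(as_latin1), euro_latin9.hex(), latin1_has_euro())
```

So a byte 0x80 read as LATIN-1 is not a Euro at all, which is why only a client that treats the bytes as Windows-1252 shows the symbol.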

You will get the 0x80 back as long as client and server character set stay the same, but there is no other setting for the client character set that will deliver anything meaningful.

That is because you have bad data in your database, and it cannot be converted from LATIN-1 to anything. Essentially, it is a case of user-induced data corruption.

You can either
1) stick with LATIN-1 on both server and client and pray that nobody will
   ever enter characters in a different encoding and that you will never
   have to access the database from a non-Windows system.
2) if you are sure that all bogus characters are actually WE8MSWIN1252,
   you could try to change the database character set.
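
Before betting on option 2, it may help to check the raw bytes on the client side. The following is a rough Python sketch, not an Oracle procedure: it assumes you have fetched the column values unconverted (client and server both LATIN-1) as byte strings, and the helper names are made up for illustration. It flags bytes in the 0x80-0x9F range, tests whether a value is valid Windows-1252, and mimics the iconv conversion from cp1252 to ISO-8859-15 described in the quoted text.

```python
# Sketch only: helper names are hypothetical, codecs are Python's, not Oracle's.
C1_RANGE = range(0x80, 0xA0)  # control characters in LATIN-1, printable in cp1252


def find_c1_bytes(raw: bytes) -> list:
    """Return (offset, byte) pairs for bytes that are suspicious in LATIN-1."""
    return [(i, b) for i, b in enumerate(raw) if b in C1_RANGE]


def is_plausible_cp1252(raw: bytes) -> bool:
    """True if every byte is defined in Windows-1252 (0x81, 0x8D etc. are not)."""
    try:
        raw.decode("cp1252")
        return True
    except UnicodeDecodeError:
        return False


def cp1252_to_latin9(raw: bytes) -> bytes:
    """Re-encode cp1252 bytes as ISO-8859-15.

    Raises UnicodeEncodeError for characters ISO-8859-15 lacks,
    such as the cp1252 curly quotes.
    """
    return raw.decode("cp1252").encode("iso8859_15")


sample = b"100 \x80"
print(find_c1_bytes(sample), is_plausible_cp1252(sample), cp1252_to_latin9(sample))
```

If any value fails the cp1252 check, the data is not uniformly Windows-1252 and changing the database character set would not repair it.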

Yours,
Laurenz Albe

Received on Fri Jul 25 2008 - 02:00:30 CDT