Re: select '§' from dual; produces ORA-01756 error

From: Chris O <itoys1>
Date: Fri, 13 Feb 2004 09:21:48 +1000
Message-ID: <402c0a8d$0$29132$afc38c87@news.optusnet.com.au>

"David McMeans" <googlenews_at_fhg.mailshell.com> wrote in message news:d3865bef.0402121053.2f5f93fa_at_posting.google.com...
> Chris,
>
> Thank for the pointer. Turned out that I didn't need to update my
> server's character set afterall. I needed to make my Oracle client's
> character set match my server's.
>
> Registry:
>
> HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOME1
> NLS_LANG = AMERICAN_AMERICA.WE8ISO8859P1
>
> David

Hi again David. It seems like you have fixed the problem but from this end I'm not sure whether you have or just masked it. Character set stuff is pretty confusing.

Just be aware that when you use the same character set on your client and on your server then Oracle DOES NOT PERFORM ANY FORM OF TRANSLATION OR VALIDATION. This is efficient and OK providing you pass valid characters to Oracle. However, it will cause you to come unstuck if your client program passes garbage to Oracle. SQL*Plus for example always assumes the input file is in the operating system's character set and will just pass the characters directly to the Oracle net client. This is not much of a problem with single-byte character sets such as WE8ISO8859P1 but if you start accessing an Oracle server using UTF8 -or- AL32UTF8 then you had better be really careful if you set your client to use the same character set.

For example.

Server Character Set: UTF8
Client Character Set: UTF8
Encoding of input file to SQL*Plus: Western European (Latin-1)

Outcome: Fine until your file contains any character with a code point greater than 0x7F (decimal 127) as codepoints in the range 0x80 - 0xBF is not legal as the first byte of a UTF-8 sequence. Oracle will store the byte sequence "as is". Oracle will even return the byte sequence "as is" providing you have the same character set on your client as the server. However, any client with a different character set is going to force Oracle to perform character set conversion where upon it finds the sequence is invalid and will quietly discard the offending bytes. Result: corruption of your data.

So although it is very efficient [and thus tempting] to use the same character set on the client as on the server, before doing so I believe that you ought to understand the consequences / limitations. All of this is not too much of a problem with the single-byte character sets but many organizations are moving to Unicode databases (i.e. database character set of UTF8 or AL32UTF8).

I hope this is not too confusing.

Cheers Chris Received on Thu Feb 12 2004 - 17:21:48 CST