Re: NLS functionality and replication between different character encoding schemes

From: Yuri Korostyshevskiy <ykoros_at_dowjones.net>
Date: 1997/11/19
Message-ID: <347381F3.60893A35@dowjones.net>#1/1

I faced a similar problem a few months ago and had several discussions with
Oracle about it.
Here are the important points:
1. When the client and the server use different character sets, implicit
   conversion takes place. It works like this: if a character is present in
   both character sets, its code in one character set is converted to its
   code in the other. E.g. A is 65 in US7ASCII; if it were 50 in some other
   character set, then 65 would be converted to 50 in one direction and 50
   back to 65 in the other.

2. If a character is present in one character set and absent from the other,
   it is replaced by a "replacement character", usually "?".

3. All of the above functionality is hard-coded deep inside SQL*Net.
   So if you ask Oracle, you'll get the answer that you can only work
   with multiple character sets if your database character set is a
   superset of all of them.
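
To make points 1 and 2 concrete, here is a minimal Python sketch. It uses Python's own codecs as a stand-in for the conversion SQL*Net performs. The function name `convert` and the choice of `latin-1`, `cp437` and `shift_jis` are assumptions for illustration only, not anything taken from Oracle.

```python
def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from character set `src` into character set `dst`.

    Characters that do not exist in `dst` become the replacement
    character "?", which mimics the behaviour described in point 2.
    """
    return data.decode(src).encode(dst, errors="replace")


# Point 1: the same character has a different code in each character set,
# and the conversion maps one code to the other (and back again).
e_acute_latin1 = "é".encode("latin-1")
e_acute_cp437 = convert(e_acute_latin1, "latin-1", "cp437")
round_trip = convert(e_acute_cp437, "cp437", "latin-1")

# Point 2: characters missing from the target character set are replaced.
japanese = "日本語".encode("shift_jis")
lost = convert(japanese, "shift_jis", "latin-1")

print(e_acute_latin1, e_acute_cp437, round_trip, lost)
```

Once a character has been turned into "?", converting back cannot restore it, which is why the "superset" rule in point 3 matters.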

From the above I concluded that to support multiple languages, one should work around NLS.

The solution I implemented is based on the fact that SQL*Net does not convert the
RAW datatype. The catch is that RAW is not what you might expect it to be (or at
least it wasn't what I expected :-) ): it actually takes only hex-encoded characters.
So to transfer data across SQL*Net without conversion, I had to hex-encode it and
store it in RAW columns.

The process goes like this. Clients input data in whatever language they use.
The data gets encoded into hex, transferred across SQL*Net, and stored as RAW.
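
The client-side encoding step can be sketched in Python as follows. The helper names `to_hex` and `from_hex` are hypothetical, and the sketch assumes the client knows which character set its own input is in. It covers only the encoding and decoding, not the actual database call.

```python
def to_hex(text: str, client_charset: str) -> str:
    """Encode text in the client's character set, then hex-encode the bytes.

    The result contains only the characters 0-9 and A-F.
    """
    return text.encode(client_charset).hex().upper()


def from_hex(hex_text: str, client_charset: str) -> str:
    """Reverse of to_hex: hex-decode, then interpret in the client's character set."""
    return bytes.fromhex(hex_text).decode(client_charset)


payload = to_hex("é", "latin-1")          # what would be sent and stored
restored = from_hex(payload, "latin-1")   # what the client sees after retrieval

japanese_payload = to_hex("日本語", "shift_jis")
japanese_restored = from_hex(japanese_payload, "shift_jis")
```

Note that the hex form doubles the storage size, and the reader must know which character set each row was written in to decode it correctly.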

Once I had implemented that, I realized that, since most character sets contain
A-F and 0-9 anyway, SQL*Net wouldn't touch the hex-encoded data even if it were
stored in VARCHAR2. (I prefer VARCHAR2 because, unlike RAW, I can index it.)
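
The reasoning here can be checked with a small simulation. This is only a sketch: `channel` is a hypothetical stand-in for the client/server conversion, built on Python codecs rather than on SQL*Net, and the two character sets are picked arbitrarily.

```python
def channel(data: bytes, src: str, dst: str) -> bytes:
    """Simulate a character set conversion between client and server."""
    return data.decode(src).encode(dst, errors="replace")


original = "日本語".encode("shift_jis")

# Sent as ordinary character data to a Latin-1 server: the text is destroyed.
direct = channel(original, "shift_jis", "latin-1")

# Sent as a hex string: 0-9 and A-F exist in both character sets,
# so the conversion leaves the payload unchanged in both directions.
hex_payload = original.hex().upper().encode("ascii")
stored = channel(hex_payload, "shift_jis", "latin-1")    # client -> server
returned = channel(stored, "latin-1", "shift_jis")       # server -> client
recovered = bytes.fromhex(returned.decode("ascii"))

print(direct, stored == hex_payload, recovered == original)
```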

Optionally, you can decode the data from hex before storing it in the database. Then,
to retrieve your data, you should encode it into hex again, send it over SQL*Net,
decode it from hex on the client, and then let the font interpret it.

Oracle 8 takes two character sets: a database character set and a national character
set. That may be helpful. I have looked at it only briefly, but because of the
"superset" rule, I don't believe it solves the problem of supporting multiple
character sets.

Hope this helps.
If you are interested in a more detailed discussion, please e-mail me.

Yuri Korostyshevskiy / Oracle DBA - Consultant ykoros_at_dowjones.net

Melinda Schall wrote:

> I am interested in hearing from anyone who has implemented or has knowledge
> of Oracle's NLS functionality and its limitations relating to replication.
> Specifically, I'm wondering if it's possible to replicate between two
> databases that are built with different Character Encoding Schemes (for
> example: WE8ISO8859P1 -- West European and JA16SJIS Japanese Shift-JIS).
>
> Even if this proves impossible (which I'm guessing it is) I'm interested in
> hearing about any solutions/ideas regarding the global rollup of
> information from databases with different Character Encoding Schemes.
>
> What I want to figure out is how best to allow users of the database in
> each country to input most information in their own language; yet allow the
> sharing of some information (with the shared information being in English)
> between users in other countries through the use of replication.
>
> Or even if someone could point me somewhere with more information on
> Oracle's NLS functionality than the basic Oracle manuals provide. What I
> need more than anything is a listing/explanation of the limitations of NLS.
>
> Also, I'm working in an Oracle 7.3.2 environment on NCR/UNIX.
> Thanks in advance for any help or discussion.
Received on Wed Nov 19 1997 - 00:00:00 CST