Re: WE8ISO8859P1 convert to AL32UTF8 unicode character set question

From: Laurenz Albe <invite_at_spam.to.invalid>
Date: Thu, 9 Apr 2009 11:38:24 +0200
Message-ID: <1239269926.605400_at_proxy.dienste.wien.at>



lsllcm wrote:
> I use the convert function to test because the result of convert is the
> same as importing the same string from a WE8ISO8859P1 database into an
> AL32UTF8 database. (From the Oracle documentation)

Ah, good to know.

> If the varchar2 can be converted to the target character set and can be
> converted back too, that means the varchar2 can be converted
> successfully.
>
> But as you said, when the character does not exist in the db character
> set, that is a problem. It can cause unpredictable conversion.

Right, except that the conversion is not "unpredictable": it is deterministic, but the wrong data that are in your database now will remain wrong after the conversion.
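
A small sketch of why a successful round trip proves little. It uses Python's latin-1 and utf-8 codecs as stand-ins for WE8ISO8859P1 and AL32UTF8 (an assumption; Oracle's CONVERT is not involved), and takes byte hex 85, which a Windows client would have meant as an ellipsis, as the example of wrongly stored data:

```python
# Byte 0x85 as it might sit in the WE8ISO8859P1 database today.
# Assumption: a Windows client meant it as the ellipsis (U+2026 in Windows-1252).
stored = b"\x85"

# Convert to UTF-8 the way the database would: the byte is read as ISO 8859-1.
as_utf8 = stored.decode("latin-1").encode("utf-8")

# Convert back again to check the round trip.
round_trip = as_utf8.decode("utf-8").encode("latin-1")

# What the UTF-8 bytes would be if the intended character had been stored.
intended_utf8 = stored.decode("cp1252").encode("utf-8")

print("round trip lossless:", round_trip == stored)
print("matches intended character:", as_utf8 == intended_utf8)
print("converted:", as_utf8.hex(), "intended:", intended_utf8.hex())
```

The round trip only shows that no information was lost, not that the bytes were interpreted as the character the user meant.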

> Fortunately, we use a JDBC program to operate on the Oracle db. Java is
> UTF-16 encoded. Data retrieved from or inserted into the database must
> be converted from UTF-16 to the database character set or the national
> character set and vice versa. (From the Oracle documentation)

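A side note: a Java char is a 16-bit code unit. Before Java 5 that was effectively UCS-2; since Java 5, strings are UTF-16.
However, this is probably not important in your case, because every character of WE8ISO8859P1 and WE8MSWIN1252 fits into a single 16-bit unit.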
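
One way to check that the UTF-16 versus UCS-2 distinction cannot matter for these two character sets is to look at the highest Unicode code point they contain. The sketch below uses Python's latin-1 and cp1252 codecs as stand-ins for WE8ISO8859P1 and WE8MSWIN1252 (an assumption), and highest_code_point is a hypothetical helper name:

```python
def highest_code_point(charset):
    """Return the largest Unicode code point any single byte decodes to."""
    highest = 0
    for value in range(256):
        try:
            char = bytes([value]).decode(charset)
        except UnicodeDecodeError:
            # Some byte values are unassigned in cp1252; skip them.
            continue
        highest = max(highest, ord(char))
    return highest

for charset in ("latin-1", "cp1252"):
    top = highest_code_point(charset)
    # Code points up to U+FFFF need only one 16-bit unit in UTF-16,
    # so UTF-16 and UCS-2 encode them identically.
    print(charset, hex(top), "fits in one 16-bit unit:", top <= 0xFFFF)
```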

> Each character is converted from UTF-16 to WE8ISO8859P1. It also can
> be converted from WE8ISO8859P1 to UTF-16. Because UTF-16 can be
> converted to AL32UTF8, all characters can be converted to the AL32UTF8
> character set.
>
> What do you think about it?

I don't get it. What do you plan to do?

> I will test the way you mentioned.
>
> 1. create one db with character set WE8MSWIN1252, insert data, and
> convert it.
> 2. create one db with character set WE8ISO8859P1, insert data, and
> alter the database character set to WE8MSWIN1252. Then convert it.

You must also set the correct client character set with NLS_LANG. If the client character set is WE8ISO8859P1 and the database is WE8MSWIN1252, you will not be able to insert the character hex 85: it gets replaced during conversion (question mark syndrome). The reverse combination, client WE8MSWIN1252 and database WE8ISO8859P1, will not work either.
If both client and database character set are WE8ISO8859P1 (as is the case now), no conversion takes place, so you can insert hex 85, but it is stored with the wrong meaning. Only if both client and database character set are WE8MSWIN1252 will the inserted hex 85 be the correct character.
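
The four combinations can be imitated outside the database. This is only a rough model under stated assumptions: Python's latin-1 and cp1252 codecs stand in for WE8ISO8859P1 and WE8MSWIN1252, insert_byte is a hypothetical helper, and Python substitutes "?" where Oracle typically stores an inverted question mark:

```python
def insert_byte(raw, client_charset, db_charset):
    """Return the bytes that end up stored when a client sends `raw`.

    Hypothetical helper imitating client-to-database conversion:
    nothing is converted when both character sets are the same.
    """
    if client_charset == db_charset:
        return raw
    # Characters without a mapping in the target become a replacement character.
    return raw.decode(client_charset).encode(db_charset, errors="replace")

for client in ("latin-1", "cp1252"):
    for db in ("latin-1", "cp1252"):
        stored = insert_byte(b"\x85", client, db)
        print(f"client={client:8} db={db:8} stored={stored!r}")
```

In this model the byte survives only when client and database agree, and only with cp1252 on both sides does it also mean the character the Windows user typed.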

Yours,
Laurenz Albe

Received on Thu Apr 09 2009 - 04:38:24 CDT
