Re: Euro Character lost on export/import

From: Sybrand Bakker <sybrandb_at_hccnet.nl>
Date: Tue, 10 Aug 2004 19:33:37 +0200
Message-ID: <db1ih0p2j8c1pfpko18l0mvemp9palhj4r@4ax.com>

On Tue, 10 Aug 2004 14:29:05 +0200, Tosta <tosta_at_wtal.de> wrote:

>Hi.
>
>A Euro symbol character in varchar2 fields gets lost on export and import between different databases.
>
>Source: Oracle 8i, NLS_CHARACTERSET WE8ISO8859P1
>Target: Oracle 9i, NLS_CHARACTERSET AL32UTF8 (Unicode)
>
>We're running a Java-based web app with the 8iDB and migrating to 9i. There are definitely fields with the Euro sym in
>the 8iDB. They are displayed in the web app. After migration, the Euro sym has changed to a question mark.
>
>I would simply like to know if I'm right with the following:
>
>WE8ISO8859P1 char set doesn't support the Euro (WE8ISO8859P15 does), and Oracle "invents" its own code to represent the
>Euro sym. That's the reason the 8i app can display it. On import, all texts are converted from WE8ISO8859P1 to AL32UTF8.
>Since the Euro is not in WE8ISO8859P1, there is no mapping for the Oracle-invented Euro char code, so no conversion takes place.
>
>Conclusion: There is nothing we can do but live with it. If we had chosen WE8ISO8859P15 for the 8iDB in the beginning,
>we wouldn't have the trouble today.
>
>Right?
>
>Looking forward to your comments,
>
>Tosta.
>
>P.S. The other problems migrating to unicode, namely the length-semantics problem, have been solved already, thanks.

The story is slightly different. Rather than find a location for the Euro in WE8ISO8859P1, the ISO set up a new character set that includes it, WE8ISO8859P15. As usual, Mickeysoft decided not to follow that path. Their default character set is currently the 1252 code page, which contains all printable characters of the P1 character set and adds more, the Euro among them, in the range that P1 reserves for control codes.
The character set of the database should *always* match the character set of the O/S, so it should have been set to WE8MSWIN1252. The difference between the two character sets that matters here is the 0x80-0x9F range: WE8MSWIN1252 puts printable characters there, with the Euro at 0x80, while WE8ISO8859P1 has only control codes. If you had chosen WE8MSWIN1252 from the beginning you wouldn't have had this problem.
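
To see where each character set puts the Euro, here is a small Python sketch. It assumes that Python's iso8859-1, iso8859-15 and cp1252 codecs are fair stand-ins for Oracle's WE8ISO8859P1, WE8ISO8859P15 and WE8MSWIN1252; the helper name euro_byte is made up for the illustration.

```python
EURO = "\u20ac"  # Unicode code point of the Euro sign


def euro_byte(codec):
    """Return the byte encoding the Euro in `codec`, or None if it has no Euro.

    `codec` is a Python codec name, used here as an assumed stand-in
    for the corresponding Oracle character set.
    """
    try:
        return EURO.encode(codec)
    except UnicodeEncodeError:
        return None


positions = {
    "WE8ISO8859P1": euro_byte("iso8859-1"),    # no Euro at all
    "WE8ISO8859P15": euro_byte("iso8859-15"),  # Euro at 0xA4
    "WE8MSWIN1252": euro_byte("cp1252"),       # Euro at 0x80
}

for name, value in positions.items():
    print(name, value)
```
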
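It currently works because apparently the character set of the database is identical to the character set of the web server, i.e. both are the same 8-bit character set, so no conversion ever takes place and the bytes pass through unchanged. As soon as you export and import into a database with a different character set, the data is converted and you are in trouble.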
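
The pass-through and the later conversion can be imitated with a few lines of Python. This is only a sketch: Python codecs stand in for the Oracle character sets, the sample value is invented, and it does not reproduce what exp/imp does internally. It just shows why a byte that displays as a Euro on a 1252 client stops being a Euro once it is converted as if it were P1 data.

```python
# Invented sample: a 1252 client stored "100 <Euro>" and the byte 0x80
# went into the P1 database unchanged because no conversion took place.
stored = b"100 \x80"

# Pass-through: the client reads the same bytes back and interprets
# them as 1252, so the Euro shows up.
displayed = stored.decode("cp1252")

# Conversion: the bytes are taken at face value as P1 (assumed stand-in:
# iso8859-1), where 0x80 is a control code, and then encoded as UTF-8.
converted = stored.decode("iso8859-1").encode("utf-8")

# What the conversion would yield if the source had been declared as 1252.
intended = stored.decode("cp1252").encode("utf-8")

print(repr(displayed))  # '100 €'
print(converted)        # b'100 \xc2\x80'  (U+0080, not a Euro)
print(intended)         # b'100 \xe2\x82\xac'  (U+20AC, the Euro)
```
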
There are many notes on Metalink explaining in more detail how to check for these issues and how to resolve them.

--
Sybrand Bakker, Senior Oracle DBA

Received on Tue Aug 10 2004 - 12:33:37 CDT