Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> Re: Possible characterset issues w/ datapump

Re: Possible characterset issues w/ datapump

From: Phil Singer <psinger1_at_chartermi.net>
Date: Fri, 19 Jan 2007 22:09:54 -0500
Message-ID: <45B2D976.8060106@chartermi.net>


Don Seiler wrote:
> We're having an issue where a "ú" character in a varchar2 is being
> corrupted after being funnelled through datapump. I'm reluctant to
> blame anything right now. Here is some background on the two
> machines, SOURCE and DEST. Both machines are on RHEL 3.
>
> SOURCE:
> Oracle 10.2.0.2
> OS $LANG = en_US
> OS $NLS_LANG = AMERICAN
> NLS_CHARACTERSET = US7ASCII
>
> DEST:
> Oracle 10.2.0.1
> OS $LANG = en_US.UTF-8
> OS $NLS_LANG = AMERICAN
> NLS_CHARACTERSET = WE8ISO8859P1
>
>
> We use expdp to dump a file from SOURCE, then FTP that file over the
> DES, then impdp that file into DEST. The developers say that the "ú"
> gets corrupted somewhere in that process, and that the hex value of
> the field is different than on SOURCE. When I said earlier that it
> looked fine on DEST, it was after they ran an UPDATE zap to correct
> the data. I'm waiting for another trial to see the bad data for
> myself, so bear with me.
>
> Given that the DEST characterset is actually a superset of SOURCE, I
> wouldn't expect any corruption or dataloss. Is there anything else I
> can look at or specify to track down when and where any corruption is
> taking place?
> --

Three things immediately pop out:

  1. Your SOURCE characterset is US7ASCII. Your problem character does not exist in that characterset. As the DEST characterset is different, a characterset conversion is definitely expected.
  2. Normally, it is the _client_ which determines the characterset to use (not the database instance). Data pump complicates matters.
  3. If all else fails, investigate the ftp.

But if I had a dollar for each character ruined when someone did a client install and forgot to change the client's character set from US7ASCII to WE8ISO8859P1 I wouldn't be dreading the upcoming layoffs at Ford. I really think the solution is somewhere along those lines.

Somewhere on MetaLink there is a note which actually explains how Oracle manages character sets, but I forget just which note it is.

-- 
Phil Singer                         |   psinger1 at chartermi dot net
PhD, OCP, and All Around Good Guy   |   Do the Obvious to Reply
--
http://www.freelists.org/webpage/oracle-l
Received on Fri Jan 19 2007 - 21:09:54 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US