Re: Possible characterset issues w/ datapump

From: Phil Singer
Date: Mon, 22 Jan 2007 20:28:46 -0500

Don Seiler wrote:
> So it is the datapump doing the expdp that is mangling the values?
> For the record, here are the dump values before and after on this
> particular example, you can see the second value is changed from 250
> to 191.
> SOURCE: Typ=1 Len=11: 77,250,115,105,99,97,32,82,101,97,108
> DEST: Typ=1 Len=11: 77,191,115,105,99,97,32,82,101,97,108

Yes, decimal 191 (hex BF) is the replacement character Oracle inserts when it is in US7ASCII mode and doesn't know what to do with a character.
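
To make the two dumps easier to read, here is a small Python sketch that decodes the byte values quoted above and shows where they differ. It assumes the application originally wrote the text in ISO-8859-1 (Latin-1); the thread does not say which client character set was in use, so treat that choice and the variable names as illustrative only.

```python
# Byte values copied from the SOURCE and DEST dumps quoted above.
source = bytes([77, 250, 115, 105, 99, 97, 32, 82, 101, 97, 108])
dest = bytes([77, 191, 115, 105, 99, 97, 32, 82, 101, 97, 108])

# Assumption: the client that inserted the data used ISO-8859-1 (Latin-1).
ASSUMED_CLIENT_CHARSET = "latin-1"

source_text = source.decode(ASSUMED_CLIENT_CHARSET)  # "Música Real"
dest_text = dest.decode(ASSUMED_CLIENT_CHARSET)      # "M¿sica Real"

# Positions where the two dumps disagree: (index, source byte, dest byte).
changed = [(i, s, d) for i, (s, d) in enumerate(zip(source, dest)) if s != d]

# Bytes above 127 cannot be represented in 7-bit ASCII.
non_ascii = [b for b in source if b > 127]

print(changed)    # [(1, 250, 191)]
print(non_ascii)  # [250]
print(hex(191))   # 0xbf
```

Under that assumption the only byte outside 7-bit ASCII is the 250, and it is exactly the byte that came back as 191.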

> I assumed that the US7ASCII on the source db was the problem, but I
> didn't understand why it would get changed once the value was stored
> in the DB. So the database would still store an extended character,
> even though its charset is US7ASCII?
> Don.
> On 1/21/07, Tony van Lingen wrote:

>> Datapump would use the database character set I'd suppose, as it is
>> really the database writing out the file, just as with database links
>> and utl_file. One note explaining all this is Note:158577.1.
>> The root of the problem would be that your client inserts and retrieves
>> data using its own character set, but datapump uses US7ASCII to
>> interpret the data and consequently the extended characters (which don't
>> exist in 7-bit ASCII) get mangled. I don't think there's a workaround,
>> except re-entering the extended characters after migration...

Assuming the SOURCE database is using the US7ASCII character set, one thing which could be done is to unload the data to a flat file, with whatever client does the unload having its character set defined as US7ASCII too (regardless of whatever character set it really has). Since Oracle does not perform any character conversion when the client and database character sets are the same, the bytes come out of the database unchanged. Then you can put the data back in using the other character set.

Which seems close to what you are currently trying to do. Which is why I wonder if DataPump is mucking things up.
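
The unload-and-reload idea above can be modeled with a toy Python sketch. This is not how Oracle implements anything; the function names are hypothetical, the 0xBF replacement byte is simply the value observed in the DEST dump, and Latin-1 as the "real" character set and UTF-8 as the target are assumptions made for illustration.

```python
REPLACEMENT = 0xBF  # byte observed in the DEST dump where 250 used to be


def unload_same_charset(stored: bytes) -> bytes:
    """Toy model: client and database charsets match, so bytes pass through."""
    return stored


def unload_with_conversion(stored: bytes) -> bytes:
    """Toy model: a converting transfer out of a 7-bit charset.

    Any byte above 127 has no 7-bit ASCII meaning and is replaced.
    """
    return bytes(b if b < 128 else REPLACEMENT for b in stored)


def recode(raw: bytes, real_charset: str, target_charset: str) -> bytes:
    """Reinterpret raw bytes in their real charset and encode for the target."""
    return raw.decode(real_charset).encode(target_charset)


# Byte values copied from the SOURCE dump quoted earlier in the thread.
stored = bytes([77, 250, 115, 105, 99, 97, 32, 82, 101, 97, 108])

# Pass-through unload keeps the extended character intact.
flat_file = unload_same_charset(stored)

# Assumed charsets: data really is Latin-1, target database takes UTF-8.
reloaded = recode(flat_file, "latin-1", "utf-8")

# A converting transfer reproduces the mangled DEST dump instead.
mangled = unload_with_conversion(stored)

print(list(flat_file) == list(stored))  # True
print(list(mangled))  # [77, 191, 115, 105, 99, 97, 32, 82, 101, 97, 108]
```

The point of the model is only that the extended byte survives when no conversion happens on the way out, and is lost as soon as a conversion step treats the data as genuine 7-bit ASCII.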

>> Cheers,
>> Tony

Phil Singer                         |   psinger1 at chartermi dot net
PhD, OCP, and All Around Good Guy | Do the Obvious to Reply
Received on Mon Jan 22 2007 - 19:28:46 CST
