Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: How to convert a database from ZHT16MSWIN950 to UTF8 or AL32UTF8

Re: How to convert a database from ZHT16MSWIN950 to UTF8 or AL32UTF8

From: Andy Hassall <andy_at_andyh.co.uk>
Date: Sun, 26 Sep 2004 14:59:59 +0100
Message-ID: <2aidl01c3004o1rnfu6be5ultug87a4l4r@4ax.com>


On Sun, 26 Sep 2004 16:19:24 +0800, "Nogates\(門都沒有\)" <william_h_at_consultant.com> wrote:

> I tried to use NLS_LANG=american_america.ZHT16MSWIN950 to run
> export. While doing import, using ZHT16MSWIN950 will generate no
> error, but the data won't be correctly showed in AL32UTF8.
>
> If I use "AL32UTF8" to import to the "AL32UTF8" database, I got these
> messages
> IMP-00019: row rejected due to ORACLE error 1401
> IMP-00003: ORACLE error 1401 encountered
> ORA-01401: inserted value too large for column

 Since the UTF8 encoding of a character may be longer than the ZHT16MSWIN950 encoding of a character, and VARCHAR2 sizes are (by default) specified in bytes, not characters, this is quite likely to happen.

 Can't say I'm familiar with ZHT16MSWIN950, but from the prefix I take it is a 2-byte encoding for Chinese characters? Chinese characters are typically 3 byte encodings in UTF8 (but can go up to 4 bytes).

 Various approaches to this; look up byte vs. character semantics for VARCHAR2 columns (so you can specify character length, not encoding byte length). Or increase the size of your VARCHAR2 columns to fit the longest UTF8 encoded size of your strings. If ZHT16MSWIN950 is a fixed-width double-byte encoding then doubling the size would be a safe bet, although possibly excessive.

-- 
Andy Hassall / <andy@andyh.co.uk> / <http://www.andyh.co.uk>
<http://www.andyhsoftware.co.uk/space> Space: disk usage analysis tool
Received on Sun Sep 26 2004 - 08:59:59 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US