Re: Loading UTF8 fixed length file using SQL Loader 8i

From: Malcolm Dew-Jones <yf110_at_vtn1.victoria.tc.ca>
Date: 4 Jan 2005 10:35:59 -0800
Message-ID: <41dae20f_at_news.victoria.tc.ca>


[Quoted] Nicolas Mayoraz (nm01_at_dkweb.ch) wrote:
: Hi all,

: I'm trying to load an UTF8 file with fixed length record using SQL
: Loader for Oracle 8i.

But utf-8 isn't fixed length.

: Unfortunately, SQL Loader does not make the difference between bytes
: and characters. i.e. that if two bytes are used to code an accentuated
: character (like é) SQL loader will consider that two caracters has
: been read meaning that beginning of next field will be shifted.

: Example:

: LOAD DATA
: CHARACTERSET UTF8
: INFILE 'XXX.txt'
: REPLACE

: INTO TABLE tmp.MY_TABLE
: (
: LASTN POSITION (1:20), -- Last name
: FIRSN POSITION (21:40), -- First name
: STATU POSITION (41:41), -- Status
: BIRDT POSITION (42:49) -- Date of birth
: )

: With this data :
: Thévenat Jacques 119460508

: I'll get in my DB :

: Field 1 : Thévenat
: Field 2 : Jacques (note the leading space)
: Field 3 : (blank)
: Field 4 : 11946050 (wrong date)

: This because the "é" is coded like C3 A9 (= 2 bytes) in UTF8.

As a said, not fixed length data.

: I know that there is a LENGTH SEMANTICS CHAR option in Oracle 9i but
: I haven't found anything similar in 8i.

: Does anybody know a work around ?

[Quoted] One possibility...

Make the fields in the load file long enough to contain the maximum length [Quoted] of utf-8 data for that field, and then pad the bytes of each field to be [Quoted] that length when you create the load file, and then trim off the padding [Quoted] as the field is inserted into the table.

--

This space not for rent.
Received on Tue Jan 04 2005 - 19:35:59 CET

Original text of this message