Loading UTF8 fixed length file using SQL Loader 8i

From: Nicolas Mayoraz <nm01_at_dkweb.ch>
Date: 4 Jan 2005 05:18:23 -0800
Message-ID: <1f9a8c6.0501040518.6f8ab56d_at_posting.google.com>



Hi all,

[Quoted] [Quoted] I'm trying to load an UTF8 file with fixed length record using SQL Loader for Oracle 8i.

[Quoted] [Quoted] Unfortunately, SQL Loader does not make the difference between bytes and characters. i.e. that if two bytes are used to code an accentuated character (like é) SQL loader will consider that two caracters has been read meaning that beginning of next field will be shifted.

Example:

LOAD DATA
CHARACTERSET UTF8
[Quoted] INFILE 'XXX.txt'
REPLACE INTO TABLE tmp.MY_TABLE
(

[Quoted]     LASTN    POSITION (1:20),  -- Last name
    FIRSN    POSITION (21:40), -- First name
    STATU    POSITION (41:41), -- Status
    BIRDT    POSITION (42:49)  -- Date of birth
)

With this data :
Thévenat Jacques 119460508

[Quoted] I'll get in my DB :

[Quoted] Field 1 : Thévenat
Field 2 :  Jacques   (note the leading space)
Field 3 :    (blank)
Field 4 : 11946050 (wrong date)

[Quoted] This because the "é" is coded like C3 A9 (= 2 bytes) in UTF8.

[Quoted] I know that there is a LENGTH SEMANTICS CHAR option in Oracle 9i but I haven't found anything similar in 8i.

[Quoted] Does anybody know a work around ?

Thanks in advance. Received on Tue Jan 04 2005 - 14:18:23 CET

Original text of this message