Re: Loading UTF8 fixed length file using SQL Loader 8i

From: Anurag Varma <avdbi_at_hotmail.com>
Date: Thu, 06 Jan 2005 00:35:48 GMT
Message-ID: <EJ%Cd.128471$AL5.125515_at_twister.nyroc.rr.com>


"Nicolas Mayoraz" <nm01_at_dkweb.ch> wrote in message news:1f9a8c6.0501040518.6f8ab56d_at_posting.google.com...
> Hi all,
>
> I'm trying to load an UTF8 file with fixed length record using SQL
> Loader for Oracle 8i.
>
> Unfortunately, SQL Loader does not make the difference between bytes
> and characters. i.e. that if two bytes are used to code an accentuated
> character (like é) SQL loader will consider that two caracters has
> been read meaning that beginning of next field will be shifted.
>
> Example:
>
> LOAD DATA
> CHARACTERSET UTF8
> INFILE 'XXX.txt'
> REPLACE
>
> INTO TABLE tmp.MY_TABLE
> (
> LASTN POSITION (1:20), -- Last name
> FIRSN POSITION (21:40), -- First name
> STATU POSITION (41:41), -- Status
> BIRDT POSITION (42:49) -- Date of birth
> )
>
> With this data :
> Thévenat Jacques 119460508
>
> I'll get in my DB :
>
> Field 1 : Thévenat
> Field 2 : Jacques (note the leading space)
> Field 3 : (blank)
> Field 4 : 11946050 (wrong date)
>
> This because the "é" is coded like C3 A9 (= 2 bytes) in UTF8.
>
> I know that there is a LENGTH SEMANTICS CHAR option in Oracle 9i but
> I haven't found anything similar in 8i.
>
> Does anybody know a work around ?
>
> Thanks in advance.

What happens if you define the ctl file as?:

LOAD DATA
CHARACTERSET UTF8
 INFILE 'XXX.txt'
REPLACE  INTO TABLE tmp.MY_TABLE
 (

[Quoted]      LASTN    char(20),  -- Last name
     FIRSN    char(20), -- First name
     STATU    char(1), -- Status
     BIRDT    char(8)  -- Date of birth

)

?

Anurag Received on Thu Jan 06 2005 - 01:35:48 CET

Original text of this message