Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: non-ascii characters causing prob in sql*loader

Re: non-ascii characters causing prob in sql*loader

From: cumin <jkilbourne_at_gmail.com>
Date: 14 Oct 2005 12:31:52 -0700
Message-ID: <1129318312.134289.308200@g49g2000cwa.googlegroups.com>

Malcolm Dew-Jones wrote:
> cumin (jkilbourne_at_gmail.com) wrote:
> : 23 rows of my textfile (out of 1,813,915) were flagged as errors in the
> : log file; same field, same error every time, and each has a non-ascii
> : character:
>
> : Record 22606: Rejected - Error on table SNOCONSOAB, column SRL.
> : ORA-01722: invalid number
>
> : My .ctl file looks like this:
>
> : options (direct=3Dtrue, errors=3D100)
> : load data
> : CHARACTERSET UTF8
> : infile 'snoconsoAB.dat' "str X'0d0a'"
> : badfile 'snoconsoAB.bad'
> : discardfile 'snoconsoAB.dsc'
> : truncate
> : into table snoconsoAB
> : fields terminated by '|'
> : trailing nullcols
> : (CUI char(8),
> : LAT char(3),
> : TS char(1),
> : LUI char(8),
> : STT char(3),
> : SUI char(8),
> : ISPREF char(1),
> : AUI char(8),
> : SAUI char(50),
> : SCUI char(50),
> : SDUI char(50),
> : SAB char(20),
> : TTY char(20),
> : CODE char(50),
> : STR char(3000),
> : SRL integer external,
> : SUPPRESS char(1),
> : CVF char(50)
> : )
>
> : Here are two examples from the .bad file; notice that in the field
> : right before the SRL field, where the error was flagged, there is a
> : non-ascii character, but the field with the error (SRL, third from the
> : right) *does* contain a number, contrary to the .log file indication.
> : The last field is null:
>
> : C0005122|ENG|S|L3542581|VO|S4062201|Y|A4394823||||ICPC2ICD10ENG|PT|MTHU0407=
> : 80|Kakk=E8|3|N|
> ^^^^^^^
>
> I'm not clear what data is actually in this field. I also notice a
> questionable =80 before that in another field.
>
> Is that what is really in the data, or are the =80 and the =E8 the
> newsreader's rendering of non-ascii bytes?
>
> If so I suspect those bytes are the problem. The log file below says
> "Character Set UTF8 specified for all input", BUT those bytes would mess
> up the utf-8 decoding of the input.
>
> I can't say for sure since I can't dump the data to examine the bytes
> myself, so it's just a guess at this stage.
>
> You could either specify the correct character set for the data, or encode
> the data to be valid utf-8.
>

I loaded the .bad file successfully by changing direct=true, but keeping CHARACTER SET UTF8 in the .ctl file.

I don't know why a direct load would cough on those characters, which were not as they appeared in my post; the hex value is ooe8 for the char showing as "=E8", according to vim.

Thanks for the help. Received on Fri Oct 14 2005 - 14:31:52 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US