Re: Problems loading data into the database using SQL Loader.

From: trub3101 <trub3101_at_sky.com>
Date: Thu, 25 Jun 2009 05:52:35 -0700 (PDT)
Message-ID: <a1f544f7-41c6-45dc-95bd-d054966692eb_at_l28g2000vba.googlegroups.com>



On 25 June, 13:42, "Vladimir M. Zakharychev" <vladimir.zakharyc..._at_gmail.com> wrote:
> On Jun 25, 4:04 pm, trub3101 <trub3..._at_sky.com> wrote:
>
>
>
>
>
> > On 25 June, 08:14, "Vladimir M. Zakharychev"
>
> > <vladimir.zakharyc..._at_gmail.com> wrote:
> > > On Jun 24, 5:47 pm, trub3101 <trub3..._at_sky.com> wrote:
>
> > > > On 24 June, 07:55, "Vladimir M. Zakharychev"
>
> > > > <vladimir.zakharyc..._at_gmail.com> wrote:
> > > > > On Jun 23, 11:38 pm, trub3101 <trub3..._at_sky.com> wrote:
>
> > > > > > On 23 June, 20:30, trub3101 <trub3..._at_sky.com> wrote:
>
> > > > > > > On 23 June, 19:53, ddf <orat..._at_msn.com> wrote:
>
> > > > > > > > On Jun 23, 1:38 pm, trub3101 <trub3..._at_sky.com> wrote:
>
> > > > > > > > > On 23 June, 17:53, ddf <orat..._at_msn.com> wrote:
>
> > > > > > > > > > On Jun 23, 11:42 am, trub3101 <trub3..._at_sky.com> wrote:
>
> > > > > > > > > > > Hi all,
>
> > > > > > > > > > > Database
> > > > > > > > > > > Oracle 9.2.0.8
> > > > > > > > > > > NLS_LANG=AMERICAN_AMERICA.UTF8
> > > > > > > > > > > NLS_CHARACTERSET UTF8
> > > > > > > > > > > NLS_NCHAR_CHARACTERSET AL16UTF16
>
> > > > > > > > > > > Server
> > > > > > > > > > > Red Hat Linux 2.6.9-67.ELsmp
> > > > > > > > > > > LANG=en_US.UTF-8
>
> > > > > > > > > > > I cannot figure this one out. When I try to populate a table using SQl
> > > > > > > > > > > Loader I am getting corrupt characters in the table e.g. Alien³
> > > > > > > > > > > appears in the table as Alien¿
>
> > > > > > > > > > > I have even set the CHARACTERSET to UTF8 in the control file for good
> > > > > > > > > > > measure!
>
> > > > > > > > > > > Can anyone throw some light on this one?
>
> > > > > > > > > > > Thanks in advance for your replies.
>
> > > > > > > > > > > TB3101
>
> > > > > > > > > > I'll take a guess that you're doing this from a client machine.  What
> > > > > > > > > > is the NLS_LANG on the machine where you're running the loader?
>
> > > > > > > > > > David Fitzjarrell- Hide quoted text -
>
> > > > > > > > > > - Show quoted text -
>
> > > > > > > > > Thanks for reply David,
>
> > > > > > > > > I am actually logged onto the server and running the sql loader
> > > > > > > > > process. The NLS_LANG is set to AMERICAN_AMERICA.UTF8
>
> > > > > > > > > Cheers,
>
> > > > > > > > > TB3101- Hide quoted text -
>
> > > > > > > > > - Show quoted text -
>
> > > > > > > > We'll need to see your control file and data file.
>
> > > > > > > > David Fitzjarrell- Hide quoted text -
>
> > > > > > > > - Show quoted text -
>
> > > > > > > Hi David,
>
> > > > > > > Here is the control file:
>
> > > > > > > LOAD DATA
> > > > > > > CHARACTERSET UTF8
> > > > > > > INFILE '/opt/ora/oracle/oradata/data/xxxx.dat' "str '\n'"
> > > > > > > BADFILE '/opt/ora/oracle/oradata/xxxx.bad'
> > > > > > > DISCARDFILE '/opt/ora/oracle/oradata/xxxx.dis'
> > > > > > > TRUNCATE
> > > > > > > INTO TABLE film_film
> > > > > > > TRAILING NULLCOLS
>
> > > > > > > (
> > > > > > >   START_TIME                    DATE "DD/MM/YYYY,HH24:MI:SS,"
> > > > > > > TERMINATED BY '^',
> > > > > > >   CHANNEL_ID                    CHAR(300)      TERMINATED BY '^',
> > > > > > >   REGION_CODE                   CHAR(10)       TERMINATED BY '^',
> > > > > > >   TITLE                         CHAR(200)      TERMINATED BY '^',
> > > > > > >   PERFORMERS                    CHAR(4000)     TERMINATED BY '^',
> > > > > > >   DIRECTOR                      CHAR(150)       TERMINATED BY '^',
> > > > > > >   FILM_RATING                   INTEGER EXTERNAL       TERMINATED BY
> > > > > > > '^',
> > > > > > >   YEAR                          CHAR(15)       TERMINATED BY '^',
> > > > > > >   DESCRIPTION                   CHAR(4000)    TERMINATED BY '^',
> > > > > > >   PREMIERE                      CHAR(1)       TERMINATED BY '^',
> > > > > > >   TV_MOVIE                      CHAR(1)       TERMINATED BY '^',
> > > > > > >   FILM_CERTIFICATE              CHAR(10)      TERMINATED BY '^',
> > > > > > >   WARNING                       CHAR(200)     TERMINATED BY '^',
> > > > > > >   REVIEW_AUTHOR                 CHAR(50)
> > > > > > > )
>
> > > > > > > and here is the data file:
>
> > > > > > > 30/06/2009 19:00:00^252^^Alien³^Warrant Officer Ripley*Sigourney
> > > > > > > Weaver|Dillon*Charles S Dutton|Clemens*Charles Dance|Golic*Paul McGann|
> > > > > > > Superintendent Andrews*Brian Glover|Aaron*Ralph Brown (2)|Morse*Danny
> > > > > > > Webb|Rains*Christopher John Fields|Junior*Holt McCallany|Bishop
> > > > > > > II*Lance Henriksen|Murphy*Christopher Fairbank|David *Pete
> > > > > > > Postlethwaite|William*Clive Mantle^David Fincher^3^1992^Given that the
> > > > > > > first two films stand up as sci-fi classics in their own right, Se7en
> > > > > > > director David Fincher, in his feature film debut, had a virtually
> > > > > > > impossible act to follow with this second sequel. He makes a
> > > > > > > surprisingly good fist of it, developing the maternal themes of first
> > > > > > > sequel Aliens and providing an exhilarating final showdown. Sigourney
> > > > > > > Weaver returns as Ripley, who this time crash-lands on a prison colony
> > > > > > > where another lethal alien is let loose. A familiar cast of Brits
> > > > > > > (Charles Dance, Paul McGann, Brian Glover) provides the alien food
> > > > > > > and, while it isn't in the same class as the first two films, this
> > > > > > > provides a satisfactory entry in the series. Still, it would have been
> > > > > > > interesting to see what second-choice director Vincent Ward (of The
> > > > > > > Navigator: a Medieval Odyssey fame) would have made of it -
> > > > > > > apparently, he was brought in when Renny Harlin left after script
> > > > > > > disagreements, but was himself replaced when it emerged that his
> > > > > > > version of the movie would be set in a monastery and the alien itself
> > > > > > > wouldn't be appearing. ^N^N^18^Contains violence, swearing and nudity.
> > > > > > > ^JF
>
> > > > > > > Thanks again for your help,
>
> > > > > > > TB3101- Hide quoted text -
>
> > > > > > > - Show quoted text -
>
> > > > > > Hi David,
>
> > > > > > When I run 'cat -e' on the data file the output for the title is
> > > > > > different!
>
> > > > > > 30/06/2009 19:00:00^252^^AlienM-3^Warrant Officer Ripley*Sigourney
> > > > > > Weaver|Dillon*Charles S Dutton|Clemens*Charles Dance|Golic*Paul McGann|
> > > > > > Superintendent Andrews*Brian Glover|Aaron*Ralph Brown (2)|Morse*Danny
> > > > > > Webb|Rains*Christopher John Fields|Junior*Holt McCallany|Bishop
> > > > > > II*Lance Henriksen|Murphy*Christopher Fairbank|David *Pete
> > > > > > Postlethwaite|William*Clive Mantle^David Fincher^3^1992^Given that the
> > > > > > first two films stand up as sci-fi classics in their own right, Se7en
> > > > > > director David Fincher, in his feature film debut, had a virtually
> > > > > > impossible act to follow with this second sequel. He makes a
> > > > > > surprisingly good fist of it, developing the maternal themes of first
> > > > > > sequel Aliens and providing an exhilarating final showdown. Sigourney
> > > > > > Weaver returns as Ripley, who this time crash-lands on a prison colony
> > > > > > where another lethal alien is let loose. A familiar cast of Brits
> > > > > > (Charles Dance, Paul McGann, Brian Glover) provides the alien food
> > > > > > and, while it isn't in the same class as the first two films, this
> > > > > > provides a satisfactory entry in the series. Still, it would have been
> > > > > > interesting to see what second-choice director Vincent Ward (of The
> > > > > > Navigator: a Medieval Odyssey fame) would have made of it -
> > > > > > apparently, he was brought in when Renny Harlin left after script
> > > > > > disagreements, but was himself replaced when it emerged that his
> > > > > > version of the movie would be set in a monastery and the alien itself
> > > > > > wouldn't be appearing. ^N^N^18^Contains violence, swearing and nudity.
> > > > > > ^JF$
>
> > > > > > Cheers,
>
> > > > > > TB3101
>
> > > > > Can you also post the SQL*loader log file? Do you see something like
> > > > > this at the very beginning of the log:
>
> > > > > Character Set UTF8 specified for all input.
> > > > > First primary datafile xxxx.dat has a
> > > > >  utf8 byte order mark in it.
>
> > > > > I mean, you gotta make sure the file is really in UTF-8 so that
> > > > > SQL*Loader could process it properly.
>
> > > > > Regards,
> > > > >    Vladimir M. Zakharychev
> > > > >    N-Networks, makers of Dynamic PSP(tm)
> > > > >    http://www.dynamicpsp.com-Hidequotedtext-
>
> > > > > - Show quoted text -
>
> > > > Hi Vladimir,
>
> > > > The log file has 'Character Set UTF8 specified for all input.' in it.
>
> > > > Was 'First primary datafile xxxx.dat has a utf8 byte order mark in it'
> > > > an observation on your behalf or was this also supposed to be in the
> > > > log file?
>
> > > > Thanks for your reply,
>
> > > > TB3101
>
> > > I observed it while testing your case and I suppose it should be there
> > > for the loader to properly process/convert your data. In my test (on
> > > 10.2.0.4/Win32,) a file in UTF-8 with BOM at the very beginning of the
> > > file was loaded correctly. It is not mandatory to have the BOM in an
> > > UTF-8 file as the byte order in UTF-8 is the same on all platforms and
> > > BOM's only purpose in UTF-8 is to mark otherwise unmarked plain text
> > > as being in Unicode.
>
> > > Since your file is actually in ISO-8859-1, maybe it's easier to
> > > reflect this in the control file (CHARACTERSET WE8ISO8859P1) and let
> > > Oracle do the conversion to the database charset (it's pretty good at
> > > that.)
>
> > > Hth,
> > >    Vladimir M. Zakharychev
> > >    N-Networks, makers of Dynamic PSP(tm)
> > >    http://www.dynamicpsp.com-Hidequoted text -
>
> > > - Show quoted text -
>
> > Thanks for your reply Vladimir,
>
> > I had hoped that the solution would have been something as easy as
> > changing the characterset in the control file to WE9ISO8859P1 however,
> > this does not seem to make any difference what so ever.
>
> > Just to satisfy my curiousity which character is the BOM?
>
> > Thanks again,
>
> > TB3101
>
> U+FEFF (that is, Unicode code point 0xFEFF, also known as "zero width ...
>
> read more »- Hide quoted text -
>
> - Show quoted text -

Thanks Vladimir,

So that file was UTF-8 encode? Received on Thu Jun 25 2009 - 07:52:35 CDT

Original text of this message