Re: Dealing with UTF8 and loading data with SQL*Loader

From: Stefan Knecht <knecht.stefan_at_gmail.com>
Date: Tue, 13 Jun 2017 02:27:16 +0700
Message-ID: <CAP50yQ84-Y_m-+0eH1qzLxaso9vm_b3EC_79yVD547Eb1AY+nw_at_mail.gmail.com>



This kind of sounds like you can't use fixed positions with a file like this.

Have you tried pre-processing it with e.g. perl, and "convert" it to so that columns are separated by a specific character instead of relying on the positions? A perl one-or-two-liner should be able to do that? Perl may not suffer from the same "oddity" that you're seeing with sql*loader.

Just an idea

Stefan

On Tue, Jun 13, 2017 at 1:09 AM, Mark J. Bobak <mark_at_bobak.net> wrote:

> Hi All,
>
> I'm running Oracle 12.1.0.2 Standard Edition
>
> We have files in UTF8 that we are trying to load in SQL*Loader. I've got
> 'CHARACTERSET' in the control file set to 'UTF8'. The database is created
> with AL32UTF8.
>
> The file contains only standard English and some Spanish characters. The
> data is loading and characters are being interpreted correctly.
>
> My problem is with the fact that the file contains fixed field length
> data, and the control file is using 'POSITION(x:y)' parameters, to define
> the fields. For lines that only contain standard, single-byte characters,
> everything works fine. However, the POSITION parameter uses byte semantics
> (regardless of the value of the 'LENGTH SEMANTICS' parameter). So, if a
> particular line has a two-byte character in it,every field following the
> field with the two-byte character is off by one. And, of course, if there
> are multiple two-byte characters in a particular line, that line will be
> off by n, where n is the number of two-byte characters in that particular
> line.
>
> I'm no SQL*Loader expert.....am I missing something obvious here? Does
> anyone have any idea how to solve this?
>
> Thanks,
>
> -Mark
>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jun 12 2017 - 21:27:16 CEST

Original text of this message