Re: SQL*Loader can a comma delimited file be imported and skip selected fields?

From: Jeff Boehlert <boehlert_at_peak.org>
Date: Fri, 28 May 1999 01:56:44 -0700
Message-ID: <7ipnk8$13r$1_at_bashir.peak.org>


[Quoted] Thank you Tom. I was sure someone had come up with some clever way to do this. I am not working in a Unix environmnet so I have opted to try your PL/SQL approach. As mentioned my file is comma delimited, optionally enclosed with quotes. When I try my file with this approach the parsing appears to have a few problems. See details here:

LOAD DATA
INFILE *
REPLACE
INTO TABLE DELIMITED_TEST
(
  FIELD1 position(1:4096) "delimited.word(:field1,1,chr(34),chr(44))",   FIELD2 position(1:1) "delimited.word(:field1,4,chr(34),chr(44))" )
BEGINDATA

John Jones,"Elm St",Junk,123 Main Street       
Jeff Boehlert,abcdef," 5555","po box, 1848, abc"               


Results:



select * from delimited_test;
/
FIELD1                                                                                              

--------------------------------------------------
FIELD2
--------------------------------------------------
ohn Jone 23 Main Stree eff Boehler po box, 1848, abc

[Quoted] Fields having leading/trailing characters stripped??

Thanx for your help, once I have this tuned it will be a great help. It sure would be nice if loader allowed us to map an inbound column to the bit bucket!!

      _  __________________________________________  _
     / )| Jeff Boehl
ert        boehlert_at_peak.org   |( \
    / / |             Why run 26.2 ?               | \ \
  _( (_ |      Because it hurts so good !!         | _) )_
 (((\ \>|_/->__________________________________<-\_|</ /)))
 (\\\\ \_/ /    o      New York City      o     \ \_/ ////)
  \       /    < \      Marine Corp      < \     \       /
   \    _/    _/\        Portland       _/\       \_    /
   /   /       /          Big Sur        /          \   \
  /___/                   Seattle                    \___\
                       San Francisco
                    Rock-n-Roll Marathon
                    1999 Boston Marathon
 =================================================================
 Utica, NY home of the best 15k in the country - The Boilermaker
        And of course the Distance Running Hall of Fame
 =================================================================


Thomas Kyte wrote in message <37568dd0.5969483_at_newshost.us.oracle.com>...

>A copy of this was sent to "Jeff Boehlert" <boehlert_at_peak.org>
>(if that email address didn't require changing)
>On Thu, 27 May 1999 22:32:13 -0700, you wrote:
>
>>All:
>>
>>Is it possible to import a comma delimited file, optionally enclosed with
",
[Quoted] >>variable length fields and ignore select fields?
>>
>>Example file:
>>1,234.56,"123 Elm St",ABC,345
>>2,1000.45,"Main St",A,5
>>
>>Destination table includes only two columns ROW_ID and STREET, to those
>>columns I want to map fields 1 and 3. Fields 2,4 and 5 are to be ignored.
>>
>>Thank You,
[Quoted] >>Jeff Boehlert
>>jboehler_at_ctron.com
>
>
>
>sqlldr by itself cannot do this with delimited data until Oracle8i, release
8.1
>(added a keyword FILLER for this).
>
>If you are using UNIX, using "cut" on the file and piping the results of
this
>into a named pipe (and having sqlldr read that pipe) is pretty efficient. For
>example:
>
>$ mknod sqlldr.dat p
>$ cut -f3- -d, < test.dat > sqlldr.dat &
>[1] 4946
>
>$ sqlldr userid=tkyte/tkyte control=sqlldr.ctl data=sqlldr.dat
>
>would just let sqlldr have at columns 3 on in the file test.dat (separated
by
>commas)...
>
>Also, you can do it with pl/sql and sqlldr pretty easily.  Consider the
>following .ctl file:
>
>LOAD DATA
>INFILE *
>REPLACE
>INTO TABLE DELIMITED_TEST
>( 
>  FIELD1 position(1:4096) "delimited.word(:field1,1,NULL,chr(9))",
>  FIELD2 position(1:1)   "delimited.word(:field1,4,NULL,chr(9))"
>)
>BEGINDATA
>John Doe JunkField#1 MoreJunk 123 Main Street
>Mrs. ReallyLongLastName xxx short 345 River Road
>
>
>This will load the fields 1 and 4 from the above (separated by tabs).  The
>delimited package follows below.  Note the use of field1 in all of the calls to
>delimited.word.  This will load a file with lines upto 4k in length (change the
>4096 to bigger to allow for longer linesizes, upto 32k )The main routine, word,
>takes as inputs:
>
>the entire input line.
>What field from the line you want.
>What character MIGHT enclose fields (defaults to a single quote).
>What character separates fields (defaults to a comma).
>
>The word subroutine looks at the line, and if it is different from the one it
>already has parsed, breaks it up into 'words' based on the enclosure and
>separator.  
>
>If the line is the same (as it will be for a single insert of >1 column), it
>simply returns the array entry corresponding to that 'word'.
>
>Hope this helps
>
>
>
>create or replace package delimited
>as
> function word( p_str in varchar2, 
>    p_n   in varchar2, 
>    p_enclosed_by in varchar2 default '''', 
>    p_separated_by   in varchar2 default ',' )
> return varchar2;
>
> pragma restrict_references( word, WNDS, RNDS );
>end;
>/
>
>create or replace package body delimited
>as
> type vcArray is table of varchar2(2000) index by binary_integer;
>
> g_words vcArray;
> g_empty vcArray;
> g_last_string varchar2(4096);
>
> function de_quote( p_str in varchar2, p_enc_by in varchar2 ) return varchar2
> is
> l_enc_len number default nvl(length(p_enc_by),0);
> l_str_len number default nv

l(length(p_str),0);
> l_tmp_str varchar2(200) default
>   substr( p_str, l_enc_len+1, l_str_len-2*l_enc_len );
> begin
> return replace( l_tmp_str, p_enc_by||p_enc_by, p_enc_by );
> end de_quote;
>
>
> procedure parse( p_str in varchar2,
> p_delim in varchar2,
> p_sep   in varchar2 )
> is
> l_n number  default 1;
> l_in_quote boolean default FALSE;
> l_ch char(1);
> l_len number default nvl(length( p_str ),0);
> begin
> if ( l_len = 0 ) then
> return;
> end if;
>
> g_words := g_empty;
> g_words(1) := NULL;
>
> for i in 1 .. l_len loop
> l_ch := substr( p_str, i, 1 );
> if ( l_ch = p_delim ) then
> l_in_quote := NOT l_in_quote;
> end if;
> if ( l_ch = p_sep AND NOT l_in_quote ) then
> l_n := l_n + 1;
> g_words(l_n) := NULL;
> else
> g_words(l_n) := g_words(l_n)||l_ch;
> end if;
> end loop;
>
> for i in 1 .. l_n loop
> g_words(i) := de_quote( g_words(i), p_delim );
> end loop;
> end parse;
>
>
> function word( p_str in varchar2,
>    p_n   in varchar2,
>    p_enclosed_by in varchar2 default '''',
>    p_separated_by   in varchar2 default ',' ) return varchar2
> is
> begin
> if ( g_last_string is NULL or p_str <> g_last_string ) then
> g_last_string := p_str;
> parse( p_str, p_enclosed_by, p_separated_by );
> end if;
> return g_words( p_n );
> end;
>
>end delimited;
>/
>
>
>See http://www.oracle.com/ideveloper/ for my column 'Digging-in to
Oracle8i'...
>
>Thomas Kyte
>tkyte_at_us.oracle.com
>Oracle Service Industries
>Reston, VA   USA
>
>--
>Opinions are mine and do not necessarily reflect those of Oracle
Corporation Received on Fri May 28 1999 - 10:56:44 CEST

Original text of this message