Re: SQL Loader: Comma delimited field skipping
Date: 1996/02/06
Message-ID: <4f8lkr$qbf_at_inet-nntp-gw-1.us.oracle.com>#1/1
Here is how I do it, I use pl/sql in sql. Install the following package into your schema:
create or replace package delimited
as
function word( str in varchar2, n in varchar2 ) return varchar2; pragma restrict_references( word, WNDS, RNDS );end;
/
create or replace package body delimited as
type vcArray is table of varchar2(2000) index by binary_integer;
words vcArray; last_string varchar2(4096); procedure parse is temp varchar2(4096) default last_string || ','; n number; begin for i in 1 .. 10000 loop exit when ( temp is NULL ); n := instr( temp, ',' ); words(i) := rtrim( ltrim( substr( temp, 1, n-1 ), '"' ), '"' ); temp := substr( temp, n+1 ); end loop; end parse; function word( str in varchar2, n in varchar2 ) return varchar2 is begin if ( last_string is NULL or str <> last_string ) then last_string := str; parse; end if; return words( n ); end;
end delimited;
/
This will create a new sql function that allows you to send comma delimited fields and extract the i'th field. For example, once this is installed you can:
SQL> select delimited.word( 'a,b,c,d,e,f', 3 ) from dual;
DELIMITED.WORD('A,B,C,D,E,F',3)
c
SQL> It is somewhat optimized in that subsequent calls for the same string will avoid the parse will not be done and a simple array index is all that is performed.
After you have that package, you can then write a control file like the following:
LOAD DATA
INFILE *
REPLACE
INTO TABLE DELIMITED_TEST
(
FIELD1 position(1:4096) "delimited.word(:field1,1)", FIELD3 position(1:4096) "delimited.word(:field3,3)", FIELD5 position(1:4096) "delimited.word(:field5,5)")
BEGINDATA
"Field1...","Field2...","Field3...","Field4...","Field5...","Field6..." "xField1...","xField2...","xField3...","xField4...","xField5...","xField6..." "yField1...","yField2...","yField3...","yField4...","yField5...","yField6..."
Basically, you will send the entire input line down to delimited.word (hence all fields are position(1:4096) or whatever you max line length is) and ask for the i'th column back from it. So now you can load field 1, 3, 5 or 5, 3, 1 or whatever. If you don't like the fact that it parses comma's, now you have the code and you can 'fix' it.
John L Dunn <misioror_at_gas.uug.arizona.edu> wrote:
>On Tue, 6 Feb 1996, Doug Harris wrote:
>> > We don't want to import B5--B10. How do we filter out the unwanted
>> > columns and still import B1--B4?
>>
>> This is easily done with awk and a named pipe (if you are running unix
>> ...please include OS and other relevant info such as versions with
>> specific technical questions like this).
>> --
>Whoops!! Sorry...
>We're running Windows NT Server 3.51, Oracle Workgroup Server V7.1.3
>John
Thomas Kyte
tkyte_at_us.oracle.com
Oracle Government
Received on Tue Feb 06 1996 - 00:00:00 CET