Re: SQL Loader: Comma delimited field skipping

From: Thomas J Kyte <tkyte_at_us.oracle.com>
Date: 1996/02/06
Message-ID: <4f8lkr$qbf_at_inet-nntp-gw-1.us.oracle.com>#1/1


Here is how I do it, I use pl/sql in sql. Install the following package into your schema:

create or replace package delimited
as

	function word( str in varchar2, n in varchar2 )
	return varchar2;

	pragma restrict_references( word, WNDS, RNDS );
end;
/

create or replace package body delimited as

        type vcArray is table of varchar2(2000) index by binary_integer;

	words		vcArray;
	last_string	varchar2(4096);

	procedure parse
	is
		temp	varchar2(4096) default last_string || ',';	
		n		number;
	begin
		for i in 1 .. 10000 loop
			exit when ( temp is NULL );
			n := instr( temp, ',' );
			words(i) := rtrim( ltrim( substr( temp, 1, n-1 ), '"' ), '"' );
			temp := substr( temp, n+1 );
		end loop;
	end parse;

	function word( str in varchar2, n in varchar2 ) return varchar2
	is
	begin
		if ( last_string is NULL or str <> last_string ) then
			last_string := str;
			parse;
		end if;
		return words( n );
	end;

end delimited;
/

This will create a new sql function that allows you to send comma delimited fields and extract the i'th field. For example, once this is installed you can:

SQL> select delimited.word( 'a,b,c,d,e,f', 3 ) from dual;

DELIMITED.WORD('A,B,C,D,E,F',3)



c

SQL> It is somewhat optimized in that subsequent calls for the same string will avoid the parse will not be done and a simple array index is all that is performed.

After you have that package, you can then write a control file like the following:

LOAD DATA
INFILE *
REPLACE
INTO TABLE DELIMITED_TEST
(

  FIELD1 position(1:4096) "delimited.word(:field1,1)", 
  FIELD3 position(1:4096) "delimited.word(:field3,3)", 
  FIELD5 position(1:4096) "delimited.word(:field5,5)"
)
BEGINDATA
"Field1...","Field2...","Field3...","Field4...","Field5...","Field6..."
"xField1...","xField2...","xField3...","xField4...","xField5...","xField6..."
"yField1...","yField2...","yField3...","yField4...","yField5...","yField6..."


Basically, you will send the entire input line down to delimited.word (hence all fields are position(1:4096) or whatever you max line length is) and ask for the i'th column back from it. So now you can load field 1, 3, 5 or 5, 3, 1 or whatever. If you don't like the fact that it parses comma's, now you have the code and you can 'fix' it.

John L Dunn <misioror_at_gas.uug.arizona.edu> wrote:

>On Tue, 6 Feb 1996, Doug Harris wrote:
 

>> > We don't want to import B5--B10. How do we filter out the unwanted
>> > columns and still import B1--B4?
>>
>> This is easily done with awk and a named pipe (if you are running unix
>> ...please include OS and other relevant info such as versions with
>> specific technical questions like this).
>> --
 

>Whoops!! Sorry...
 

>We're running Windows NT Server 3.51, Oracle Workgroup Server V7.1.3
 

>John

Thomas Kyte
tkyte_at_us.oracle.com
Oracle Government Received on Tue Feb 06 1996 - 00:00:00 CET

Original text of this message