Re: SQL*Loader can a comma delimited file be imported and skip selected fields?

From: Thomas Kyte <tkyte_at_us.oracle.com>
Date: Fri, 28 May 1999 12:36:54 GMT
Message-ID: <37568dd0.5969483_at_newshost.us.oracle.com>


A copy of this was sent to "Jeff Boehlert" <boehlert_at_peak.org> (if that email address didn't require changing) On Thu, 27 May 1999 22:32:13 -0700, you wrote:

>All:
>
>Is it possible to import a comma delimited file, optionally enclosed with ",
>variable length fields and ignore select fields?
>
>Example file:
>1,234.56,"123 Elm St",ABC,345
>2,1000.45,"Main St",A,5
>
>Destination table includes only two columns ROW_ID and STREET, to those
>columns I want to map fields 1 and 3. Fields 2,4 and 5 are to be ignored.
>
>Thank You,
>Jeff Boehlert
>jboehler_at_ctron.com

sqlldr by itself cannot do this with delimited data until Oracle8i, release 8.1 (added a keyword FILLER for this).

If you are using UNIX, using "cut" on the file and piping the results of this into a named pipe (and having sqlldr read that pipe) is pretty efficient. For example:

$ mknod sqlldr.dat p
$ cut -f3- -d, < test.dat > sqlldr.dat & [1] 4946

$ sqlldr userid=tkyte/tkyte control=sqlldr.ctl data=sqlldr.dat

would just let sqlldr have at columns 3 on in the file test.dat (separated by commas)...

Also, you can do it with pl/sql and sqlldr pretty easily. Consider the following .ctl file:

LOAD DATA
INFILE *
REPLACE
INTO TABLE DELIMITED_TEST
(
  FIELD1 position(1:4096) "delimited.word(:field1,1,NULL,chr(9))",   FIELD2 position(1:1) "delimited.word(:field1,4,NULL,chr(9))" )
BEGINDATA

John Doe	JunkField#1	MoreJunk	123 Main Street
Mrs. ReallyLongLastName	xxx	short	345 River Road


This will load the fields 1 and 4 from the above (separated by tabs). The delimited package follows below. Note the use of field1 in all of the calls to delimited.word. This will load a file with lines upto 4k in length (change the 4096 to bigger to allow for longer linesizes, upto 32k )The main routine, word, takes as inputs:

the entire input line.
What field from the line you want.
What character MIGHT enclose fields (defaults to a single quote). What character separates fields (defaults to a comma).

The word subroutine looks at the line, and if it is different from the one it already has parsed, breaks it up into 'words' based on the enclosure and separator.

If the line is the same (as it will be for a single insert of >1 column), it simply returns the array entry corresponding to that 'word'.

Hope this helps

create or replace package delimited
as

	function word( p_str 			in varchar2, 
				   p_n   			in varchar2, 
				   p_enclosed_by 	in varchar2 default '''', 
				   p_separated_by   in varchar2 default ',' )
	return varchar2;

	pragma restrict_references( word, WNDS, RNDS );
end;
/

create or replace package body delimited as

        type vcArray is table of varchar2(2000) index by binary_integer;

	g_words			vcArray;
	g_empty			vcArray;
	g_last_string	varchar2(4096);

	function de_quote( p_str in varchar2, p_enc_by in varchar2 ) return varchar2
	is
		l_enc_len number default nvl(length(p_enc_by),0);
		l_str_len number default nvl(length(p_str),0);
		l_tmp_str varchar2(200) default
				  substr( p_str, l_enc_len+1, l_str_len-2*l_enc_len );
	begin
		return replace( l_tmp_str, p_enc_by||p_enc_by, p_enc_by );
	end de_quote;


	procedure parse( p_str 	 in varchar2, 
					 p_delim in varchar2, 
					 p_sep   in varchar2 )
	is
		l_n		number  default 1;
		l_in_quote	boolean default FALSE;
		l_ch		char(1);
		l_len		number default nvl(length( p_str ),0);
	begin
		if ( l_len = 0 ) then
			return;
		end if;

		g_words := g_empty;
		g_words(1) := NULL;

		for i in 1 .. l_len loop
			l_ch := substr( p_str, i, 1 );
			if ( l_ch = p_delim ) then
				l_in_quote := NOT l_in_quote;
			end if;
			if ( l_ch = p_sep AND NOT l_in_quote ) then
				l_n := l_n + 1;
				g_words(l_n) := NULL;
			else
				g_words(l_n) := g_words(l_n)||l_ch;
			end if;
		end loop;

		for i in 1 .. l_n loop
			g_words(i) := de_quote( g_words(i), p_delim );
		end loop;
	end parse;


	function word( p_str 			in varchar2, 
				   p_n   			in varchar2, 
				   p_enclosed_by 	in varchar2 default '''', 
				   p_separated_by   in varchar2 default ',' ) return varchar2
	is
	begin
		if ( g_last_string is NULL or p_str <> g_last_string ) then
			g_last_string := p_str;
			parse( p_str, p_enclosed_by, p_separated_by );
		end if;
		return g_words( p_n );
	end;

end delimited;
/

See http://www.oracle.com/ideveloper/ for my column 'Digging-in to Oracle8i'...  

Thomas Kyte
tkyte_at_us.oracle.com
Oracle Service Industries
Reston, VA USA

-- 
Opinions are mine and do not necessarily reflect those of Oracle Corporation
Received on Fri May 28 1999 - 14:36:54 CEST

Original text of this message