Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: SQL*Loader - import csv

Re: SQL*Loader - import csv

From: Thomas Kyte <tkyte_at_us.oracle.com>
Date: Thu, 02 Jul 1998 13:45:13 GMT
Message-ID: <359b8ec8.2146616@192.86.155.100>


A copy of this was sent to "Martin Dick" <martin.dick_at_LIKEABLESPAMneis.com.au> (if that email address didn't require changing) On Thu, 2 Jul 1998 14:44:06 +1000, you wrote:

>Hi,
> I've been trying to do an import through SQL Loader of a CSV file that
>contains too many columns for the table I'm importing into.
>I.e. I have a table of 5 columns, the datafile has 8 columns.
>I need to skip the 1st, 3rd and 4th columns in the .CSV file.
>
>At the moment I import the CSV into a temporary table, then do an insert
>from there.
>
>Is there anyway to alter the control file to make it skip the required
>columns?
>
>I can't alter the format of the CSV, the commas are the only column
>identifiers, no " and the data is variable length.
>
>
>Does anyone out there have a better solution???
>
>Thanks,
>Martin

You can do it with pl/sql and sqlldr pretty easily if you are using conventional path loads. Consider the following .ctl file:

LOAD DATA
INFILE *
REPLACE
INTO TABLE DELIMITED_TEST
(
  FIELD1 position(1:4096) "delimited.word(:field1,1,NULL,chr(9))",   FIELD2 position(1:1) "delimited.word(:field1,4,NULL,chr(9))" )
BEGINDATA

John Doe	JunkField#1	MoreJunk	123 Main Street
Mrs. ReallyLongLastName	xxx	short	345 River Road


This will load the fields 1 and 4 from the above (separated by tabs). The delimited package follows below. Note the use of field1 in all of the calls to delimited.word. This will load a file with lines upto 4k in length (change the 4096 to bigger to allow for longer linesizes, upto 32k )The main routine, word, takes as inputs:

the entire input line.
What field from the line you want.
What character MIGHT enclose fields (defaults to a single quote). What character separates fields (defaults to a comma).

The word subroutine looks at the line, and if it is different from the one it already has parsed, breaks it up into 'words' based on the enclosure and separator.

If the line is the same (as it will be for a single insert of >1 column), it simply returns the array entry corresponding to that 'word'.

Hope this helps

create or replace package delimited
as

	function word( p_str 			in varchar2, 
				   p_n   			in varchar2, 
				   p_enclosed_by 	in varchar2 default '''', 
				   p_separated_by   in varchar2 default ',' )
	return varchar2;

	pragma restrict_references( word, WNDS, RNDS );
end;
/

create or replace package body delimited as

        type vcArray is table of varchar2(2000) index by binary_integer;

	g_words			vcArray;
	g_empty			vcArray;
	g_last_string	varchar2(4096);

	function de_quote( p_str in varchar2, p_enc_by in varchar2 ) return varchar2
	is
		l_enc_len number default nvl(length(p_enc_by),0);
		l_str_len number default nvl(length(p_str),0);
		l_tmp_str varchar2(200) default
				  substr( p_str, l_enc_len+1, l_str_len-2*l_enc_len );
	begin
		return replace( l_tmp_str, p_enc_by||p_enc_by, p_enc_by );
	end de_quote;


	procedure parse( p_str 	 in varchar2, 
					 p_delim in varchar2, 
					 p_sep   in varchar2 )
	is
		l_n		number  default 1;
		l_in_quote	boolean default FALSE;
		l_ch		char(1);
		l_len		number default nvl(length( p_str ),0);
	begin
		if ( l_len = 0 ) then
			return;
		end if;

		g_words := g_empty;
		g_words(1) := NULL;

		for i in 1 .. l_len loop
			l_ch := substr( p_str, i, 1 );
			if ( l_ch = p_delim ) then
				l_in_quote := NOT l_in_quote;
			end if;
			if ( l_ch = p_sep AND NOT l_in_quote ) then
				l_n := l_n + 1;
				g_words(l_n) := NULL;
			else
				g_words(l_n) := g_words(l_n)||l_ch;
			end if;
		end loop;

		for i in 1 .. l_n loop
			g_words(i) := de_quote( g_words(i), p_delim );
		end loop;
	end parse;


	function word( p_str 			in varchar2, 
				   p_n   			in varchar2, 
				   p_enclosed_by 	in varchar2 default '''', 
				   p_separated_by   in varchar2 default ',' ) return varchar2
	is
	begin
		if ( g_last_string is NULL or p_str <> g_last_string ) then
			g_last_string := p_str;
			parse( p_str, p_enclosed_by, p_separated_by );
		end if;
		return g_words( p_n );
	end;

end delimited;
/  

Thomas Kyte
tkyte_at_us.oracle.com
Oracle Government
Herndon VA  

http://govt.us.oracle.com/ -- downloadable utilities  



Opinions are mine and do not necessarily reflect those of Oracle Corporation  

Anti-Anti Spam Msg: if you want an answer emailed to you, you have to make it easy to get email to you. Any bounced email will be treated the same way i treat SPAM-- I delete it. Received on Thu Jul 02 1998 - 08:45:13 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US