Oracle FAQ | Your Portal to the Oracle Knowledge Grid |
Home -> Community -> Mailing Lists -> Oracle-L -> RE: Guys - loading comma-delimited - dynamically generate table
This message is in MIME format. Since your mail reader does not understand
this format, some or all of this message may not be legible.
------_=_NextPart_001_01C31B7A.B8FDD9B0
Content-Type: text/plain; charset="iso-8859-1"
Paula
I've just recently written a little awk script to do what I think you're after... I might get shot for using awk ;) but at least I can guarantee that it will be installed on any unix box I come to.
Regards
David Lord
#!/usr/bin/nawk -f
############################################################################
####
# NAME
#
# loadcsv.awk - generate scripts to create a table and load a CSV file into
it
#
# SYNOPSIS
#
# loadcsv.awk <filename>.csv
#
# DESCRIPTION
#
# Generates two output files: -
#
# 1) A SQL script named <filename>.sql (where <filename> is the basename of
the
# input file) containing statements to create a table suitable for loading
the
# data into. The table is named <filename>. All columns are varchar2's
large
# enough to hold the maximum length of the column. The names of the columns
# are taken from the first line of the CSV file, tidied up to remove leading
# and trailing blanks and replace spaces with underscores. If any column in
the
# CSV file is empty, the column is not added to the table.
#
# 2) A SQL*Loader control file named <filename>.ctl for loading the data
into
# the table. Fields which are empty are specified as FILLER fields.
#
# OPTIONS
#
# filename.csv - The name of a csv file. It must have the column names
# as the first line.
#
# EXAMPLE
#
# First, run the script: -
#
# % loadcsv.awk myfile.csv
#
# Then, login to sqlplus and run the SQL script to create the table: -
#
# SQL> @myfile.sql
#
# Finally, use sqlldr to load the data into the table: -
#
# % sqlldr scott/tiger myfile.ctl data = myfile.csv
#
# KNOWN ISSUES
#
# This script does not check that the length of the columns is sensible.
So,
# if any cell contains more than 4000 characters, the create table command
will
# fail.
#
# AUTHOR
#
# David Lord, April 2003
############################################################################
####
BEGIN {
# Field separator is a comma
FS = ",";
# Get the name of the input file and strip the .csv
if(ARGC > 1) {
filename = ARGV[1];
gsub(/\.csv/, "", filename);
} else {
filename = FILENAME;
}
# The name of the 'create table' file
sqlfilename = filename ".sql";
# The name of the sqlloader control file
ctlfilename = filename ".ctl";
}
# The first line contains the column names
NR == 1 {
num_cols = NF;
for(i = 1; i <= num_cols; i++) {
# Tidy up the column name
gsub(/ /, "_", $i); gsub(/_*$/, "", $i); gsub(/^_*/, "", $i);
# Add to an array of column names
col_names[i] = tolower($i);
# Initialise the column length
col_lengths[i] = 0;
}
}
# Find the maximum length of each column
NR > 1 {
i = 1;
j = 1;
while(i <= NF) {
col_length = 0;
if($i ~ /^ *\"/) {
gsub(/^ *\"/, "", $i); while(i <= NF && $i !~ /\" *$/) { col_length += length($i) + 1; i++; } gsub(/\" *$/, "", $i);
# Print the scripts
END {
# Begin the create table statement
printf("create table " filename "(") > sqlfilename;
firstcol = 1;
# Loop over the columns
for(i = 1; i <= num_cols; i++) {
col_name = col_names[i];
col_length = col_lengths[i];
# Ignore empty columns
if(col_length > 0) {
# Put a comma on the end of every line but the first if(firstcol == 1) { firstcol = 0; } else { printf(",") >> sqlfilename; } # Print the column definition printf("\n\t%-30s %s", col_name, "varchar2(" col_length ")") >>sqlfilename;
}
}
# Finish the create table statement
printf("\n);\n") >> sqlfilename;
# Begin the control file
printf("options (skip=1)") > ctlfilename; printf("\nload data") >> ctlfilename; printf("\nappend") >> ctlfilename; printf("\ninto table %s", filename) >> ctlfilename; printf("\nfields terminated by \",\" optionally enclosed by '\"'") >>ctlfilename;
# Loop over the columns
firstcol = 1;
for(i = 1; i <= num_cols; i++) {
col_name = col_names[i];
col_length = col_lengths[i];
# Put a comma on the end of every line but the first
if(firstcol == 1) {
firstcol = 0;
} else {
printf(",") >> ctlfilename;
}
# Print the column definition
printf("\n\t%s", col_name) >> ctlfilename;
# If the column is empty use a filler field
if(col_length == 0) {
printf(" filler") >> ctlfilename;
}
}
# Finish the control file
printf("\n)\n") >> ctlfilename;
}
-----Original Message-----
Sent: 15 May 2003 15:42
To: Multiple recipients of list ORACLE-L
Am I dreaming, hoping? Wasn't it possible to take a comma-dlt. text file and from the column list dynamically generate a table then load the data. Hmmmm. I could write that pretty easily but problem is wouldn't necessarily know datatypes and sizes. Okay - I guess I am dreaming.
Data Warehouse builder versus data migration tool:
-when would you use one or the other?
-is the juice worth the squeeze or am I better off doing myself as I have
last 6 years
-extra purchase - which one?
Thanks,
Paula
A member of the Hays plc group of companies. Hays plc is registered in England and Wales number 2150950. Registered Office Hays House Millmead Guildford Surrey GU2 4HJ.
------_=_NextPart_001_01C31B7A.B8FDD9B0
Content-Type: text/html; charset="iso-8859-1"
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML><HEAD> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1"> <TITLE>RE: Guys - loading comma-delimited - dynamically generate table</TITLE> <META content="MSHTML 6.00.2800.1106" name=GENERATOR></HEAD><BODY>
-f<BR>################################################################################<BR>#NAME<BR>#<BR># loadcsv.awk - generate scripts to create a table and load a CSV file into it<BR>#<BR># SYNOPSIS<BR>#<BR># loadcsv.awk <filename>.csv<BR>#<BR># DESCRIPTION<BR>#<BR># Generates two output files: -<BR>#<BR># 1) A SQL script named <filename>.sql (where <filename> is the basename of the<BR># input file) containing statements to create a table suitable for loading the<BR># data into. The table is named <filename>. All columns are varchar2's large<BR># enough to hold the maximum length of the column. The names of the columns<BR># are taken from the first line of the CSV file, tidied up to remove leading<BR># and trailing blanks and replace spaces with underscores. If any column in the<BR># CSV file is empty, the column is not added to the table.<BR>#<BR># 2) A SQL*Loader control file named <filename>.ctl for loading the data into <BR># the table. Fields which are empty are specified as FILLER fields.<BR>#<BR># OPTIONS<BR>#<BR># filename.csv - The name of a csv file. It must have the column
2003<BR>################################################################################</FONT></SPAN></DIV> <DIV> </DIV> <DIV><SPAN class=154191307-16052003><FONT face="Courier New" size=2>BEGIN {<BR> # Field separator is a comma<BR> FS = ",";</FONT></SPAN></DIV><DIV> </DIV>
}<BR> }</FONT></SPAN></DIV> <DIV> </DIV> <DIV><SPAN class=154191307-16052003><FONT face="Courier New" size=2> #Finish the create table statement<BR> printf("\n);\n") >> sqlfilename;</FONT></SPAN></DIV>
num_cols; i++) {<BR> col_name = col_names[i];<BR> col_length = col_lengths[i];<BR> <BR> # Put a comma onthe end of every line but the first<BR> if(firstcol == 1) {<BR> firstcol = 0;<BR> } else {<BR> printf(",") >> ctlfilename;<BR> }<BR> <BR> # Print the column definition<BR> printf("\n\t%s", col_name) >> ctlfilename;</FONT></SPAN></DIV> <DIV> </DIV>
**********************************************************************<BR>This message (including any attachments) is confidential and may be <BR> legally privileged. If you are not the intended recipient, you should <BR> not disclose, copy or use any part of it - please delete all copies <BR> immediately and notify the Hays Group Email Helpdesk at<BR> email.helpdesk_at_hays.com<BR>
**********************************************************************<BR></FONT></CODE>
------_=_NextPart_001_01C31B7A.B8FDD9B0--
-- Please see the official ORACLE-L FAQ: http://www.orafaq.net -- Author: Lord, David - CSG INET: david.lord_at_hays.com Fat City Network Services -- 858-538-5051 http://www.fatcity.com San Diego, California -- Mailing list and web hosting services --------------------------------------------------------------------- To REMOVE yourself from this mailing list, send an E-Mail message to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in the message BODY, include a line containing: UNSUB ORACLE-L (or the name of mailing list you want to be removed from). You may also send the HELP command for other information (like subscribing).Received on Fri May 16 2003 - 03:17:09 CDT