Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: Guys - loading comma-delimited - dynamically generate table

RE: Guys - loading comma-delimited - dynamically generate table

From: <Paula_Stankus_at_doh.state.fl.us>
Date: Fri, 16 May 2003 07:41:47 -0800
Message-ID: <F001.0059AAFC.20030516074147@fatcity.com>


Thanks,  

I haven't given up on using awk either yet.  

Oracle OCP DBA

-----Original Message-----
Sent: Friday, May 16, 2003 3:14 AM
To: ORACLE-L_at_fatcity.com; Paula_Stankus_at_doh.state.fl.us

Paula  

I've just recently written a little awk script to do what I think you're after... I might get shot for using awk ;) but at least I can guarantee that it will be installed on any unix box I come to.  

Regards
David Lord  

#!/usr/bin/nawk -f
############################################################################
####
# NAME
#
# loadcsv.awk - generate scripts to create a table and load a CSV file into
it
#
# SYNOPSIS
#
# loadcsv.awk <filename>.csv
#
# DESCRIPTION
#
# Generates two output files: -
#
# 1) A SQL script named <filename>.sql (where <filename> is the basename of
the
# input file) containing statements to create a table suitable for loading
the
# data into. The table is named <filename>. All columns are varchar2's
large
# enough to hold the maximum length of the column. The names of the columns
# are taken from the first line of the CSV file, tidied up to remove leading
# and trailing blanks and replace spaces with underscores. If any column in
the
# CSV file is empty, the column is not added to the table.
#
# 2) A SQL*Loader control file named <filename>.ctl for loading the data
into
# the table. Fields which are empty are specified as FILLER fields.
#
# OPTIONS
#
# filename.csv - The name of a csv file. It must have the column names
# as the first line.
#
# EXAMPLE
#
# First, run the script: -
#
# % loadcsv.awk myfile.csv
#
# Then, login to sqlplus and run the SQL script to create the table: -
#
# SQL> @myfile.sql
#
# Finally, use sqlldr to load the data into the table: -
#
# % sqlldr scott/tiger myfile.ctl data = myfile.csv
#
# KNOWN ISSUES
#
# This script does not check that the length of the columns is sensible.
So,
# if any cell contains more than 4000 characters, the create table command
will
# fail.
#
# AUTHOR
#
# David Lord, April 2003
############################################################################
####
 

BEGIN {
  # Field separator is a comma
  FS = ",";  

  # Get the name of the input file and strip the .csv   if(ARGC > 1) {
    filename = ARGV[1];
    gsub(/\.csv/, "", filename);
  } else {
    filename = FILENAME;
  }  

  # The name of the 'create table' file
  sqlfilename = filename ".sql";  

  # The name of the sqlloader control file   ctlfilename = filename ".ctl";
}  

# The first line contains the column names
NR == 1 {
  num_cols = NF;
  for(i = 1; i <= num_cols; i++) {
    # Tidy up the column name

    gsub(/ /, "_", $i);
    gsub(/_*$/, "", $i);
    gsub(/^_*/, "", $i);
 

    # Add to an array of column names
    col_names[i] = tolower($i);  

    # Initialise the column length
    col_lengths[i] = 0;
  }
}  

# Find the maximum length of each column
NR > 1 {
  i = 1;
  j = 1;
  while(i <= NF) {
    col_length = 0;
    if($i ~ /^ *\"/) {

      gsub(/^ *\"/, "", $i);
      while(i <= NF && $i !~ /\" *$/) {
        col_length += length($i) + 1;
        i++;
      }
      gsub(/\" *$/, "", $i);

    }
    col_length += length($i);
    if(col_length > col_lengths[j]) {
      col_lengths[j] = col_length;
    }
    i++;
    j++;
  }
}  

# Print the scripts

END {
  # Begin the create table statement
  printf("create table " filename "(") > sqlfilename;   firstcol = 1;  

  # Loop over the columns
  for(i = 1; i <= num_cols; i++) {
    col_name = col_names[i];
    col_length = col_lengths[i];  

    # Ignore empty columns
    if(col_length > 0) {

      # Put a comma on the end of every line but the first
      if(firstcol == 1) {
        firstcol = 0;
      } else {
        printf(",") >> sqlfilename;
      }
 
      # Print the column definition
      printf("\n\t%-30s %s", col_name, "varchar2(" col_length ")") >>
sqlfilename;

    }
  }  

  # Finish the create table statement
  printf("\n);\n") >> sqlfilename;  

  # Begin the control file

  printf("options (skip=1)") > ctlfilename;
  printf("\nload data") >> ctlfilename;
  printf("\nappend") >> ctlfilename;
  printf("\ninto table %s", filename) >> ctlfilename;
  printf("\nfields terminated by \",\" optionally enclosed by '\"'") >>
ctlfilename;
  printf("\ntrailing nullcols") >> ctlfilename;   printf("\n(") >> ctlfilename;  

  # Loop over the columns
  firstcol = 1;
  for(i = 1; i <= num_cols; i++) {
    col_name = col_names[i];
    col_length = col_lengths[i];     

    # Put a comma on the end of every line but the first     if(firstcol == 1) {
      firstcol = 0;
    } else {
      printf(",") >> ctlfilename;
    }     

    # Print the column definition
    printf("\n\t%s", col_name) >> ctlfilename;  

    # If the column is empty use a filler field     if(col_length == 0) {
      printf(" filler") >> ctlfilename;
    }
  }  

  # Finish the control file
  printf("\n)\n") >> ctlfilename;
}

-----Original Message-----
Sent: 15 May 2003 15:42
To: Multiple recipients of list ORACLE-L

Am I dreaming, hoping? Wasn't it possible to take a comma-dlt. text file and from the column list dynamically generate a table then load the data. Hmmmm. I could write that pretty easily but problem is wouldn't necessarily know datatypes and sizes. Okay - I guess I am dreaming.

Data Warehouse builder versus data migration tool:

-when would you use one or the other?
-is the juice worth the squeeze or am I better off doing myself as I have last 6 years
-extra purchase - which one?

Thanks,
Paula



This message (including any attachments) is confidential and may be legally privileged. If you are not the intended recipient, you should not disclose, copy or use any part of it - please delete all copies immediately and notify the Hays Group Email Helpdesk at email.helpdesk_at_hays.com
Any information, statements or opinions contained in this message (including any attachments) are given by the author. They are not given on behalf of Hays unless subsequently confirmed by an individual other than the author who is duly authorised to represent Hays.

A member of the Hays plc group of companies. Hays plc is registered in England and Wales number 2150950. Registered Office Hays House Millmead Guildford Surrey GU2 4HJ.


-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: 
  INET: Paula_Stankus_at_doh.state.fl.us

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Fri May 16 2003 - 10:41:47 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US