Re: OPO - filling in tables - how?

From: Kasu Sista <sista_at_links.com>
Date: 1996/04/06
Message-ID: <4k5hgg$8al_at_nntp.interaccess.com>

>In article <4i9g76$mtn_at_news.nstn.ca>, rob_pksa_at_fox.nstn.ca wrote:
>
>> I have a lot of data I need to put into some tables in OPO. Is there
an
>> automatic way of doing this? SQL*loader won't help because we are
>> developing on a standalone machine using the Blaze database.

Here is a document from Oracle that has some info. on loading text files.

Kasu Sista
Links Technology Corp.

Document ID:         9289058.61
Title:               APPLE&WIN: Loading a Text File Using Oracle Power 
Objects
Creation Date:       1 September 1995 
Last Revision Date:  20 March 1996
Revision Number:     2
Product:             Oracle Power Objects 
Product Version:     1.0.4 
Platform:            MACINTOSH AND WINDOWS
Information Type:    ADVISORY 
Impact:              HIGH 
Abstract:            This document provides information and Oracle Basic 
                     examples on how to load data from a formatted flat 
                     text file using Power Objects. 
Keywords:            POWER;OBJECTS;OPO;SQLLOAD;LOADER;TEXT;FILE;LOAD 
_________________________________________________________________________

___

Introduction

This document provides guidance and Oracle Basic examples which allow you to load data from a flat text file to a Blaze database using Oracle Power Objects. This document assumes that you are comfortable programming with Oracle Basic and that Oracle Power Objects v1.0 has already been installed.

The information and examples provided here are for illustrative purposes, and are by no means the best or the most efficient way this task can be accomplished. Feel free to adapt the examples where necessary to suit your own requirements.

Text File Formats

There are a variety of formats with which data can be stored in flat text files. Only the two most common formats, comma-delimited data and fixed-field data, will be discussed here. The examples provided here for these two formats should provide enough ideas for working with other text file formats.

The (comma) delimited text file format is perhaps the most common format, and often can be easily generated by a variety of applications. For example, information in a Microsoft Excel spreadsheet can be saved in CSV format, which is just a comma-delimited text file with data optionally enclosed in double quotes to reduce ambiguity. Consider the following table in an Oracle7 database:

     ID    NAME               HIREDATE 
     --    ---------------    --------- 
     01    SMITH, JOHN        01-Jan-90 
     02    THE "TIGER"        02-Feb-87 
     03    SCOTT              14-Jul-45

In a comma-delimited (w/ double quotes where necessary), the above data would appear as:

     01,"SMITH, JOHN",01-Jan-90 
     02,"THE ""TIGER""",02-Feb-87 
     03,SCOTT,14-Jul-45

You will notice that enclosing double quotes were added to fields which contained either a comma (the delimiter) or double quotes. Also notice that with 'THE "TIGER"', the double quote in the data were replaced with two pairs of double quotes ("") to indicate that they are part of the data. Another common delimiter used in place of the comma is the TAB character.

The fixed-field (or fixed-width) text file format is another important format. With a fixed-field file, a START position and END position (or field WIDTH) is specified for each field to indicate how the data is to be interpreted. In a fixed-field text file, the data above may appear as:

    01   SMITH, JOHN    01-Jan-90 
    02   THE "TIGER"    02-Feb-87 
    03   SCOTT          14-Jul-45

The format of this particular file can be described as:

    ID         POSITION  1 to  4 
    NAME       POSITION  6 to 18

HIREDATE POSITION 21 to 30

Although the fixed-field format is not as common as the delimited format, there are some occasions where it is easier or more convenient to generate a fixed-field text file. Consider a C application, or even SQL*Plus as an example: it is more complicated to generate delimited data because of the cases where the delimiter (comma) or enclosure characters (single or double quotes) are actually part of the data to be written out. Generating data in the fixed-field format from these applications however is trivial by comparison.

Loading Comma-Delimited Data

Although this task appears daunting, it is not all that difficult to accomplish. The only requirement I had was for the example to be easily modified to accomodate a variety of table definitions and data formats - in my terms, easy to modify means a few minutes to adapt the code to different table definitions. Therefore, as you look through the following Oracle Basic examples, do keep in mind that efficiency and speed have been sacrificed to satisfy this requirement.

There are three steps (repeated over and over again) which occur in this task:

    (1)  Reading data from the text file, line by line 
    (2)  Parsing a text line and pulling out data 
    (3)  Inserting the data into a Blaze or Oracle7 database - with data 
         conversion if needed

The most difficult step here is (2) parsing the text line. One approach is to write a function which will parse the text line into all its component fields, and then return them all. The disadvantage of this approach is that it is very tied to the structure of the table which is being loaded with data.

A more flexible approach, which I use, is to write a function which will parse the text line for a specific column, and then to call that function repeatedly to retrieve columns one by one. The advantage of this approach is that the parsing function is no longer tied to the table definition - to adapt to various table definitions, you simply call this parsing more often per row, or less often depending on the number of columns to retrieve. The disadvantage of this approach is that it can be rather inefficient - a text line may be scanned many times if there are many columns to be retrieved.

Here are the two Oracle Basic routines which implement the load of the delimited text file example in the previous section above (using the comma as the delimiter and double quote for enclosure):

sub udmLoadDelimited()

'  This is an example of how to load a delimited text file (such as the 
'  CSV file format from Microsoft Excel.  Here the comma (,) is used as 
'  the delimiter, and double quotes (") used for (optional) enclosure. 
' 
'  Only minimal error-checking is implemented in this example - this 
'  means that if an error occurs during the load, the routine will exit. 
' 
'  Outline: 
'    (1) Read a line of data from the file 
'    (2) Call the parse function repeatedly to retrieve all necessary 
'        columns 
'    (3) Insert the data into the database 
'

Inherited.Click()
DIM sourceFileName, lineData, data1, data2, data3 as String

' Open file with read-only access (buffer length of 1024 bytes) sourceFileName = "Power:Oracle Data:Example 1" ON ERROR GOTO openCloseError ' Check to make sure file exists OPEN sourceFileName FOR input ACCESS read AS #1 LEN=1024

ON ERROR GOTO readError ' Check for general errors or end of file WHILE (TRUE)
LINE INPUT # 1, lineData

    data1 = udmParseDelimited(lineData, 1) 
    data2 = udmParseDelimitedlineData, 2) 
    data3 = udmParseDelimited(lineData, 3) 
    EXEC SQL WHENEVER SQLERROR GOTO aSQLError 
    EXEC SQL INSERT INTO example VALUES                          & 
            (:data1, :data2, TO_CHAR(:data3, 'dd-mon-yy'))

WEND ' Note, only way out of loop is when an error occurs

openCloseError:

msgbox("Error opening/closing file - ERR = " & str(ERR)) GOTO finished:

readError:            ' File read error occurs; 1074 is no more data 
    if ERR = 1074 then goto closeFile:          ' no more data

msgbox("Error reading file - ERR = " & str(ERR)) GOTO CloseFile:

aSQLError:

    msgbox("SQL Error occurred:  Code = " & str(SqlErrCode()) &      & 
            ", Text = " & SqlErrText())

GOTO closeFile:

closeFile:
EXEC SQL COMMIT
ON ERROR GOTO openCloseError

CLOSE # 1                      ' Fall through to finished:

finished:

exit sub

function udmParseDelimited(sourceString as String, whichColumn as Integer)

as String

-
' This function parses through a line of delimited data (sourceString) to
' retrieve a specific column (which Column.) All data (regardless of ' destination column type) is returned to calling function as String. It

'  is up to the calling function to convert it to other datatypes. 
' 
'  Outline: 
'    (1) Find the START and END position of the specified column in the 
'        given line of data 
'    (2) Remove (optional) double quote (enclosure) characters 
'    (3) Replace ("") with (") 
'

DIM inQuotes, sourceLen, curColumn, startPos, endPos AS Integer DIM theChar, midString, finalString AS String

if whichColumn <= 0 then ' some basic error-checking

udmParseDelimited = NULL
exit function
end if

' Find START position of desired column: sourceLen = LEN(sourceString)
inQuotes = 0
curColumn = 1
startPos = 0

for t = 1 to sourceLen step 1

    if curColumn = whichColumn then           ' Are we done? 
        startPos = t 
        exit for

end if

    theChar = MID(sourceString, t, 1)       ' get a single character 
    select case theChar          ' is it the delimiter or enclosure? 
        CASE chr(34): 
            inQuotes = iif(inQuotes = 0, 1, 0)  ' inQuotes := NOT 
inQuotes 
        CASE ",": 
            if (inQuotes = 0) then curColumn = curColumn + 1

end select
next t

if startPos = 0 then ' hit the end without finding the column

udmParseDelimited = NULL
exit function
end if

' Find END position of desired column:
inQuotes =0
endPos = 0
for t = startPos to sourceLen step 1

theChar = MID(sourceString, t, 1)
select case theChar

        CASE chr(34): 
            inQuotes = iif(inQuotes = 0, 1, 0) 
        CASE ",": 
            if (inQuotes = 0) then 
                endPos = t - 1 
                exit for 
            end if

end select
next t

if endPos = 0 then       '  hit the end of string (i.e., last column) 
    endPos = t - 1       ' t = sourceLen + 1 when done with the for loop 
end if

if startPos > endPos then ' if startPos = endPos the length is 1!

udmParseDelimited = NULL
exit function
end if

' Remove ENCLOSURE characters
midString = MID(sourceString, startPos, endPos - startPos + 1) if left(midString, 1) = chr(34) and right(midString, 1) = chr(34) then

midString = MID(midString, 2, LEN(midString) - 2) end if

' Substitute ("") with (")
sourceLen = LEN(midString)
inQuotes = 0
finalString = "" ' initialize to 0-length string for t = 1 to sourceLen step 1

theChar = MID(midString, t, 1)
select case theChar

        CASE chr(34): 
            if inQuotes = 1 then 
                inQuotes = 0 
            else 
                inQuotes = 1 
                finalString = finalString & chr(34) 
            end if 
        CASE ELSE 
            finalString = finalString & theChar

end select
next t

' Done - return data;
udmParseDelimited = finalString
' Note, if we have a 0-length string at this point, should we return ' NULL? Returning 0-length string for now.

As you may notice in the above example, the only changes which need to be made to accomodate a different Table structure is in udmLoadDelimited() in the (repeated) calls to udmParseDelimited() and the EXEC SQL INSERT statement.

Another point worth mentioning is that in this example, even though ID is actually a NUMBER in the database, it is stored in a String (data1) in the INSERT - the reason is that Blaze and Oracle7 will be able to handle the conversion appropriately. The only exception to this is the date datatype, which should probably be converted manually (in the EXEC SQL INSERT command) - use the SQL function TO_DATE() to ensure greater accuracy in handling date/time data.

Loading Fixed-Field Data

We can use the same approach to loading fixed-field data as used above for delimited data. Again, the only requirement is for the example to be easily modified to accomodate different table definitions.

Here are the two Oracle Basic routines which implement the load of the fixed-field text file example in the previous section above:

sub udmLoadFixed()

'  This is an example of how to load a fixed-field text file.  To 
'  retrieve a particular column specify a START and END position when 
'  calling the udmParseFixed() function - note, column positions are 
'  1-based, therefore the first character in a line is a position 1. 
' 
'  Only minimal error-checking is implemented in this example - this 
'  means that if an error occurs during the load, the routine will exit. 
' 
'  Outline: 
'    (1) Read a line of data from the file 
'    (2) Call the parse function repeatedly to retrieve all necessary 
'        columns 
'    (3) Insert the data into the database 
'

Inherited.Click()
DIM sourceFileName, lineData, data1, data2, data3 as String

' Open file with read-only access (buffer length of 1024 bytes) sourceFileName = "Power:Oracle Data:Example 2" ON ERROR GOTO openCloseError ' Check to make sure file exists OPEN sourceFileName FOR input ACCESS read AS #2 LEN=1024

ON ERROR GOTO readError ' Check for general errors or end of file WHILE (TRUE)
LINE INPUT #2, lineData

    data1 = udmParseFixed(lineData, 1, 4) 
    data2 = udmParseFixed(lineData, 6, 18) 
    data3 = udmParseFixed(lineData, 21, 30) 
    EXEC SQL WHENEVER SQLERROR GOTO aSQLError 
    EXEC SQL INSERT INTO example VALUES                          & 
            (:data1, :data2, TO_CHAR(:data3, 'dd-mon-yy'))

WEND ' Note, only way out of loop is when an error occurs

openCloseError:

msgbox("Error opening/closing file - ERR = " & str(ERR)) GOTO finished:

readError:            ' File read error occurs; 1074 is no more data 
    if ERR = 1074 then goto closeFile:          ' no more data

msgbox("Error reading file - ERR = " & str(ERR)) GOTO CloseFile:

aSQLError:

    msgbox("SQL Error occurred:  Code = " & str(SqlErrCode()) &      & 
            ", Text = " & SqlErrText())

GOTO closeFile:

closeFile:
EXEC SQL COMMIT
ON ERROR GOTO openCloseError

CLOSE # 2                      ' Fall through to finished:

finished:

exit sub

function udmParseFixed(sourceString as String, startPos as Integer,

endPos as Integer) as String

'  This function retrieves data based on START and END position - column 
'  position are all 1-based, meaning the first character of a line is at 
'  position 1.  If the end position is past the end of the string, the 
'  available data will be returned.  If start position is past the end 
'  of the string, a NULL is returned.  This function also trims trailing 
'  spaces (but not leading spaces) - if in doing so the length of the 
'  string becomes 0, a NULL will be returned. 
' 
'  Outline: 
'    (1) Grab desired substring of START and END position 
'    (2) Trim trailing spaces 
'

DIM grabLength, sourceLen, stringRemain AS Integer DIM finalString AS String

if startPos > endPos then ' invalid start/end positions

udmParseFixed = NULL
exit function
end if

sourceLen = LEN(sourceString)
if startPos > sourceLen then ' start position past end - return NULL

udmParseFixed = NULL
exit function
end if

grabLength = endPos - startPos + 1
stringRemain = sourceLen - startPos + 1 if grabLength > stringRemain then grabLength = stringRemain finalString = RTRIM(MID(sourceString, startPos, grabLength))

if LEN(finalString) = 0 then ' return NULL if 0-length string

udmParseFixed = NULL
exit function
end if

udmParseFixed = finalString

                                          Oracle Worldwide Customer

Support Received on Sat Apr 06 1996 - 00:00:00 CEST

This message: [ Message body ]
Next message: John Flack: "Re: Need advice on RAID5"
Previous message: Jeffrey M. Stander: "Re: DB Design question"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message