RE: dbf and cdx files

From: Stephen Lee <Stephen.Lee_at_DTAG.Com>
Date: Mon, 14 Jul 2003 15:40:43 -0500
Message-Id: <25920.337696@fatcity.com>

This message is in MIME format. Since your mail reader does not understand this format, some or all of this message may not be legible.

------_=_NextPart_000_01C34A48.3248690D
Content-Type: multipart/alternative;

boundary="----_=_NextPart_001_01C34A48.3248690D"

------_=_NextPart_001_01C34A48.3248690D
Content-Type: text/plain;

charset="iso-8859-1"

Attached (I hope) is a C prog I slapped together to take the contents of a DBF file and blow it out to a SQL Loader dat file. I got most of the info from asktom.oracle.com. He has a package posted there to read a DBF file and insert rows into an oracle table. It is a robust, general-purpose package, but because it inserts rows one at a time, it's as slow as Christmas. I use a ksh script to generate a control file (using the print header only option of the C program), start up sql loader and have it suck on a named pipe, generate sql loader dat info using all of the C program, and have it write to the named pipe.

You can compile the C program and see if you get what we would expect to get from the DBF files. The C program works entirely through command line redirection.

Example: c_prog [options] < dbf_file > output_file

-----Original Message-----
From: AK [mailto:oramagic_at_hotmail.com]
Sent: Monday, July 14, 2003 4:09 PM
To: Multiple recipients of list ORACLE-L Subject: Re: dbf and cdx files

Looks like foxpro files . Trying to open using odbc driver .

-ak

Original Message ----- From: Stephen <mailto:Stephen.Lee_at_DTAG.Com> Lee To: Multiple recipients of list ORACLE-L <mailto:ORACLE-L_at_fatcity.com> Sent: Monday, July 14, 2003 11:54 AM Subject: RE: dbf and cdx files

Are the dbf files dbase files? If so, I have something that you might be able to use or adapt to your purposes.

-----Original Message-----
From: AK [mailto:oramagic_at_hotmail.com]
Sent: Monday, July 14, 2003 1:29 PM
To: Multiple recipients of list ORACLE-L Subject: dbf and cdx files

I have got some *.cdx and *.dbf files to load . Any Idea what are these files and how can I open them ?

-ak

------_=_NextPart_001_01C34A48.3248690D
Content-Type: text/html;

charset="iso-8859-1"

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

<META content="MSHTML 6.00.2722.900" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><SPAN class=821002920-14072003><FONT face=Courier size=2>Attached (I hope)
is a C prog I slapped together to take the contents of a DBF file and blow it out to a SQL Loader dat file.  I got most of the info from asktom.oracle.com.  He has a package posted there to read a DBF file and insert rows into an oracle table.  It is a robust, general-purpose package, but because it inserts rows one at a time, it's as slow as Christmas.  I use a ksh script to generate a control file (using the print header only option of the C program), start up sql loader and have it suck on a named pipe, generate sql loader dat info using all of the C program, and have it write to the named pipe.</FONT></SPAN></DIV>
<DIV><SPAN class=821002920-14072003><FONT face=Courier
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=821002920-14072003><FONT face=Courier size=2>You can compile
the C program and see if you get what we would expect to get from the DBF files.  The C program works entirely through command line redirection.</FONT></SPAN></DIV>
<DIV><SPAN class=821002920-14072003><FONT face=Courier
size=2></FONT></SPAN> </DIV>
<DIV><SPAN class=821002920-14072003><FONT face=Courier size=2>Example: c_prog
[options] < dbf_file > output_file</FONT></SPAN></DIV>
<BLOCKQUOTE dir=ltr

style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px"> <DIV class=OutlookMessageHeader dir=ltr align=left><FONT face=Tahoma size=2>-----Original Message-----<BR><B>From:</B> AK [mailto:oramagic_at_hotmail.com]<BR><B>Sent:</B> Monday, July 14, 2003 4:09 PM<BR><B>To:</B> Multiple recipients of list ORACLE-L<BR><B>Subject:</B> Re: dbf and cdx files<BR><BR></FONT></DIV> <DIV><FONT face=Arial size=2>Looks like foxpro files . Trying to open using odbc driver .</FONT></DIV>

  <DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
  <DIV><FONT face=Arial size=2>-ak</FONT></DIV>
  <BLOCKQUOTE dir=ltr

  style="PADDING-RIGHT: 0px; PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">     <DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>     <DIV
    style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>     <A title=Stephen.Lee_at_DTAG.Com href="mailto:Stephen.Lee_at_DTAG.Com">Stephen     Lee</A> </DIV>
    <DIV style="FONT: 10pt arial"><B>To:</B> <A title=ORACLE-L_at_fatcity.com     href="mailto:ORACLE-L_at_fatcity.com">Multiple recipients of list ORACLE-L</A>     </DIV>
    <DIV style="FONT: 10pt arial"><B>Sent:</B> Monday, July 14, 2003 11:54     AM</DIV>

    <DIV style="FONT: 10pt arial"><B>Subject:</B> RE: dbf and cdx files</DIV>
    <DIV><BR></DIV>
    <DIV><SPAN class=165505617-14072003><FONT face=Courier size=2>Are the dbf

files dbase files?  If so, I have something that you might be able to use or adapt to your purposes.</FONT></SPAN></DIV> <BLOCKQUOTE dir=ltr
style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #000000 2px solid; MARGIN-RIGHT: 0px">

      <DIV class=OutlookMessageHeader dir=ltr align=left><FONT face=Tahoma 
      size=2>-----Original Message-----<BR><B>From:</B> AK 
      [mailto:oramagic_at_hotmail.com]<BR><B>Sent:</B> Monday, July 14, 2003 1:29 
      PM<BR><B>To:</B> Multiple recipients of list ORACLE-L<BR><B>Subject:</B> 
      dbf and cdx files<BR><BR></FONT></DIV>
      <DIV><FONT face=Arial size=2>I have got some *.cdx and *.dbf files to load 
      . Any Idea what are these files and how can I open them ?</FONT></DIV>
      <DIV><FONT face=Arial size=2></FONT>&nbsp;</DIV>
      <DIV><FONT face=Arial

size=2>-ak</FONT></DIV></BLOCKQUOTE></BLOCKQUOTE></BLOCKQUOTE></BODY></HTML>

------_=_NextPart_001_01C34A48.3248690D--

------_=_NextPart_000_01C34A48.3248690D
Content-Type: application/octet-stream;

name="dbf_reader.c"
Content-Transfer-Encoding: quoted-printable Content-Disposition: attachment;

filename="dbf_reader.c"

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/param.h>
#include <sys/signal.h>
#include <sys/fault.h>
#include <sys/syscall.h>
#include <sys/procfs.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <pwd.h>
#include <math.h>
#include <time.h>
#include <termios.h>
#include <signal.h>
#include <ctype.h>
#include <netdb.h>

/** Wrote by Stephen Lee using info from asktom.oracle.com **/

/** To use this run: this_prog [options] < file.dbf > textfile **/

/** options are: -p to show progress
*                -v verbose
*                -h read and print header info only.

**/

double power(double a, int b) {

	double out;
	int i;

	out =3D 1;
	for (i =3D 1; i <=3D b; ++i) out *=3D a;
	return out;

}

double to_double(char*, int, int);

/***************** BEGIN MAIN ******************/

int main (int argc, char **argv, char **envp) {

	struct field_descriptor {
		char name[12];
		char type;    /* data type of the field */
		int length;   /* 1 byte length */
		int decimals; /* 1 byte scale */
	};

	struct dbf_header {
		char version;      /* dbase version number */
		int year;			/* 1 byte int year, add to 1900 */
		int month;        /* 1 byte month */
		int day;			 /* 1 byte day */
		double no_records; /* 4 byte, number of records in file */
		double hdr_len;    /* 2 byte, length of header */
		double rec_len;    /* 2 byte, number of bytes per record */
		double no_fields;  /* number of fields per record */
	};


	int in, out, i, j, k, start, len, diddle, doink, percent, =
deleted_records;
	int progress =3D 0, verbose =3D 0, header =3D 0;
	char read_buf[16384], diddler;
	double bytes_read, record_count;

	int l_hdr_size =3D 32;
	int l_field_desc_size =3D 32;

	struct dbf_header p_hdr;
	struct field_descriptor p_flds[256];
	int big_endian =3D 1, offset;

	for(i =3D 1; i < argc; ++i) {
		/** if( (strcmp(argv[i], "-v") !=3D 0) && (strcmp(argv[i], "-p")) =
!=3D 0 ) { **/
		if( (strcmp(argv[i], "-v") & strcmp(argv[i], "-p") & strcmp(argv[i], =
"-h")) !=3D 0 ) {
			fprintf(stderr, "\nThere was an invalid option on the command =
line.\n");
			fprintf(stderr, "USAGE: %s [-p [ -v [ -h ]]] < input.dbf > =
output\n", argv[0]);
			fprintf(stderr, "      -p =3D show progress\n");
			fprintf(stderr, "      -v =3D verbose\n");
			fprintf(stderr, "      -h =3D print header info only, then =
exit.\n");
			return 1;
		}
		if( (strcmp(argv[i], "-v") =3D=3D 0) ) verbose =3D 1;
		if( (strcmp(argv[i], "-p") =3D=3D 0) ) progress =3D 1;
		if( (strcmp(argv[i], "-h") =3D=3D 0) ) { header =3D 1; verbose =3D 1; =
}
	}

	/** We read from stdin, write to stdout **
	*** We don't open no stinkin' files.    **/
	in =3D dup(0);
	out =3D dup(1);

	/** Read in the info section of the header, 12 bytes. **/
	if( (bytes_read =3D read(in, read_buf, 12)) !=3D 12) {
		fprintf(stderr,"\nBROKE: bytes read is not 12\n");
		fprintf(stderr,"   Got bytes_read =3D %lf\n", bytes_read);
		return 1;
	}

	p_hdr.version =3D read_buf[0];
	p_hdr.year =3D 1900 + (read_buf[1] < 0 ? read_buf[1] + 256 : =
read_buf[1]);
	p_hdr.month =3D read_buf[2] + 0;
	p_hdr.day =3D read_buf[3] + 0;
	p_hdr.no_records =3D to_double(read_buf + 4, big_endian, 4);
	p_hdr.hdr_len =3D to_double(read_buf + 8, big_endian, 2);
	p_hdr.rec_len =3D to_double(read_buf + 10, big_endian, 2);
	p_hdr.no_fields =3D (int)((p_hdr.hdr_len - =

l_hdr_size)/(l_field_desc_size));

	if(verbose) {
		fprintf(stderr, "version:    %c\n", p_hdr.version);
		fprintf(stderr, "year:       %u\n", p_hdr.year);
		fprintf(stderr, "month:      %u\n", p_hdr.month);
		fprintf(stderr, "day:        %u\n", p_hdr.day);
		fprintf(stderr, "no_records: %.0lf\n", p_hdr.no_records);
		fprintf(stderr, "hdr_len:    %.0lf\n", p_hdr.hdr_len);
		fprintf(stderr, "rec_len:    %.0lf\n", p_hdr.rec_len);
		fprintf(stderr, "no_fields:  %.0lf\n", p_hdr.no_fields);
	}

	/** Read past the rest of the header **/
	bytes_read +=3D read(in, read_buf, l_hdr_size - 12);
	if( ! bytes_read =3D=3D l_hdr_size ) {
		fprintf(stderr,"\nBROKE: bytes read is not equal to header size\n");
		fprintf(stderr,"   Got bytes_read =3D %lf\n", bytes_read);
		fprintf(stderr,"   And l_hdr_size =3D %d\n", l_hdr_size);
		return 1;
	}

	for(i =3D 0; i < p_hdr.no_fields; ++i) {
		bytes_read +=3D read(in, p_flds[i].name, 11); p_flds[i].name[11] =3D =
0;
		bytes_read +=3D read(in, read_buf, l_field_desc_size - 11);
		p_flds[i].type =3D read_buf[0];
		p_flds[i].length =3D read_buf[5] <=3D 0 ? read_buf[5] + 256: =
read_buf[5];
		p_flds[i].decimals =3D read_buf[6] < 0 ? read_buf[6] + 256 : =
read_buf[6];
	}

	if(verbose) {
		for(i =3D 0; i < p_hdr.no_fields; ++i) {
			fprintf(stderr, "NAME:%s\n", p_flds[i].name);
			fprintf(stderr, "TYPE:%c\n", p_flds[i].type);
			fprintf(stderr, "LENGTH:%u\n", p_flds[i].length);
			fprintf(stderr, "DECIMALS:%u\n", p_flds[i].decimals);
			fprintf(stderr, "--------------------\n");
		}
		fprintf(stderr, "Header bytes read =3D %.0lf\n", bytes_read);
	}

	if( p_hdr.hdr_len - bytes_read < 0 ) {
		fprintf(stderr, "\nBROKE: Bytes read is greater than header =
length.\n");
		return 1;
	}

	/** If reading only the header, then exit here **/
	if(header) return 0;

	if( p_hdr.hdr_len - bytes_read > 0 )
		read( in, read_buf, p_hdr.hdr_len - bytes_read ); /* Skip some bytes =

	bytes_read =3D 0;
	doink =3D percent =3D deleted_records =3D 0;
	diddler =3D 124;
	while( (j =3D read(in, read_buf, p_hdr.rec_len)) =3D=3D p_hdr.rec_len =

) {

		// Deleted records are marked by a "*"
		if( *read_buf =3D=3D '*' ) {
			++deleted_records;
			continue;
		}

		if( ++doink =3D=3D 1000 ) {
			record_count +=3D doink;
			if(progress) {
				percent =3D (int)(record_count / p_hdr.no_records * 100 + 0.5);
				fprintf(stderr, "\r%d ", percent);
				if( diddler =3D=3D 124 ) { fprintf(stderr,"%c%c",8,47); diddler =3D =
47; }
				else if( diddler =3D=3D 47 ) { fprintf(stderr, "%c%c",8,45); =
diddler =3D 45; }
				else if( diddler =3D=3D 45 ) { fprintf(stderr, "%c%c",8,92); =
diddler =3D 92; }
				else if( diddler =3D=3D 92 ) { fprintf(stderr, "%c%c",8,124); =
diddler =3D 124; }
			}
			doink =3D 0;
		}
		bytes_read +=3D j;
		offset =3D 1;
		for(i =3D 0; i < p_hdr.no_fields; ++i) {

			/** Remove leading spaces **/
			for( start =3D 0; start < p_flds[i].length; ++start )
				if( *(read_buf + offset + start) > ' ' && *(read_buf + offset + =

start) <=3D '~' ) break;

			/** Remove trailing spaces **/
			for( len =3D p_flds[i].length; len > 0; --len )
				if( *(read_buf + offset + len - 1) > ' ' && *(read_buf + offset + =

len) <=3D '~' ) break;

write(out, "\"", 1);

			/** This one writes with leading and trailing spaces removed **/
//			write(out, read_buf + offset + start, len - start );

			/** When the character is a '"', need to write two of these together =
to make sql loader happy **/
			for(k =3D 0; k < len - start; ++k) {
				if(*(read_buf + offset + start + k) =3D=3D '"') write(out, "\"", =
1);
				write(out, read_buf + offset + start + k, 1);
			}

			/** This one writes with leading and trailing spaces NOT removed **/
//			write(out, read_buf + offset, p_flds[i].length);

			write(out, "\"", 1);

		/** Here is the record separator character for sql loader **/
			if(i + 1 < p_hdr.no_fields) write(out, "::", 1);
			offset +=3D p_flds[i].length;
		}
		write(out, "\n", 1);
	}

	if(verbose) {
		fprintf(stderr, "\n\n");
		fprintf(stderr, "TOTAL RECORDS =3D %.0lf\n", =
bytes_read/p_hdr.rec_len);
		fprintf(stderr, "DELETED RECORDS =3D %d\n", deleted_records);
		fprintf(stderr, "LEFT OVER BYTES =3D %d\n", j);
	}

	return 0;

=09
}

/***************** END MAIN ******************/

/***************** BEGIN TO_INT ******************/

double to_double( char* p_data, int big_endian, int length ) {

	double l_number;
	int i;

	l_number =3D 0;

	/* The way in which the box handles one-byte integer math, *

required the use of the trinary operator biz. */

if(big_endian) { Received on Mon Jul 14 2003 - 15:40:43 CDT