RE: Flat file generation integrity ideas...

Chris - Your comment about the other systems maybe sending you the wrong file sparked an idea. XML! Yeah, it has been overhyped, but the basic idea is that the file specifies the format. So you could get a file and ensure the content conforms to your expectation. I am an XML novice, but I believe you can create a DTD that defines the valid contents of an XML document and then use that to validate the XML files before you load them into Oracle. The biggest downside is that the markup increases the file size, perhaps doubling it. And you have to convince the other systems to use XML.

Well, it's kinda like saying your backups will never fail. The backup script works now, and it has for many months, so I should never worry. Right?  

Well, I'm paranoid, and Andy Grove is my god. So I have to put in place some checks to make sure the data is OK before I load thousands of records into my database that could corrupt it.

Yes, I can implement all the suggestions, but I also have to consider the possibility that one of these other systems is sending me a file that is just plain wrong, or a file meant for a different system, or an old file.  

Thanks everyone!!!

I do not see how the file can get "scrambled". You write it out OK.
The FTP is guaranteed.
So what is the problem?

I will go along with the suggestion to zip it. It saves on the FTP time and also gives you some protection.
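
The protection comes from the CRC-32 that the zip format stores for each member, which the receiver can check before loading anything. A minimal sketch in Python, assuming the standard zipfile module; the member name and the helper names zip_bytes and archive_is_intact are made up for illustration:

```python
import io
import zipfile
import zlib


def zip_bytes(name: str, data: bytes,
              compression: int = zipfile.ZIP_DEFLATED) -> bytes:
    """Pack one flat file into an in-memory zip archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as zf:
        zf.writestr(name, data)
    return buf.getvalue()


def archive_is_intact(archive: bytes) -> bool:
    """Return True only if every member decompresses and matches its CRC-32."""
    try:
        with zipfile.ZipFile(io.BytesIO(archive)) as zf:
            # testzip() returns the name of the first bad member, or None.
            return zf.testzip() is None
    except (zipfile.BadZipFile, zlib.error):
        # Damage to the archive structure or the compressed stream.
        return False


if __name__ == "__main__":
    archive = zip_bytes("emp.dat", b"1001,SMITH,2500.00\n")
    print(archive_is_intact(archive))
```

This only detects damage that happens after the file was zipped. It says nothing about whether the file is the right one or a current one.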

Yechiel Adar

I have to create packages that will generate several flat files of data from tables that will be sent to other systems to be processed.

I am looking for ideas on how to ensure data integrity in the flat files.

For example, the expected record count is stored on the first line of the file to ensure that the correct number of records was received.
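
The record count check could look roughly like this on both ends. This is a sketch in Python rather than PL/SQL, and the function names and the one-record-per-line layout are assumptions for illustration:

```python
from typing import List


def add_count_header(records: List[str]) -> List[str]:
    """Sender side: put the expected record count on the first line."""
    return [str(len(records))] + list(records)


def check_count_header(lines: List[str]) -> List[str]:
    """Receiver side: return the records, or raise if the count is off."""
    if not lines:
        raise ValueError("empty file, no count header")
    expected = int(lines[0])  # raises ValueError if the header is not a number
    records = lines[1:]
    if len(records) != expected:
        raise ValueError(
            f"expected {expected} records, received {len(records)}"
        )
    return records


if __name__ == "__main__":
    sent = add_count_header(["1001,SMITH,2500.00", "1002,JONES,3100.00"])
    print(check_count_header(sent))
```

A count catches truncated or duplicated files, but not a record whose contents were altered.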

The systems group is chartered to ensure the flat files are correctly FTPed between systems, so that's covered.

I just worry that if "somehow" a flat file is scrambled, the scrambled data is then loaded into the database, corrupting it.

At this phase, XML is not an option.

I keep thinking that some sort of CRC should be stored with each line in the flat file. Then, before a line is loaded into the database, the stored CRC is compared against a CRC generated from the line just read. Has anyone done anything like this? Any examples out there?
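
To make the per-line CRC idea concrete, here is a rough sketch in Python using zlib.crc32 from the standard library. The "|" separator, the 8-digit hex layout, and the names add_crc and check_crc are assumptions for illustration, not an agreed format:

```python
import zlib

SEP = "|"  # hypothetical separator between the record and its CRC


def crc_hex(record: str) -> str:
    """CRC-32 of the record text as 8 lowercase hex digits."""
    return f"{zlib.crc32(record.encode('utf-8')) & 0xFFFFFFFF:08x}"


def add_crc(record: str) -> str:
    """Sender side: append the CRC to the record."""
    return f"{record}{SEP}{crc_hex(record)}"


def check_crc(line: str) -> str:
    """Receiver side: return the record, or raise if the CRC does not match."""
    record, sep, stored = line.rpartition(SEP)
    if not sep:
        raise ValueError("no CRC field found on line")
    if stored != crc_hex(record):
        raise ValueError(f"CRC mismatch on record: {record!r}")
    return record


if __name__ == "__main__":
    line = add_crc("1001,SMITH,2500.00")
    print(line)
    print(check_crc(line))
```

The loader would call check_crc on every line and reject the whole file on the first mismatch, before anything reaches the database. Both systems would have to agree on the character encoding, since the CRC is computed over bytes.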

Many TIA!!


