I have a "Legacy" data conversion project. I've done this sort of thing in
the past using SQL scripts, but this particular project appears to be a bit
more complex and I'm considering some alternate design ideas. Below is the
basic info.
Thanks for any and all input.
Conversion Design Plan
Constraints/Requirements:
- Source data in flat file layout (46 columns).
- Destination data in relational layout with 5 (maybe more) tables.
- Raw import of source data into Oracle table complete (raw varchar2
columns).
- Initial source conversion of Date and numeric columns complete.
- Initial conversion/load of records into 1 top level table complete
(using SQL script).
- Primary key (sequence) numbers must be maintained between destination
tables. The destination tables have no referential integrity constraints
defined, so the conversion process will be responsible for keeping the keys
consistent.
- Source record count approx 400k rows. Destination rows will be approx
1.2 million all tables combined.
- Invalid source data must be written out to error table for review and
possible remediation.
- Conversion process is planned to run ONCE, and run time is not critical.
If any problems are discovered, then the conversion code may be revised and
rerun.
- Destination tables have between 10 and 40 columns.
- Source data is NOT being used by ANY other application, so concurrency
issues "should not" be a problem.
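Since there are no RI constraints to catch a mismatch, my rough plan for
keeping the sequence keys consistent is to pull one sequence value and reuse
it in every related insert, something like this (all object names below are
just placeholders for illustration):

```sql
DECLARE
  v_id NUMBER;
BEGIN
  -- Grab one sequence value up front and reuse it in each table,
  -- since no foreign key constraint will catch a mismatch for us.
  SELECT conv_seq.NEXTVAL INTO v_id FROM dual;

  INSERT INTO dest_parent (parent_id, name)   VALUES (v_id, 'example');
  INSERT INTO dest_child  (parent_id, detail) VALUES (v_id, 'example');
END;
/
```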
Next Step:
- Evaluate complex if/then/decode logic to convert source data into 4 (or
more) remaining destination tables.
Options/Questions:
- Continue using a SQL script. This means massive, complex SQL with many
nested DECODEs.
Advantages:
- SQL "should" be faster than procedural code.
Disadvantages:
- Getting this SQL working and verified to load each destination is not
easy.
- Misplacing a single ( or , in this SQL could take an hour to find.
- Maintaining this SQL will be difficult due to sheer size and
complexity.
- What is Oracle's max SQL string size?
- SQL could take hours to run, causing waits before results can be
verified.
Questions:
- Would you use SQL? Why?
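For scale, here is a heavily trimmed example of the kind of nesting I mean
(column names are invented; the real statement would repeat this pattern
across dozens of columns):

```sql
-- Trimmed illustration only: the actual INSERT ... SELECT would carry
-- this style of nested DECODE for most of the 46 source columns.
INSERT INTO dest_detail (id, status_cd)
SELECT raw_id,
       DECODE(src_type,
              'A', DECODE(src_flag, 'Y', 1, 'N', 2, 0),
              'B', DECODE(src_flag, 'Y', 3, 4),
              9)
FROM   raw_import;
```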
- Use PL/SQL code to fetch each source row (one at a time) and process it
into all destination tables.
Advantages:
- Each column conversion step will be isolated, so maintainability is
better.
- Can process small groups of records, avoiding waits before verifying
results.
- Could break conversion steps into separate procedures to reduce
complexity and ease maintenance.
- Can write invalid rows to an error table as each row is processed.
Disadvantages:
- Slower than SQL?
Questions:
- I have a plan to commit after each source row is processed into the
destination tables by inserting its key into a "processed" table. This could
allow the conversion to start at the first failed row and avoid starting
over from the beginning. Does this sound like a good idea?
- How to avoid rollback segment overrun?
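Roughly what I have in mind for the PL/SQL loop is below (all object names
are placeholders; committing in batches rather than per row is one idea I'm
considering for limiting rollback segment usage):

```sql
DECLARE
  v_count PLS_INTEGER := 0;
BEGIN
  FOR r IN (SELECT * FROM raw_import) LOOP
    BEGIN
      -- conversion logic for each destination table goes here

      -- record the key so a rerun can skip already-processed rows
      INSERT INTO processed_keys (src_key) VALUES (r.raw_id);
    EXCEPTION
      WHEN OTHERS THEN
        -- invalid rows go to the error table for review/remediation
        INSERT INTO conv_errors (src_key, err_msg)
        VALUES (r.raw_id, SQLERRM);
    END;

    v_count := v_count + 1;
    IF MOD(v_count, 1000) = 0 THEN
      COMMIT;  -- batch commit instead of per-row
    END IF;
  END LOOP;
  COMMIT;
END;
/
```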
As you can see, I'm leaning toward PL/SQL, but I would like to hear other
ideas/opinions.
Thanks,
Oscar Bowyer
Received on Tue Aug 05 2003 - 17:28:42 CDT