Paralellizing Pl/sql inserts
Date: Tue, 22 Jan 2008 19:03:43 -0500
I have a process that is currently running at a rate of 4 million inserts per hour. In selects from a cursor that returns 65 million records bulk collects 500 at a time and forall inserts them into another table with 1.5 billion rows. I cannot do an insert as select as the record may already exists. I use the save exceptions clause to allow the code to continue processing but I don't care about which record turn the unique constraint violation. The 1.2 billion row table is partitioned but the data can go into any partition. The 65 million row table is not partitioned
On smaller tables I would do a minus between the primary keys of both tables and use that as a filter on an insert as select.
The 65 million record cursor takes only 15 minutes to return all 65 million records, so 98.5% of the time is consumed by the inserts. I am going to move the table to non assm tablespace and increase its free lists but I believe these measures will only provide a marginal improvement. 33% of the elapsed time is waiting on cpu and the top wait events are db file sequential read (94%) and db file scattered read (6%)
The business requires this process to run in under 2 hours as it will run weekly (if not daily) so I need an order of magnitude increase in performance. The infrastructure (server/storage array etc.) is high end and shows little load during this process and this process will be the only one running on the box at that time.
I am open to any suggestions but my question is how can I parallelize the process into multiple (24-48) threads.