RE: Paralellizing Pl/sql inserts
Date: Wed, 23 Jan 2008 22:07:34 +0900
> Parallel dml doesn't work on insert values clauses
Oops. I missed your mention that you can't use insert ... select. :(
But I don't understand why you can't use insert ... select. I'm not sure about your business requirement, but there should be some way.
Here is a rule of thumb (you might already be familiar) non-bulk insert < bulk insert < insert select <insert /*+ append and/or parallel */ select
And reducing redo data is quite important. Parallel direct mode insertion with nologging and no online index management the best way ever known.
I think there could be a bunch of clever ideas to handle your problem. For instance, just dropping indexes before insertion and creating them with nologging and parallel option after insertion would be another simple solution. Online index management on massive DML is a real headache on performance.
And I don't think conversion from ASSM to FLM is good idea. FML is performing a little better than ASSM for massive DML, but there are too many side effect as you know already.
From: Ken Naim [mailto:kennaim_at_gmail.com] Sent: Wednesday, January 23, 2008 9:30 PM To: 'Ukja.dion'; oracle-l_at_freelists.org
Subject: RE: Paralellizing Pl/sql inserts
Parallel dml doesn't work on insert values clauses, I have used the merge statement before, but I thought it wouldn't be as fast as a bulk insert with trapping the errors as I don't have any updates but I hadn't considered parallelizing it. I'll have to benchmark it and it see.
From: Ukja.dion [mailto:ukja.dion_at_gmail.com] Sent: Tuesday, January 22, 2008 8:14 PM
To: kennaim_at_gmail.com; oracle-l_at_freelists.org Subject: RE: Paralellizing Pl/sql inserts
I would try parallel dml. (parallel merge statement)
Or prefer creating whole new table in parallel and exchanging it with original table.
In addition, db file sequential wait event means that you're busy
Dropping(or disable) indexes + parallel merge + parallel creating(or rebuilding) indexes would be ideal process.
Read following great discussion.
From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Ken Naim
Sent: Wednesday, January 23, 2008 9:04 AM To: oracle-l_at_freelists.org
Subject: Paralellizing Pl/sql inserts
I have a process that is currently running at a rate of 4 million inserts per hour. In selects from a cursor that returns 65 million records bulk collects 500 at a time and forall inserts them into another table with 1.5 billion rows. I cannot do an insert as select as the record may already exists. I use the save exceptions clause to allow the code to continue processing but I don't care about which record turn the unique constraint violation. The 1.2 billion row table is partitioned but the data can go into any partition. The 65 million row table is not partitioned
On smaller tables I would do a minus between the primary keys of both tables and use that as a filter on an insert as select.
The 65 million record cursor takes only 15 minutes to return all 65 million records, so 98.5% of the time is consumed by the inserts. I am going to move the table to non assm tablespace and increase its free lists but I believe these measures will only provide a marginal improvement. 33% of the elapsed time is waiting on cpu and the top wait events are db file sequential read (94%) and db file scattered read (6%)
The business requires this process to run in under 2 hours as it will run weekly (if not daily) so I need an order of magnitude increase in performance. The infrastructure (server/storage array etc.) is high end and shows little load during this process and this process will be the only one running on the box at that time.
I am open to any suggestions but my question is how can I parallelize the process into multiple (24-48) threads.