Re: Why does Oracle buffer the result from the select for a parallel INSERT INTO SELECT FROM?

Home -> Community -> Usenet -> c.d.o.server -> Re: Why does Oracle buffer the result from the select for a parallel INSERT INTO SELECT FROM?

From: Jaap W. van Dijk <j.w.vandijk.removethis_at_hetnet.nl>
Date: Wed, 22 Nov 2006 14:14:00 GMT
Message-ID: <456454e9.231906@news.hetnet.nl>

On 22 Nov 2006 03:54:40 -0800, "sybrandb" <sybrandb_at_gmail.com> wrote:

>
>
>On Nov 22, 11:23 am, j.w.vandijk.removet..._at_hetnet.nl (Jaap W. van
>Dijk) wrote:
>> Hi,
>>
>> I'm on Oracle 9.2.0.5, Open VMS.
>>
>> I perform an
>>
>> INSERT INTO target
>> SELECT * FROM source
>>
>> If I do this noparallel, the result from the select is immediately
>> inserted into the target.
>>
>> If I do this in parallel, the result of the selects from a slave set
>> are first buffered in TEMP. After this is done, a slave set starts to
>> insert the contents from the buffers into the target.
>>
>> Why this timeconsuming buffering? Can anything be done?
>>
>> While I am writing this, I wonder: I let Oracle decide which degree of
>> parallellization to use. Maybe buffering only takes places if the
>> degree differs between SELECT and INSERT? (I should test myself, but
>> don't have the opportunity right now).
>>
>> Regards, Jaap.
>
>IMHO parallelization is pretty useless if the affected tables aren't
>striped over several disks.
>I remember Tom Kyte stating somewhere you should use parallelization
>only if you can't get rid of full table scans at all.
>Also parallelization always will use parallel server slaves and a query
>coordinator (the original session), so I bet this is what you are
>looking at right now: data is retrieved by slaves and inserted by the
>coordinator.
>And yes, this is a 'feature', while you seem to qualify it as a bug.
>
>--
>Sybrand Bakker
>Senior Oracle DBA
>

Maybe the coordinator wants to coordinate between select and insert , but it doesn't insert the records itself: that is also done by a slave set.

I took a look at the Concept Manual. There inter-operation parallellism is discussed, where data flows directly from one slave set to the next (fig 18-3 on page 18-7). What is pictured there - two slave sets doing a select and a sort - could IMO as easily have been two slave sets doing a select and an insert, without any buffering. As I said before: maybe the optimizer needs to know beforehand that the degree of select and insert is the same, so when I have the opportunity, I perform a test.

Regarding the parallel gain: selecting also takes an amount of CPU, and parallellizing that improves the select noticably, even if everything is read from the same disk, especially if the select is part of the transformation step of the load of a datawarehouse, with lots of complex case statements in the select list, consuming lots of CPU. Regards, Jaap. Received on Wed Nov 22 2006 - 08:14:00 CST