Re: BULK COLLECT and QUERY REWRITE

From: DA Morgan <damorgan_at_psoug.org>
Date: Tue, 31 Oct 2006 09:04:39 -0800
Message-ID: <1162314277.914284@bubbleator.drizzle.com>

chris wrote:
> DA Morgan wrote:

>> chris wrote:
>>> DA Morgan wrote:
>>>> chris wrote:
>>>>> I understand that Oracle 10g will rewrite your CURSOR FOR LOOP
>>>>> statements into a BULK COLLECT statement.
>>>>>
>>>>> I am contemplating no longer explicitly writing the BULK COLLECT from
>>>>> now on as it reduces the number of lines of code and greatly simplifies
>>>>> code.
>>>>>
>>>>> Can anyone see any serious flaws in this strategy?
>>>>>
>>>>> Kind Regards
>>>>> Chris
>>>> Bad idea. With bulk collect you control the LIMIT clause. 10g
>>>> defaults to 100 which is a non-optimal compromise.
>>>>
>>>> And how does not using BULK COLLECT simplify the code?
>>>>
>>>> FETCH r              INTO l_data
>>>> is simpler than
>>>> FETCH r BULK COLLECT INTO l_data LIMIT 1000;
>>>>
>>>> in what way? 21 characters?
>>>>
>>>> --
>>>> Daniel A. Morgan
>>>> University of Washington
>>>> damorgan_at_x.washington.edu
>>>> (replace x with u to respond)
>>>> Puget Sound Oracle Users Group
>>>> www.psoug.org
>>> In my post I described a strategy using the CURSOR FOR LOOP as an
>>> alternative to the BULK COLLECT, this is very different from your
>>> example.  All you have shown here is an explicit CURSOR FETCH specified
>>> to use BULK COLLECT.
>>>
>>> The BULK COLLECT statement requires additional coding to handle
>>> returning more rows than the specified LIMIT clause.  Using the FOR
>>> LOOP CURSOR removes this requirement and any explicit OPEN and CLOSE
>>> cursor therefore simplifying the code.
>>>
>>> Regards,
>>> Chris
>> Both require a loop. Look at this:
>>
>> CREATE OR REPLACE PROCEDURE slow_way IS
>>
>> BEGIN
>>    FOR r IN (SELECT * FROM parent)
>>    LOOP
>>      -- modify record values
>>      r.part_num := r.part_num * 10;
>>      -- store results
>>      INSERT INTO child
>>      VALUES
>>      (r.part_num, r.part_name);
>>    END LOOP;
>>    COMMIT;
>> END slow_way;
>> /
>>
>> exec slow_way
>> -- This is your idea ... it takes 02.30 seconds to execute.
>> -- This is the same with BULK COLLECT
>>
>> CREATE OR REPLACE PROCEDURE fast_way IS
>>
>> TYPE myarray IS TABLE OF parent%ROWTYPE;
>> l_data myarray;
>>
>> CURSOR r IS
>> SELECT part_num * 10, part_name
>> FROM parent;
>>
>> BEGIN
>>    OPEN r;
>>    LOOP
>>      FETCH r BULK COLLECT INTO l_data LIMIT 1000;
>>
>>      FORALL i IN 1..l_data.COUNT
>>      INSERT INTO child VALUES l_data(i);
>>
>>      EXIT WHEN r%NOTFOUND;
>>    END LOOP;
>>    COMMIT;
>>    CLOSE r;
>> END fast_way;
>> /
>>
>> exec fast_way -- It executes in 00.18
>>
>> Exactly how much extra work has to be done to improve performance
>> by more than 90%?
>>
>> Excuse me please for being a bit harsh but are you being paid
>> to limit the number of characters typed or to write the best
>> performing most scalable code you can?
>>
>> Do you want your physician to take the attitude of least work
>> rather than best work? Your accountant? The engineers at Boeing?
>> Your attorney? Your children? It doesn't make any sense to me.
>> --
>> Daniel A. Morgan
>> University of Washington
>> damorgan_at_x.washington.edu
>> (replace x with u to respond)
>> Puget Sound Oracle Users Group
>> www.psoug.org

>
> Your example is a valid one as it highlights an issue depending on how
> you want to process the results of a BULK COLLECT call.
>
> I performed the same test as yourself and certainly got a better
> performance with the fast_way procedure.
>
> FAST_WAY: 00.35
> SLOW_WAY: 01.23
>
> However this type of processing (SELECT then UPDATE) is not common in
> our application.
>
> Performing another set of tests, but this time with no INSERT statement
> but just the population of an Associative Array (Index Table), produced
> the following results on a table of 1001 rows with a shared_pool flush
> in between:-
>
> FAST_WAY2: 00.07
> SLOW_WAY2: 00.09
>
> In our application when we do require SELECT then UPDATE then our
> initial strategy is to push this down to the DB Engine in SQL i.e.
>
> INSERT INTO child
> (part_num,
> part_name)
> SELECT part_num * 10,
> part_name
> FROM PARENT;
>
> This implementation executed in 00.04
>
>

>>> ...being paid to limit the number of characters typed

> I am being paid to write good code. Therefore the code must be
> scaleable, maintainable and conform with our coding standards.
> Writing less code definitely helps assist in keeping code maintainable,
> however it's all about compromises, if less code renders the code
> un-scaleable then you would rethink your strategy.
>
> I've yet to see any strong argument why using the FOR CURSOR in our
> application would compromise the performance of the system.

Use FORALL with your INSERT.

And don't just look at the time of execution. Look at the use of resource, CPU, I/O, waits, latches, etc.

Your test on a single user system does not represent production reality. And, I really don't care what you write, I've no doubt it can be made more efficient with array processing.

-- 
Daniel A. Morgan
University of Washington
damorgan_at_x.washington.edu
(replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org

Received on Tue Oct 31 2006 - 11:04:39 CST