Re: PDML and PQ tuning

From: Radoulov, Dimitre <cichomitiko_at_gmail.com>
Date: Sun, 20 Nov 2011 12:08:44 +0100
Message-ID: <4EC8DFBC.5060709_at_gmail.com>


Hi Dave,
thanks for replying!

Details in-line.

On 19/11/2011 17:36, Herring Dave - dherri wrote:
>
> As for your specific situation, you haven't shared a lot of detail.

I know; I kept the details sparse on purpose because I suspect a bug is involved, so I shared only what seemed sufficient to briefly describe the issue.
For a full-blown analysis I'll need to share everything.

> For PQs, are the underlying tables large enough to warrant parallelization? Smaller tables can actually run slower with parallelization, due to all the overhead of starting x number of PX slaves, dividing up the workload, communication between PX slaves and the QC, then producing the results and cleanup. "Smaller" is obviously relative but from what I've seen in working on DWs, any table smaller than 150 MB to 200 MB usually should be left at degree 1.

Yes, this is a rather *small* DWH environment (the whole database is about 200 GB) running on old machines.
The statements I'm currently analyzing join tables of about 3-4 GB each, and the joins are rather simple (mostly 2 tables at a time).
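For reference, this is roughly how I check whether a table is big enough to be worth parallelizing at all (just a sketch; the owner and table names below are placeholders, not the real objects):

-- segment size and current parallel degree for the tables in the join
SELECT s.owner, s.segment_name,
       ROUND(SUM(s.bytes)/1024/1024) AS size_mb,
       t.degree
FROM   dba_segments s
JOIN   dba_tables   t ON t.owner = s.owner AND t.table_name = s.segment_name
WHERE  s.owner = 'DWH'
AND    s.segment_name IN ('FACT_A', 'DIM_B')
GROUP  BY s.owner, s.segment_name, t.degree;

-- anything well under a few hundred MB I'd leave at degree 1
ALTER TABLE dwh.dim_b NOPARALLEL;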

> Have you compared xplans between serial and parallel executions? Were the serial executions using full scans? Was more than 1 table joined and if so were both large and had a parallel degree set?

Yes, I've compared the plans, and in the case I'm talking about both use the same access path (FTS).
Both the serial and the parallel plans show a hash join with full table scans; I've tried serial, parallel 2 and parallel 4.
In all cases the serial execution was the fastest.
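The comparisons were run roughly along these lines (a sketch; T1/T2, the join column and the degree are made-up placeholders for the real statement):

-- serial run
SELECT /*+ gather_plan_statistics NO_PARALLEL(t1) NO_PARALLEL(t2) */ COUNT(*)
FROM   t1 JOIN t2 ON t1.id = t2.id;

-- parallel run at degree 2 (also tried 4)
SELECT /*+ gather_plan_statistics PARALLEL(t1 2) PARALLEL(t2 2) */ COUNT(*)
FROM   t1 JOIN t2 ON t1.id = t2.id;

-- plan and runtime statistics of the last execution in this session
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));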

>
> As for PDML, were you working with CTAS operations? Or INSERT...SELECT? For the latter were you attempting to have the INSERT run in parallel or just the SELECT?

INSERT as SELECT, with both the INSERT and the SELECT parallelized. In most cases even the SELECT on its own was slower at parallel degree 2 than the serial run.
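The statements follow this general pattern (again a sketch with placeholder table names; the real statements are longer):

-- parallel DML has to be enabled explicitly at the session level
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(t 2) */ INTO target_tab t
SELECT /*+ PARALLEL(s 2) */ s.*
FROM   source_tab s
WHERE  s.load_date = TRUNC(SYSDATE);

COMMIT;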

>
> For both PQ and PDML, were the PX qref latch waits the highest of all waits? If not, what were the top 5 waits before and after the change for both types of operations?

Good questions. I don't have access to the trace files, so extended SQL trace is not an option for now. I've used Tanel Poder's snapper script to take a *quick look* at the wait events and got the following result (just a small part; there is a single SELECT statement running):



Active% | SQL_HASH_VAL | EVENT
   200% |    114550436 | PX qref latch

--  End of ASH snap 2, end=2011-11-18 11:24:45, seconds=20, samples_taken=19

Active% | SQL_HASH_VAL | EVENT
   200% |    114550436 | PX qref latch
     6% |   3463809334 | ON CPU

--  End of ASH snap 3, end=2011-11-18 11:25:05, seconds=20, samples_taken=18

Active% | SQL_HASH_VAL | EVENT
   200% |    114550436 | PX qref latch

--  End of ASH snap 4, end=2011-11-18 11:25:26, seconds=21, samples_taken=19

Active% | SQL_HASH_VAL | EVENT
   200% |    114550436 | PX qref latch

--  End of ASH snap 5, end=2011-11-18 11:25:46, seconds=20, samples_taken=19
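While the statement is running I also plan to look at the PX sessions and the flow-control side of things, roughly like this (a generic sketch, nothing here is specific to my system):

-- query coordinator and PX slave sessions for the running statement
SELECT qcsid, sid, server_group, server_set, degree
FROM   v$px_session
ORDER  BY qcsid, server_group, server_set;

-- what each of those sessions is currently waiting on
SELECT sid, event, state, seconds_in_wait
FROM   v$session_wait
WHERE  sid IN (SELECT sid FROM v$px_session)
ORDER  BY sid;

-- PX qref latch waits are flow-control related, so the PX message buffer size is worth checking
SELECT name, value
FROM   v$parameter
WHERE  name IN ('parallel_execution_message_size', 'parallel_max_servers');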

I'll admit I thought this was a bug, so I hoped to find it documented on MOS quickly and not spend more time on it. That doesn't seem to be the case (i.e. I haven't found anything useful so far), so next week I'll try to set up a complete test case and get back here with the details.

Thanks again
Dimitre

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Nov 20 2011 - 05:08:44 CST
