Re: Parallel queries

From: Greg Rahn <>
Date: Fri, 25 Apr 2008 12:24:49 -0700
Message-ID: <>

Are you using PDML? (alter session enable parallel dml;)
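For reference, a minimal sketch of what a PDML direct-path insert looks like (table names and the DOP of 8 here are illustrative, not taken from your setup):

```sql
-- Parallel DML must be enabled at the session level; without it the
-- SELECT side may run in parallel but the INSERT itself stays serial.
ALTER SESSION ENABLE PARALLEL DML;

INSERT /*+ APPEND PARALLEL(t, 8) */ INTO target_tab t
SELECT /*+ PARALLEL(s, 8) */ s.*
FROM   source_tab s;

-- A direct-path insert must be committed (or rolled back) before the
-- session can read the table again.
COMMIT;
```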

What is the logic of using 5 or 10 for DOP?

I would recommend an even number (powers of two, even better) for the DOP. If you have 8 cores and the OS sees 8 CPUs, you could just alter the table "parallel" w/o specifying a number for the DOP and it will default to CPU_COUNT * PARALLEL_THREADS_PER_CPU, which would be 16 in your case (8*2). I personally haven't used the Niagara processors so you may have to adjust it if it is too much.
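Something like the following would let the default DOP kick in (sketch only; `big_tab` is a placeholder):

```sql
-- No explicit DOP: Oracle derives it as
-- CPU_COUNT * PARALLEL_THREADS_PER_CPU (8 * 2 = 16 on your box).
ALTER TABLE big_tab PARALLEL;

-- In SQL*Plus you can check what the inputs are:
SHOW PARAMETER cpu_count
SHOW PARAMETER parallel_threads_per_cpu
```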

Parallel execution uses a producer/consumer model. The "PX Deq Credit: send blkd" event means that some parallel execution slaves (producers) are blocked from sending more information because they are waiting for other slaves (consumers) to dequeue messages.
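While the job is running, you can watch which slaves are blocked on that event with something like this (a sketch; run from a second session):

```sql
-- Sessions currently waiting on PX events, with the event's
-- parameter columns (for "PX Deq Credit: send blkd" the p1text is
-- "sleeptime/senderid", p2text "passes", p3text "qref").
SELECT sid, event, p1text, p1, p2text, p2, p3text, p3
FROM   v$session_wait
WHERE  event LIKE 'PX%';
```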

With that much data and only a few PQ slaves, it is likely that the hash join will spill to temp as well. You can try setting your pga_aggregate_target as high as possible to give more memory to the slaves. There is a practical limit to how much memory a single slave will consume before it spills to disk. If you desire to use more memory, you will need a higher DOP.
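For example (the 4G value is purely illustrative; size it to your box):

```sql
-- Raise the PGA target so the hash-join work areas get more memory.
ALTER SYSTEM SET pga_aggregate_target = 4G;

-- While the join runs, check whether the work areas are spilling:
-- a non-null/non-zero TEMPSEG_SIZE means the operation went to temp.
SELECT operation_type, actual_mem_used, tempseg_size
FROM   v$sql_workarea_active;
```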

You may be bottlenecked on I/O bandwidth. If you are scanning the table and building the hash join, and writing the hash join to temp, etc. you are doing quite a bit of reading/writing. Might want to check iostat.
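Besides iostat at the OS level, you can get a rough read/write picture from inside the database (a sketch; sample these counters twice and diff them):

```sql
-- Cumulative physical I/O counters, including direct temp I/O,
-- which is where a spilling hash join shows up.
SELECT name, value
FROM   v$sysstat
WHERE  name IN ('physical reads',
                'physical writes',
                'physical reads direct temporary tablespace',
                'physical writes direct temporary tablespace');
```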

If I were you, I would ask for more information on why the software vendor thinks that using a smaller granule for the partition size would impact their performance and ask if they have test cases to support their claim. I'm not suggesting it is not a true statement, but I'd be interested more in the "show" and less in the "tell". Range partitioning by quarter seems to me to be a bit on the large side, given that I would suspect that many queries look at the data with much finer granularity.

If there are frequent large table to large table joins, then subpartitioning on the join key could result in a partition-wise join and probably yield better performance. There is a performance hit if the slaves have to break up a table or partition into granules compared to if they each just work on a subpartition. If you choose to pursue this, I would advise you to have the same number of hash subpartitions as the DOP and make it a power of two. For more details on this, see the PQ section of OOW2007 - "When to Use the Appropriate Database Technology".
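A hypothetical range-hash layout along those lines (column names, quarter boundaries, and the subpartition count of 16 are all placeholders - match the subpartition count to your DOP and keep it a power of two):

```sql
-- Range by quarter on sales_date, hash subpartitions on the join key
-- (cust_id here) so equi-joins on that key can go partition-wise.
CREATE TABLE sales_fact (
  sale_id    NUMBER,
  cust_id    NUMBER,
  sales_date DATE
)
PARTITION BY RANGE (sales_date)
SUBPARTITION BY HASH (cust_id) SUBPARTITIONS 16
(
  PARTITION q1_2008 VALUES LESS THAN (DATE '2008-04-01'),
  PARTITION q2_2008 VALUES LESS THAN (DATE '2008-07-01')
);
```

For the partition-wise join to happen, both tables in the join would need the same hash subpartitioning on the join key.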

Best of luck.

On Fri, Apr 25, 2008 at 10:41 AM, Ken Naim <> wrote:
> I have an "Insert append nologging as select" query that joins a 71 million
> row table against a partition of a multi-billion row table which has 700
> million rows which is directly specified using the extended syntax for
> partitions. Result set should be around 200 million rows. The explain plan
> is simple, full scan of the heap table and partition then a hash join of the
> two followed by a hash group by and finally the insert. My test box is a Sun
> box with 1 processor and 8 cores. I am explicitly issuing a parallel of 5 of
> each table (reduced from 10 each with the same symptoms) and the full scans and
> hash join(s) complete within 1 hour based on v$session_longops. Afterward, 5 of
> the slaves go to an inactive status and the other 5 show "PX Deq Credit: send blkd"
> with a p1text/p2text/p3text of sleeptime/senderid, passes, qref. My parallel
> message size is 16k. The stats tab in toad shows PX:local messages received
> of just over 1 million and a slowly increasing PX:local messages sent of
> 600,000 an hour and a half after the hash join finished. Cpu utilization is
> currently at 2-3%, 97% idle. I have searched through Metalink and Google and have
> read all the papers on parallel execution (including Doug Burns's "Suck It
> Dry") and am not sure where to go from here. Any ideas, theories, hints or
> links would be appreciated.
> As a side note. I was shot down for sub-partitioning or repartitioning to a
> lower level by the application vendor as it would hurt the performance of
> their engine. Currently the table is partitioned by quarter based on sales
> date and partitions vary between 30 million and 700 million records for
> reasons too numerous to explain.


Greg Rahn
Received on Fri Apr 25 2008 - 14:24:49 CDT
