RE: Tuning large aggregation query

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Tue, 15 Apr 2014 08:26:46 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D901DE8B85_at_exmbx05.thus.corp>



Thomas,

The easy one first - the commas between the hints are there because I was writing them as a list in a simple sentence in English and wasn't thinking of the explicit construction of the final hint set.  Ignore them.

Getting a hash join is, at least, a step in the right direction; but we need to know what the distribution method is, and how the outline defines it.  Can you show us the execution plan pulled from memory, or from the plan_table, when you get a hash join, and include the 'outline' option in the call to dbms_xplan?  (Let's have the predicate section as well.)
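For anyone following along, the call being asked for might look something like the sketch below (the sql_id shown is purely illustrative; for the statement just executed in your own session you can pass nulls):

    -- Plan of the last statement run in this session, with outline and predicates:
    select * from table(dbms_xplan.display_cursor(null, null, 'TYPICAL OUTLINE'));

    -- Or for a specific cursor, supply its sql_id and child number
    -- ('abcd1234efgh5' is a made-up example):
    -- select * from table(dbms_xplan.display_cursor('abcd1234efgh5', 0, 'TYPICAL OUTLINE'));

The 'OUTLINE' keyword adds the full set of hints the optimizer recorded for the plan; the default 'TYPICAL' format already includes the predicate section.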



Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle

________________________________________
From: Thomas Kellerer [thomas.kellerer_at_mgm-tp.com]
Sent: 14 April 2014 14:06
To: Jonathan Lewis
Subject: Re: Tuning large aggregation query

Jonathan,

thanks for your answer, and please excuse the late reply.

Unfortunately I could not get your execution plan using the hints you suggested.

I can force a hash join instead of the nested loops, but that doesn't improve things either.

Btw: what exactly does the comma in the hints do?

"leading (t1 t2), use_hash(t2)" generates a completely different execution plan than "leading (t1 t2) use_hash(t2)"

I couldn't find anything in the manual regarding the meaning of a comma between two hints.

Thanks again
Thomas Kellerer

Jonathan Lewis, 09.04.2014 07:46:
>
> If you go back temporarily to the select and ignore the merge, I think you need a plan that looks something like this:
>
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> | Id  | Operation                              | Name     | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     | Pstart| Pstop |    TQ  |IN-OUT| PQ Distrib |
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> |   0 | SELECT STATEMENT                       |          | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |        |      |            |
> |   1 |  PX COORDINATOR                        |          |       |       |       |            |          |       |       |        |      |            |
> |   2 |   PX SEND QC (RANDOM)                  | :TQ10002 | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |  Q1,02 | P->S | QC (RAND)  |
> |   3 |    HASH GROUP BY                       |          | 49589 |  1743K|    51M|  4135   (3)| 00:00:21 |       |       |  Q1,02 | PCWP |            |
> |   4 |     PX RECEIVE                         |          | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |  Q1,02 | PCWP |            |
> |   5 |      PX SEND HASH                      | :TQ10001 | 49589 |  1743K|       |  4135   (3)| 00:00:21 |       |       |  Q1,01 | P->P | HASH       |
> |   6 |       HASH GROUP BY                    |          | 49589 |  1743K|    51M|  4135   (3)| 00:00:21 |       |       |  Q1,01 | PCWP |            |
> |*  7 |        HASH JOIN                       |          |  1122K|    38M|       |  3142   (2)| 00:00:16 |       |       |  Q1,01 | PCWP |            |
> |   8 |         BUFFER SORT                    |          |       |       |       |            |          |       |       |  Q1,01 | PCWC |            |
> |   9 |          PART JOIN FILTER CREATE       | :BF0000  | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |  Q1,01 | PCWP |            |
> |  10 |           PX RECEIVE                   |          | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |  Q1,01 | PCWP |            |
> |  11 |            PX SEND PARTITION (KEY)     | :TQ10000 | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |        | S->P | PART (KEY) |
> |  12 |             TABLE ACCESS BY INDEX ROWID| T1       | 70129 |  1369K|       |  2561   (1)| 00:00:13 |       |       |        |      |            |
> |  13 |              INDEX FULL SCAN           | T1_UK    | 70129 |       |       |   236   (2)| 00:00:02 |       |       |        |      |            |
> |  14 |         PX PARTITION LIST JOIN-FILTER  |          |  1122K|    17M|       |   575   (3)| 00:00:03 |:BF0000|:BF0000|  Q1,01 | PCWC |            |
> |  15 |          TABLE ACCESS FULL             | T2       |  1122K|    17M|       |   575   (3)| 00:00:03 |:BF0000|:BF0000|  Q1,01 | PCWP |            |
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>
> I think I've emulated your requirement (on 11.2.0.4): t1 is not partitioned but has a unique index starting with the "region" column;
> t2 is list partitioned by "region" and there's a massive skew in the regions, with two very popular regions (separate partitions)
> and about 20 other regions in the default partition.
>
> The critical step is the PX SEND PARTITION (KEY), which I got by adding the pq_distribute() hint.
> My hints were:  leading (t1 t2), use_hash(t2) no_swap_join_inputs(t2), pq_distribute(t2 partition none)
>
> The last hint effectively tells the optimizer to distribute the t1 data based on
> the partition key of the t2 data so that the t2 data doesn't have to be distributed.
>
>
> Regards
> Jonathan Lewis
> http://jonathanlewis.wordpress.com
> _at_jloracle
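
For context, the hint set quoted above would sit in the select something like this (a sketch only, not the original query -- the tables, join column and aggregate are illustrative, following the t1/t2/"region" test case Jonathan describes):

    select /*+ leading(t1 t2)
               use_hash(t2)
               no_swap_join_inputs(t2)
               pq_distribute(t2 partition none) */
           t1.region, sum(t2.amount)
    from   t1
    join   t2 on t2.region = t1.region
    group  by t1.region;

With "partition none" the rows from t1 are distributed to the PX slaves according to t2's partition key, so each slave can scan its own t2 partitions locally and t2 itself need not be redistributed.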
--
http://www.freelists.org/webpage/oracle-l Received on Tue Apr 15 2014 - 10:26:46 CEST