Re: Tuning large aggregation query
From: Mohamed Houri <mohamed.houri_at_gmail.com>
Date: Fri, 4 Apr 2014 16:13:59 +0200
Message-ID: <CAJu8R6inOKthRo4RTVBGQeWbomamT1WvAz9TObHtyzaZVtFwtA_at_mail.gmail.com>
I can see in the plan that there are several operations that have not been started at all
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                   | Name                   | Starts | E-Rows | A-Rows |   A-Time   |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | MERGE STATEMENT                             |                        |      0 |        |      0 |00:00:00.01 |
|   1 |  MERGE                                      | TARGET_TABLE           |      0 |        |      0 |00:00:00.01 |
|   2 |   PX COORDINATOR                            |                        |      0 |        |      0 |00:00:00.01 |
|   3 |    PX SEND QC (RANDOM)                      | :TQ10002               |      0 |   4061K|      0 |00:00:00.01 |
|   4 |     VIEW                                    |                        |      0 |        |      0 |00:00:00.01 |
|   5 |      NESTED LOOPS                           |                        |      0 |   4061K|      0 |00:00:00.01 |
|   6 |       NESTED LOOPS                          |                        |      0 |   4061K|      0 |00:00:00.01 |
|   7 |        VIEW                                 |                        |      0 |   3900K|      0 |00:00:00.01 |
|   8 |         SORT GROUP BY                       |                        |      0 |   3900K|      0 |00:00:00.01 |
|   9 |          PX RECEIVE                         |                        |      0 |   3900K|      0 |00:00:00.01 |
|  10 |           PX SEND HASH                      | :TQ10001               |      0 |   3900K|      0 |00:00:00.01 |
|  11 |            NESTED LOOPS                     |                        |      1 |   3900K|   4135K|00:05:09.10 |
|  12 |             NESTED LOOPS                    |                        |      1 |   3900K|   4135K|00:02:26.62 |
|  13 |              BUFFER SORT                    |                        |      1 |        |    229K|00:01:29.17 |
|  14 |               PX RECEIVE                    |                        |      1 |        |    229K|00:01:28.21 |
|  15 |                PX SEND ROUND-ROBIN         | :TQ10000                |      0 |        |      0 |00:00:00.01 |
|  16 |                 TABLE ACCESS BY GLOBAL INDEX ROWID| BASE_TABLE      |      0 |   3900K|      0 |00:00:00.01 |
|  17 |                  INDEX FULL SCAN            | PK_BASE_TABLE          |      0 |   3900K|      0 |00:00:00.01 |
|  18 |              PARTITION LIST ITERATOR        |                        |    229K|      1 |   4135K|00:00:48.82 |
|* 19 |               INDEX RANGE SCAN              | IDX_SALES_DATA         |    229K|      1 |   4135K|00:00:36.71 |
|  20 |             TABLE ACCESS BY LOCAL INDEX ROWID| SALES_DATA            |   4135K|      1 |   4135K|00:02:21.94 |
|  21 |       PARTITION LIST ITERATOR               |                        |      0 |      1 |      0 |00:00:00.01 |
|* 22 |        INDEX RANGE SCAN                     | IDX_T1704_ADVSTOREITEM |      0 |      1 |      0 |00:00:00.01 |
|* 23 |      TABLE ACCESS BY LOCAL INDEX ROWID      | TARGET_TABLE           |      0 |      1 |      0 |00:00:00.01 |
----------------------------------------------------------------------------------------------------------------------------------
All the operations showing 0 in the Starts column have not been started at all. Among the remaining operations, operation 20 seems to be the most time-consuming one.

Do you have a plan that includes the predicate section?

What is the definition of the index IDX_SALES_DATA? The CBO is making a wrong cardinality estimate when accessing this index (1 estimated row versus 4135K actual rows), which might lead to a wrong plan, since this index range scan looks like the driving operation.
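As a sketch of how to get both pieces of information (assuming the statement was run with the gather_plan_statistics hint or with statistics_level = all, and using the index name from your plan):

```sql
-- Plan of the last statement executed in this session, with runtime
-- statistics and the Predicate Information section
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

-- Full DDL of the suspect index
select dbms_metadata.get_ddl('INDEX', 'IDX_SALES_DATA') from dual;

-- Or just its column list and ordering
select column_name, column_position
from   user_ind_columns
where  index_name = 'IDX_SALES_DATA'
order  by column_position;
```

The predicate section will show which join/filter predicates are applied at operation 19, which is what we need to judge the estimate.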
Best regards
Mohamed Houri
www.hourim.wordpress.com
2014-04-04 16:02 GMT+02:00 Kenneth Naim <kennethnaim_at_gmail.com>:
> What wait events are you seeing when the query is running? How many
> columns are in each of the tables?
>
> -----Original Message-----
> From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org]
> On Behalf Of Thomas Kellerer
> Sent: Friday, April 04, 2014 9:12 AM
> To: oracle-l_at_freelists.org
> Subject: Tuning large aggregation query
>
> Hi,
>
> for reporting purposes we are maintaining some denormalized,
> pre-aggregated tables in our OLTP database.
>
> These queries aggregate sales and other related data from several
> source tables into one aggregation table.
>
> Our problem is that the aggregation of some 70 million rows into
> approx. 4 million rows is quite slow.
>
> As far as I can tell, most of the time is lost doing the actual
> aggregation. The join itself is quite fast, but aggregating 70 million
> rows needs a huge amount of memory (or temp space).
>
> I uploaded the execution plan to pastebin to preserve formatting:
> http://pastebin.com/XfEvB3z9 (you need to disable word-wrapping though)
>
> The basic structure of the statement looks like this:
>
> select base_table.pk_col_1,
>        base_table.pk_col_2,
>        base_table.pk_col_3,
>        sum(....),
>        sum(...),
>        min(...),
>        .... and another approx. 100 columns ...
> from base_table
>   join sales_data on ...
> group by base_table.pk_col_1,
>          base_table.pk_col_2,
>          base_table.pk_col_3
>
> There is a 1:N relation between base_table and sales_data.
>
> Sales_data essentially stores sales information for products on a
> per-day, sale and store level. The goal of the aggregation is to sum
> up that daily information to a product/store level.
>
> sales_data contains about 70 million rows, base_table contains about 4
> million rows, so the result of the aggregation is again 4 million rows.
> sales_data is partitioned by the region in which the stores are
> located (and that partitioning cannot be changed).
>
> We applied the same partitioning scheme to base_table and target_table
> in the hope that this would make the parallel execution more efficient,
> but changing from unpartitioned to partitioned didn't improve the
> queries (it didn't make them worse either). We tried various indexes
> (in addition to the PK index that is already used for the join), but
> apparently for this type of query there isn't much an index can do.
>
> The main time is spent doing the aggregation. I confirmed that by
> running only the select statement that's inside the merge, and that
> takes nearly as long as the merge itself.
>
> We were able to improve the whole process a bit by using parallel
> execution, but the runtimes are still way too high for the "window" we
> have in the night to prepare the data. The target would be that a
> single merge like that runs in about 10 minutes; currently it's around
> 60 minutes.
>
> The test server runs Oracle 11.2.0.4 on a Windows 2008R2 server with
> 64GB RAM, an 8-spindle RAID 10 attached and 6 cores (12 CPUs).
>
> I can provide more Oracle configuration parameters if needed.
>
> The production server also runs Oracle 11.2.0.4 on Windows Server
> 2008R2 and has 32 cores (64 CPUs) and 192GB RAM. But we will not be
> able to test this on the real production hardware - and the number of
> rows in production will probably be 4-5 times bigger.
>
> I am looking for ideas on how to improve the runtime of those
> statements - just some pointers on what things I could try to either
> optimize the query itself or tune the Oracle configuration.
>
> Thanks
> Thomas
>
> --
> http://www.freelists.org/webpage/oracle-l
--
Respectfully,
Mohamed Houri

--
http://www.freelists.org/webpage/oracle-l

Received on Fri Apr 04 2014 - 16:13:59 CEST