Optimizer assitance needed

From: Masterhit <industrialstr_at_yahoo.com>
Date: 3 Aug 2004 14:14:50 -0700
Message-ID: <f3b449ab.0408031314.761308bb@posting.google.com>

We have a table of ~ 1.25 billion rows which is very heavily inserted into (grows at over 40 million rows a week), never (yet) deleted from and is regularly queried. (I know we should look into partitioning this one probably .. but that's for a later discussion).

In the process of re-analyzed the table some things went 'bad'.

(deleted old stats 1st)
exec dbms_stats.gather_table_stats(ownname =>'IBS', tabname =>'SALES_DETAIL', method_opt =>'FOR COLUMNS TRANSACTION_TYPE SIZE 9', cascade =>TRUE, degree => 10, estimate_percent => 1) (Subsequently I started using .01 pct which is still 125K rows)

I did this in test first (where ethe table is ~1 billion rows). I checked my most common queries and batch processes for slowdowns and didn't find any. I then ran the *same* gather statement in production and found no problems.

Later the same day I got numerous complaint of 'slowness' and found that a certain query (below) from a remote system was using a full scan in production. I checked in test and it was using the PK index. Our temp space filled and things started getting really slow.

To my knowledge the data is not skewed on any of the columns except those I specified. Prod and test are within 20% of each other in size. The test system is weaker on horsepower (8 cpu vs 32 I think) and resources - but it strikes me as odd it would, given the same statistic gathering call produce the correct plan whereas PROD did not. (FYI the full scan takes many minutes as compared to the index range in TEST).

Location_id is a range of about 8000 values - roughly evenly distributed. Sales_date is also evenly distributed over the past 8 months and transaction type is skewed over 9 values (the bulk of the rows having 3 of the values)

I hinted the query in prod and it ran very fast. I then re-gathered in PROD with 'ALL INDEXED COLUMNS' (or obviously 'ALL COLUMNS SIZE AUTO') and the problem went away in prod (correct plan now).

The table: (PK is made up of the 6 NOT NULL columns - yes I wish they had made it a sequence... but oh well)

Name Null? Type

 LOCATION_ID                        NOT NULL NUMBER(5)
 PROCESS_DATE                       NOT NULL DATE
 SALES_DATE                         NOT NULL DATE
 TRANSACTION_TYPE                   NOT NULL VARCHAR2(10)
 SKU                                NOT NULL VARCHAR2(10)
 PRICE_POINT                        NOT NULL NUMBER(12,2)
 UNIT_CNT                                    NUMBER(12)
 EXT_COST_AMT                                NUMBER(14,4)
 EXT_RETAIL_AMT                              NUMBER(12,2)
 CREATE_SOURCE                               VARCHAR2(30)
 CREATE_DATE                                 DATE
 LAST_UPDATE_SOURCE                          VARCHAR2(30)
 LAST_UPDATE_DATE                            DATE

The Query:

SELECT various columns

       FROM sales_detail sd
      WHERE sd.location_id = 749
        AND sd.sales_date = '26-JUN-2004'
        AND sd.transaction_type = 'S'

GROUP BY location_id, sd.transaction_type

| Id  | Operation                    |  Name            | Rows  |

Bytes | Cost |

|   0 | SELECT STATEMENT             |                  |     1 |   
51 | 20387 |
|   1 |  SORT GROUP BY NOSORT        |                  |     1 |   
51 | 20387 |
|   2 |   TABLE ACCESS BY INDEX ROWID| SALES_DETAIL     | 99050 | 
4933K| 20387 |
|*  3 |    INDEX RANGE SCAN          | PK_SALES_DETAIL  | 39620 |

| 19780 |

Predicate Information (identified by operation id):

3 - access("SD"."LOCATION_ID"=749 AND "SD"."SALES_DATE"='2004-06-26' AND

              "SD"."TRANSACTION_TYPE"='S')
       filter("SD"."SALES_DATE"='2004-06-26' AND

"SD"."TRANSACTION_TYPE"='S')

|   0 | SELECT STATEMENT     |               | 99050 |  4933K| 21468 |
       |      |            |
|   1 |  SORT GROUP BY NOSORT|               | 99050 |  4933K| 21468 |
       |      |            |

|* 2 | TABLE ACCESS FULL | SALES_DETAIL | 99050 | 4933K| 21468 | 83,00 | P->S | QC (RAND) |

Predicate Information (identified by operation id):

2 - filter("SD"."LOCATION_ID"=749 AND "SD"."SALES_DATE"='2004-06-26' AND
"SD"."TRANSACTION_TYPE"='S') I don't know what else to include.. my chief question is what can I check to determine why PROD chose the plan it did vs TEST? The init.ora params are the same (except regarding #processors) Test is a clone of PROD only about a month old (stale). Received on Tue Aug 03 2004 - 16:14:50 CDT