the incredible shrinking sample size

From: Robyn <robyn.sands_at_gmail.com>
Date: Fri, 13 Jun 2008 17:06:09 -0400
Message-ID: <ece8554c0806131406x45a090c1p80b0815c1e05f5a9@mail.gmail.com>


Happy Friday to all ...

I'm testing a new set of stats gathering scripts on an older database ( 9.2.0.6). I've used the auto sample size in the past and found it to capture a reasonable sample size. However this time, the sample size is decreasing dramatically after the 3rd or 4th collection.

This statement was used to capture stats yesterday:

exec sys.dbms_stats.gather_schema_stats(ownname =>'XXXXXX', estimate_percent=>DBMS_STATS.AUTO_SAMPLE_SIZE,CASCADE => FALSE,method_opt
=>'FOR COLUMNS SIZE 1',granularity =>'ALL',degree
=>DBMS_STATS.DEFAULT_DEGREE,options =>'GATHER STALE');

Results for the tables in question were:

Jun 12th

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
MON
------------------------------ ---------- ----------- -------------------
---
XXXXXXXX                           445663       58258 2008-06-11 15:02:47
YES
XXXXXXXXXX_XXXX                 358909200     5134743 2008-06-11 15:41:30
YES
XXXXXXXXXX_XXXXXXX              273789954       54170 2008-06-11 15:53:31
YES These samples sizes were already a little smaller than the last collection, but the summary job that uses them completed in under 2 minutes so life was good. All three tables are partitioned and new partitions are added each night. Index stats are captured separately at 100% and a separate script is run to capture stats on newly loaded partitions (by looking for partitions with 0 rows), but I wanted to add a statement to collect stats for objects that had not been analyzed at all.

So today, this command was executed:

exec sys.dbms_stats.gather_schema_stats(ownname
=>'XXXXX',estimate_percent=>100,

CASCADE => TRUE, method_opt =>'FOR COLUMNS SIZE 1',granularity
=>'ALL',degree =>DBMS_STATS.DEFAULT_DEGREE,options =>'GATHER EMPTY');

And these are the new results for the tables:

Jun 13th

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
MON
------------------------------ ---------- ----------- -------------------
---
XXXXXXXX                           472398        5896 2008-06-13 16:02:29
YES
XXXXXXXXXX_XXXX                 377727480        6149 2008-06-13 16:04:26
YES
XXXXXXXXXX_XXXXXXX              284470673        5581 2008-06-13 16:05:26
YES I meant to only get stats on objects that had never been analyzed however, since I left the granularity at ALL, the global stats for the tables with new partitions were regathered. (at least that's my interpretation) That I can fix.

However .... the greatly reduced sample size is not acceptable. Until now, auto sample sizes worked well for me - small tables were calculated at 100% and larger ones at a reasonable percentage. Tonight's summarization job would return to nested loops if I left these stats in place, which would be a very ugly thing. (Fortunately, I can restore the old version.)

Has anyone else seen this before? Is this a 9.2.0.6 thing or a side effect of the data or did I just get lucky with auto until now?

tia ... Robyn

-- 
I may not have gone where I intended to go, but I think I have ended up
where I needed to be.
Douglas Adams

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Jun 13 2008 - 16:06:09 CDT

Original text of this message