the incredible shrinking sample size

From: Robyn <>
Date: Fri, 13 Jun 2008 17:06:09 -0400
Message-ID: <>

Happy Friday to all ...

I'm testing a new set of stats-gathering scripts on an older database. I've used the auto sample size in the past and found it to capture a reasonable sample size. This time, however, the sample size is decreasing dramatically after the 3rd or 4th collection.

This statement was used to capture stats yesterday:

exec sys.dbms_stats.gather_schema_stats(ownname => 'XXXXXX', estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, CASCADE => FALSE, method_opt => 'FOR COLUMNS SIZE 1', granularity => 'ALL', degree ...

Results for the tables in question were:

Jun 12th

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- -------------------
XXXXXXXX                           445663       58258 2008-06-11 15:02:47
XXXXXXXXXX_XXXX                 358909200     5134743 2008-06-11 15:41:30
XXXXXXXXXX_XXXXXXX              273789954       54170 2008-06-11 15:53:31
These sample sizes were already a little smaller than in the last collection, but the summary job that uses them completed in under 2 minutes, so life was good. All three tables are partitioned and new partitions are added each night. Index stats are captured separately at 100 percent, and a separate script is run to capture stats on newly loaded partitions (by looking for partitions with 0 rows), but I wanted to add a statement to collect stats for objects that had not been analyzed at all.
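If the goal is to gather stats only on objects that have never been analyzed, DBMS_STATS has an OPTIONS parameter for exactly that ('GATHER EMPTY'), which should leave already-analyzed tables untouched. A sketch, with the schema name as a placeholder and granularity/degree left to match your standards:

    exec sys.dbms_stats.gather_schema_stats( -
         ownname          => 'XXXXXX', -
         options          => 'GATHER EMPTY', -
         estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE, -
         cascade          => FALSE, -
         method_opt       => 'FOR COLUMNS SIZE 1');

With OPTIONS left at its default ('GATHER'), every object in the schema is a candidate regardless of its existing stats.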

So today, this command was executed:

exec sys.dbms_stats.gather_schema_stats(ownname ... CASCADE => TRUE, method_opt => 'FOR COLUMNS SIZE 1', granularity ...

And these are the new results for the tables:

Jun 13th

TABLE_NAME                       NUM_ROWS SAMPLE_SIZE LAST_ANALYZED
------------------------------ ---------- ----------- -------------------
XXXXXXXX                           472398        5896 2008-06-13 16:02:29
XXXXXXXXXX_XXXX                 377727480        6149 2008-06-13 16:04:26
XXXXXXXXXX_XXXXXXX              284470673        5581 2008-06-13 16:05:26
I meant to gather stats only on objects that had never been analyzed; however, since I left the granularity at ALL, the global stats for the tables with new partitions were regathered (at least that's my interpretation). That I can fix.
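One way to check that interpretation is to compare LAST_ANALYZED at the table level against the partition level; if the global row is fresh but the individual partitions are not, the global stats were regathered. A sketch, with the owner and table name as placeholders:

    select table_name, num_rows, sample_size, last_analyzed
      from dba_tables
     where owner = 'XXXXXX';

    select partition_name, num_rows, sample_size, last_analyzed
      from dba_tab_partitions
     where table_owner = 'XXXXXX'
       and table_name  = 'XXXXXXXX'
     order by partition_position;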

However ... the greatly reduced sample size is not acceptable. Until now, auto sample sizes have worked well for me: small tables were calculated at 100 percent and larger ones at a reasonable percentage. Tonight's summarization job would revert to nested loops if I left these stats in place, which would be a very ugly thing. (Fortunately, I can restore the old version.)
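For the restore, if the good stats were exported to a statistics table beforehand, DBMS_STATS can bring them back; on 10g and later, RESTORE_SCHEMA_STATS can also roll stats back to a point in time without a prior export. A sketch, with the schema and the stat table name (STATS_BACKUP) as placeholders:

    -- taken before the bad gather:
    exec dbms_stats.create_stat_table(ownname => 'XXXXXX', stattab => 'STATS_BACKUP');
    exec dbms_stats.export_schema_stats(ownname => 'XXXXXX', stattab => 'STATS_BACKUP');

    -- to roll back afterward:
    exec dbms_stats.import_schema_stats(ownname => 'XXXXXX', stattab => 'STATS_BACKUP');

    -- 10g+ alternative, using stats history:
    exec dbms_stats.restore_schema_stats(ownname => 'XXXXXX', -
         as_of_timestamp => systimestamp - interval '1' day);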

Has anyone else seen this before? Is this a known issue, a side effect of the data, or did I just get lucky with auto sample size until now?

tia ... Robyn

I may not have gone where I intended to go, but I think I have ended up
where I needed to be.
Douglas Adams

Received on Fri Jun 13 2008 - 16:06:09 CDT
