Re: Periodic Stats Collection -- CBO Stats Myth? -- delta refresh of statistics

From: Tim Gorman <tim_at_evdbt.com>
Date: Mon, 06 Oct 2008 13:39:10 -0600
Message-ID: <48EA695E.8060203@evdbt.com>




  


Jaromir,

As usual, it depends on what you mean by "delta refresh" of the statistics.  Using the MONITORING table attribute and the GATHER STALE options in DBMS_STATS, you can gather stats on only tables which have had more than 10% of rows inserted, updated, and/or deleted, which is a one way to perform "delta refresh".  If large tables are partitioned, then you can use GATHER STALE to only gather statistics on those partitions which have changed by 10%.  For large tables, I generally only gather statistics at the partition and sub-partition level, then use my procedure SUMMARIZE_STATISTICS (available online at http://www.EvDBT.com/summarize_stats.sql) to summarize partition-level statistics to the "global" level.  SUMMARIZE_STATS doesn't gather statistics, it "sets" them (using DBMS_STATS) by summarizing/averaging partition-level statistics to generate "global" stats.

Hope this helps...
Tim Gorman
consultant - Evergreen Database Technologies, Inc.
P.O. Box 630791, Highlands Ranch CO  80163-0791
website   = http://www.EvDBT.com/
email     = Tim@EvDBT.com
mobile    = +1-303-885-4526
fax       = +1-303-484-3608
Yahoo IM  = tim_evdbt


jaromir nemec wrote:
Hello Tim, Wolfgang,
 

> Just how inaccurate is part of the game-theory that we all use when deciding between COMPUTE STATISTICS and ESTIMATE STATISTICS.
 
I’m wondering why Oracle doesn’t support the third option: delta refresh of the statistics. With the strong support of materialized view and fast refresh, I can imagine this will be a very similar concept. Currently if I use the terms of MV only full refresh of statistics is possible (or a full refresh based on sample).
 
> …and determine the min and max values for columns based on the values in the rows returned by the sample.
 
The min and max will be solved natively. Possible, some obstacles will arise by distinct values and histograms. But if I understand it correctly, there already exist some algorithm solving this for maintaining the global statistics of a partition table after gathering of the statistics of a single partition. (I’m not sure if this algorithm is published).
Of course there will be data categories where the delta refresh would not work very precise, but I guess for majority of business data it could yield good results.
 

Any comments?
 
Regards,
 
Jaromir D.B. Nemec
-- http://www.freelists.org/webpage/oracle-l Received on Mon Oct 06 2008 - 14:39:10 CDT

Original text of this message