Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Estimating Statistics > 50%

Re: Estimating Statistics > 50%

From: Jan Gelbrich <j_gelbrich_at_westfalen-blatt.de>
Date: Wed, 28 Jul 2004 14:19:08 +0200
Message-ID: <2mpk5bFplqh9U1@uni-berlin.de>


"Jeffrey Hunter" <jhunter_at_iDevelopment.info> schrieb im Newsbeitrag news:9dfc69b4.0407270737.56368d81_at_posting.google.com...
> Hello everyone. This is an issue that has always bothered me when
> reading the Oracle documentation on estimating statistics for the CBO.
> I am sure that this may have been talked about in the archives but was
> unable to find anything.
>
> In the SQL guide while defining the "analyze" command, it indicates
> that you can estimate statistics by providing a "sample" of the data
> to collect stats for. One comment they provide is:
>
> "If you specify more than half of the data, then Oracle reads all the
> data and computes the statistics."
>
> Well, what about the "percent" option and its range of 1 .. 99? Why
> would you specify *any* value >= 50% since by their comment, it would
> compute stats rather than estimate them?
>
> My assumption is that this comment only applies to sampling using the
> "rows" option and not the "percent" option.
>
> Am I correct in this assumption?
>
> Thanks in advance to everyone!
>
> -- jeff

Not being a guru about it, but an estimation with any sample rate remains an estimation of better or worse reliability, only a computation would be _fully reliable_. This is not only valid for Oracle, but for statistics in general. Estimation has some flavor of gambling. Someone created a saying
"Never beleive a statistic that I did not false by myself !" In Oracle, the wheel to play falsing is the sample rate.

In many cases, estimations may hit reality quite good; and then there are cases where estimations go terribly wrong, especially the smaller the sample is.

The reason to prefer estimations over computations is AFAIK only the limit of response time of that operation, so: the bigger Your tables are, the longer an analyze computation will take, and an estimation reduces this time dramatically.

My DBs are fairly small, so I analyze always by compute, not estimate.

Just my 2c, Jan. Received on Wed Jul 28 2004 - 07:19:08 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US