Re: Dan Tow's SQL formula

From: amonte <ax.mount_at_gmail.com>
Date: Thu, 26 Feb 2009 08:12:33 +0100
Message-ID: <85c1fb130902252312g3519b0b8i6c7b2dfb31115a2f_at_mail.gmail.com>

Hi

I am having some private mailing with Dan, basically I came up with 3 possible selectivities

1. Dan's formula gets a selectivity of 0.225274725
2. 1/NDV, 0.166666667
3. With histograms for different value we get different result if we use literals and not binds so for example mgr = 7698 we would get a selectivity of *count(mgr) where mgr = 7698 / count(mgr)* which is 5/13 which is 0.384615385. But for mgr = 7782 we would get an selectivity of 0.076923077 (much better and lower selectivity) but with binds we know what is the problem, the bind peeking

Alex

On Thu, Feb 26, 2009 at 6:49 AM, Yong Huang <yong321_at_yahoo.com> wrote:

> [Resend. I'm forced to trim quoted text down when posting a message.
> Complete original message can be found at
> http://www.freelists.org/post/oracle-l/Dan-Tows-SQL-formula]
>
> > To better my understanding I did a test with more familiar table which is
> > scott.emp table with following query:
> >
> > select sum(count(mgr)*count(mgr)) / (sum(count(mgr))*sum(count(*)))
> > from emp
> > group by mgr
> >
> > SUM(COUNT(MGR)*COUNT(MGR))/(SUM(COUNT(MGR))*SUM(COUNT(*)))
> > ----------------------------------------------------------
> > .225274725
> >
> >
> > So I got a selectivity of roughly 0.225. Does not sound right to me
> because
> > there are 6 managers and 13 employees (14 but one has no manager)
> >
> > My question is really, does anyone understand this formula and actually
> use
> > it?
> >
> > If so please throw some lights :-)))
> >
> >
> > Thank you all
> >
> > Alex
>
> I don't know the answer. But Dan Tow's formula looks very much like
> Jonathan Lewis' calculation of density when he describes histograms (see his
> book on p.172), or Ari Mozes's patent 6732085 (
> http://www.freepatentsonline.com/6732085.html) where he says "density can
> be calculated as the sum of the square of the repetition counts for
> non-popular values divided by the product of the number of rows in the table
> and the number of non-popular values in the table", which is kind of beyond
> my knowledge.
>
> Unfortunately, the number you get by applying Dan's formula to scott.emp,
> .225274725, is NOT the density for the mgr column, which is .038461538 on my
> database. But it's close to index selectivity (ix_sel) shown in 10053 trace
> when an index is created on mgr column and a query has "where mgr = <some
> number>" is parsed.
>
> Yong Huang
>
>
>
>
> --
> http://www.freelists.org/webpage/oracle-l
>
>
>

```--
http://www.freelists.org/webpage/oracle-l
```
Received on Thu Feb 26 2009 - 01:12:33 CST

Original text of this message