Dan Tow's SQL formula

From: amonte <ax.mount_at_gmail.com>
Date: Sun, 22 Feb 2009 20:15:33 +0100
Message-ID: <85c1fb130902221115l4561e9ebk96e665826ac772ad_at_mail.gmail.com>



Hi

A few months ago I was asking some SQL tuning question and someone sggested me Dan Tow's SQL Tuning book release by Oreilly. So I bought the book a few weeks ago and just started to reading it.

I am a bit puzzled about the formula in page 30, calculating the filter selectivity for a query which searchs for last_name:

select ....
from customers
where last_name = 'SMITH'

And he suggested this formula to obtain the selectivity, sum of (n(i)/C') x
(n(i)/C) where n(i) is the count of rows wiyj ith nonnull last name (I
understand as count of all last_name) and C is count of all rows and C' is count of all nonnull last name which turns out to be following SQL statement
(which I dont understand)

SELECT SUM(COUNT(Last_Name)*COUNT(Last_Name)) /
(SUM(COUNT(Last_Name))*SUM(COUNT(*)))

FROM Customers
GROUP BY Last_Name;

To start with the formula is not exactly as described, there are are 3 SUM() instead of one.

To better my understanding I did a test with more familiar table which is scott.emp table with following query:

select sum(count(mgr)*count(mgr)) / (sum(count(mgr))*sum(count(*))) from emp
group by mgr

SUM(COUNT(MGR)*COUNT(MGR))/(SUM(COUNT(MGR))*SUM(COUNT(*)))


                                                .225274725


So I got a selectivity of roughly 0.225. Does not sound right to me because there are 6 managers and 13 employees (14 but one has no manager)

My question is really, does anyone understand this formula and actually use it?

If so please throw some lights :-)))

Thank you all

Alex

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Feb 22 2009 - 13:15:33 CST

Original text of this message