Re: Tricky SQL Problem: Top "n" of queried records

From: x <tzadkiel_at_surfnetcity.com.au>
Date: 1998/03/22
Message-ID: <MPG.f7fcf3bf95124da989680_at_news.randori.com>#1/1


[This followup was posted to comp.databases.oracle and a copy was sent to the cited author.]

Back to the ranking question.

Once again, ROWNUM doesn't help us here. Rownum tells us the order in which rows were retrieved from the database BEFORE any sort/selection criteria are applied. Sorting by salary generally gives us results like:

deptno	salary	rownum
sales	40000	3
mis	30000	7
mfg	25000	1
acct	23000	4

.....

Trying to get the max salary through syntax like "where rownum = 1 order by salary" will sure 'nuff give us one row, but it's very unlikely to be the one we want.

Additionally, the ROWNUM result is not consistent. A database reorganisation, a record locked, etc. etc. can change the rownum associated with a row. Overall, it's just not as useful a psuedo-column as it appears it is at first. For all intents and purposes, you should consider it a unique number randomly applied across a query.

You can, however, get the result you are looking for with a little more complex query.

First, we get total salary by department with a basic group function: Q1:

select deptno, sum(salary) sumsal
from emp
group by deptno

Next, get the rankings within the query by joining this as a pair of identical in-line views

Q2:
select a.deptno, a.sumsal, count(*) ranking from {--Q1--} a, {--Q1--} b
where b.sumsal > a.sumsal

   or (b.sumsal = a.sumsal and b.deptno >= a.deptno) group by a.deptno, a.sumsal

This gives you the ranking you are actually looking for, and will be the same every time it is executed over the same data.

To limit it to the top three rows, you can add "having count(*) < 4" to the end of Q2, or you can next Q2 as an in-line view and have your construct as:

Q3:
select deptno, sumsal, ranking
from {--Q2--}
where ranking between 1 and 3
/

Or, expanding it to the actual SQL:

select deptno, sumsal, ranking

from 	(select a.deptno, a.sumsal, count(*) ranking
	from 	(select deptno, sum(salary) sumsal
		from emp
		group by deptno) a,
		(select deptno, sum(salary) sumsal
		from emp
		group by deptno) b
	where b.sumsal > a.sumsal
	   or (b.sumsal = a.sumsal and b.deptno >= a.deptno)
	group by a.deptno, a.sumsal)

where ranking between 1 and 3
/

Now, if you want to you can go buy the "101 SQL question" book that other guy keeps flogging and use whatever their answer is. I'm sure it's better than mine, since theirs is printed nice in a book and everything. They probably got fancy letters after their names, too.

I came up with this one on my own, though, and I'm pretty proud of it because I think it's pretty cool. I'm just a self-taught country programmer, but i do ok sometimes. And i feel pretty strongly that information is something better spread around than kept to one's ownself.

tzad

uncle tzad's house o' SQL

(uncle tzad is available to do oracle work or teach other people to do oracle work at what he thinks are reasonable rates. you can talk to him more at tzadkiel_at_surfnetcity.com.au )

In article <6elf5h$l0c$1_at_nnrp1.dejanews.com>, rtgraf_at_sintec.de says...
> How can I restrict the result of a SQL query to, say, the top 10 regarding
> some defined order criteria, e.g.
>
> SELECT deptno, sum(salary) FROM emp
> GROUP BY deptno
> <ONLY the top 3 w/regards to sum(salary)>;
>
> I do *not* want the client application to fetch only the first three records,
> I want the Server to retrieve exactly these rows.
>
> I have a rather quirky solution using a multiply nested subquery (one nesting
> level per "n" in the topic!)
>
> The problem is in fact a general one and I seem to have read a solution
> somewhere but forgot the source ;-(
>
> -----== Posted via Deja News, The Leader in Internet Discussion ==-----
> http://www.dejanews.com/ Now offering spam-free web-based newsreading
>
Received on Sun Mar 22 1998 - 00:00:00 CET

Original text of this message