Re: Limiting # of hits via SQL

From: DRathbun <drathbun_at_aol.com>
Date: 1995/06/13
Message-ID: <3rl03g$ll2_at_newsbf02.news.aol.com>#1/1


In article <JOHNS.95Jun13115024_at_adwyer.gsfc.nasa.gov>, johns_at_adwyer.gsfc.nasa.gov (Steve Johns) writes:

>Subject: Limiting # of hits via SQL
>
>Can someone mention to me how to tell SQLplus to only return the first
>'n' hits?
>
>We want to collect some stats automatically, and so we want a
>descending sort of access count on some DB items we serve, and we just
>want the top ten.

This is not a trivial process in SQL*Plus. You can limit the number of rows returned by adding a WHERE (or AND) ROWNUM <= 10 clause to your sql script, but that will only give you the first ten rows obtained by the query engine regardless of their relative positions in a sort clause. Adding an ORDER BY 'x' clause will order the resulting 10 rows. What you would really like is to ORDER BY _first_, then limit to the top 10 rows. I have managed to accomplish this in two different ways, depending on your data.

If the column that you want to sort by is a discrete value (i.e. no SUM operation required) then the following sql script might work for you:

SELECT col1, col2,..., access_count
  FROM foo
WHERE access_count IN

	(
	SELECT access_count
	  FROM foo foo1
	 WHERE 10 >
	  (SELECT count(*)
	     FROM foo foo2
	    WHERE foo2.access_count > foo1.access_count)
	)

ORDER BY access_count desc;

Every time I look at this I think that one of the > symbols should be a "<" instead, but if you try this code you should see that it works.

In normal language: obtain an ACCESS_COUNT value, then count how many other ACCESS_COUNT values are greater than that value. If there are 9 or fewer values that are greater than that value then that value is one of your top ten entries. However, if 10 or more values are greater than your selected value, then it can't be in the top 10 ('cause there are at least 10 values higher than it). Got it? (whew).

Note a couple of issues: 1) this is not an efficient algorithm for large tables (!) so be careful there; 2) If there is a 'tie' for 10th place then you will actually get 11 rows back - this is a feature or problem depending on your definition of the problem <grin>;

Now for the second part of the solution: if ACCESS_COUNT is not a discrete value, then you must create a view that does the SUM, such as:

CREATE VIEW stats_total
AS
SELECT col1, col2, ...,

         SUM(access_count) access_sum
  FROM foo
WHERE … << if required or appropriate >> GROUP BY col1, col2, ...

This serves to group your stats, which then allows you to apply the previous sql code against the view instead of the base table. If anybody else has a better solution (or even a different solution) I would appreciate feedback on this!

Regards,

Dave Rathbun
Integra Solutions
Dallas, TX
DRathbun_at_AOL.com Received on Tue Jun 13 1995 - 00:00:00 CEST

Original text of this message