| Oracle FAQ | Your Portal to the Oracle Knowledge Grid | |
|  |  | |||
Home -> Community -> Usenet -> c.d.o.misc -> Re: top N with respect to day and other field
On Aug 9, 2:54 pm, Nikolas Tautenhahn <virt..._at_gmx.de> wrote:
> Hi there,
>
> I have a little problem building a query which selects top-N tuples with
> respect to day and an other field.
>
> Tha base table contains aggregated data for a web server, there are
> different directories with pages inside and hits and this is aggregated
> for a whole day for each directory (which is represented by a foreign
> key) and for each page (also a fk).
> This base table might, for example, look like
> c_date    lup_page   lup_dir    num_hits
> 08.08.07  10         200        318
> 08.08.07  20         200        713
> 08.08.07  30         200        324
> 08.08.07  40         300        546
> 08.08.07  50         300        834
> 08.08.07  70         400        345
> 07.08.07  10         200        297
> 07.08.07  20         200        246
> 07.08.07  40         300        315
> 07.08.07  50         300        634
> 07.08.07  60         300        324
> 07.08.07  70         400        344
>
> i want to select from this table:
> the top-N pages for each directory in regard of hits for each day
>
> so, for example if we want the top-2 pages it should look like:
>
> 08.08.07  30         200        324
> 08.08.07  20         200        713
> 08.08.07  50         300        834
> 08.08.07  40         300        546
> 08.08.07  70         400        345
> 07.08.07  10         200        297
> 07.08.07  20         200        246
> 07.08.07  50         300        634
> 07.08.07  60         300        324
> 07.08.07  70         400        344
>
> the difficulties I had so far: For a fixed directory and a fixed date
> you would write
> select * from (select * from log where lup_dir=X and c_date=Y order by
> num_hits desc) where rownum <= 2
>
> But I have serious problems with modeling this in one closed statement
> for all lup_dirs and all dates...
>
> Oracle Version is 10.2.0.1.0
>
> Thanks for all help...
>
> Regards,
> N.
This might be a good application for analytical functions.  First, the
set up:
CREATE TABLE T1(
  C_DATE DATE,
  LUP_PAGE NUMBER(10),
  LUP_DIR NUMBER(10),
  NUM_HITS NUMBER(10));
INSERT INTO T1 VALUES ('08-AUG-2007',10,200,318);
INSERT INTO T1 VALUES ('08-AUG-2007',20,200,713);
INSERT INTO T1 VALUES ('08-AUG-2007',30,200,324);
INSERT INTO T1 VALUES ('08-AUG-2007',40,300,546);
INSERT INTO T1 VALUES ('08-AUG-2007',50,300,834);
INSERT INTO T1 VALUES ('08-AUG-2007',70,400,345);
INSERT INTO T1 VALUES ('07-AUG-2007',10,200,297);
INSERT INTO T1 VALUES ('07-AUG-2007',20,200,246);
INSERT INTO T1 VALUES ('07-AUG-2007',40,300,315);
INSERT INTO T1 VALUES ('07-AUG-2007',50,300,634);
INSERT INTO T1 VALUES ('07-AUG-2007',60,300,324);
INSERT INTO T1 VALUES ('07-AUG-2007',70,400,344);
Now, a quick test with the DENSE_RANK analytical function, segregating
the data any time C_DATE or LUP_DIR changes, and ordering by the
NUM_HITS in descending order:
LUP_PAGE, LUP_DIR, NUM_HITS,
C_DATE LUP_PAGE LUP_DIR NUM_HITS RANKING
--------- ---------- ---------- ---------- ---------- 07-AUG-07 10 200 297 1 07-AUG-07 20 200 246 2 07-AUG-07 50 300 634 1 07-AUG-07 60 300 324 2 07-AUG-07 40 300 315 3 07-AUG-07 70 400 344 1 08-AUG-07 20 200 713 1 08-AUG-07 30 200 324 2 08-AUG-07 10 200 318 3 08-AUG-07 50 300 834 1 08-AUG-07 40 300 546 2 08-AUG-07 70 400 345 1
Now, we need to restrict the rows returned to only those with RANKING
of 1 or 2.  It is not possible to place analytic functions in a WHERE
clause, so we need to slide the above SQL statement into an inline
view:
SELECT
  C_DATE,
LUP_PAGE, LUP_DIR, NUM_HITS
    LUP_PAGE,
    LUP_DIR,
    NUM_HITS,
--------- ---------- ---------- ---------- 08-AUG-07 20 200 713 08-AUG-07 30 200 324 08-AUG-07 50 300 834 08-AUG-07 40 300 546 08-AUG-07 70 400 345 07-AUG-07 10 200 297 07-AUG-07 20 200 246 07-AUG-07 50 300 634 07-AUG-07 60 300 324 07-AUG-07 70 400 344
Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
Received on Thu Aug 09 2007 - 20:28:45 CDT
|  |  |