Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.misc -> Re: top N with respect to day and other field

Re: top N with respect to day and other field

From: Charles Hooper <hooperc2000_at_yahoo.com>
Date: Thu, 09 Aug 2007 18:28:45 -0700
Message-ID: <1186709325.848699.130350@z24g2000prh.googlegroups.com>


On Aug 9, 2:54 pm, Nikolas Tautenhahn <virt..._at_gmx.de> wrote:
> Hi there,
>
> I have a little problem building a query which selects top-N tuples with
> respect to day and an other field.
>
> Tha base table contains aggregated data for a web server, there are
> different directories with pages inside and hits and this is aggregated
> for a whole day for each directory (which is represented by a foreign
> key) and for each page (also a fk).
> This base table might, for example, look like
> c_date lup_page lup_dir num_hits
> 08.08.07 10 200 318
> 08.08.07 20 200 713
> 08.08.07 30 200 324
> 08.08.07 40 300 546
> 08.08.07 50 300 834
> 08.08.07 70 400 345
> 07.08.07 10 200 297
> 07.08.07 20 200 246
> 07.08.07 40 300 315
> 07.08.07 50 300 634
> 07.08.07 60 300 324
> 07.08.07 70 400 344
>
> i want to select from this table:
> the top-N pages for each directory in regard of hits for each day
>
> so, for example if we want the top-2 pages it should look like:
>
> 08.08.07 30 200 324
> 08.08.07 20 200 713
> 08.08.07 50 300 834
> 08.08.07 40 300 546
> 08.08.07 70 400 345
> 07.08.07 10 200 297
> 07.08.07 20 200 246
> 07.08.07 50 300 634
> 07.08.07 60 300 324
> 07.08.07 70 400 344
>
> the difficulties I had so far: For a fixed directory and a fixed date
> you would write
> select * from (select * from log where lup_dir=X and c_date=Y order by
> num_hits desc) where rownum <= 2
>
> But I have serious problems with modeling this in one closed statement
> for all lup_dirs and all dates...
>
> Oracle Version is 10.2.0.1.0
>
> Thanks for all help...
>
> Regards,
> N.

This might be a good application for analytical functions. First, the set up:
CREATE TABLE T1(
  C_DATE DATE,

  LUP_PAGE NUMBER(10),
  LUP_DIR NUMBER(10),
  NUM_HITS NUMBER(10));

INSERT INTO T1 VALUES ('08-AUG-2007',10,200,318);
INSERT INTO T1 VALUES ('08-AUG-2007',20,200,713);
INSERT INTO T1 VALUES ('08-AUG-2007',30,200,324);
INSERT INTO T1 VALUES ('08-AUG-2007',40,300,546);
INSERT INTO T1 VALUES ('08-AUG-2007',50,300,834);
INSERT INTO T1 VALUES ('08-AUG-2007',70,400,345);
INSERT INTO T1 VALUES ('07-AUG-2007',10,200,297);
INSERT INTO T1 VALUES ('07-AUG-2007',20,200,246);
INSERT INTO T1 VALUES ('07-AUG-2007',40,300,315);
INSERT INTO T1 VALUES ('07-AUG-2007',50,300,634);
INSERT INTO T1 VALUES ('07-AUG-2007',60,300,324); INSERT INTO T1 VALUES ('07-AUG-2007',70,400,344); Now, a quick test with the DENSE_RANK analytical function, segregating the data any time C_DATE or LUP_DIR changes, and ordering by the NUM_HITS in descending order:
SELECT
  C_DATE,
  LUP_PAGE,
  LUP_DIR,
  NUM_HITS,

  DENSE_RANK() OVER (PARTITION BY C_DATE,LUP_DIR ORDER BY NUM_HITS DESC) RANKING
FROM
  T1;

C_DATE LUP_PAGE LUP_DIR NUM_HITS RANKING

--------- ---------- ---------- ---------- ----------
07-AUG-07         10        200        297          1
07-AUG-07         20        200        246          2
07-AUG-07         50        300        634          1
07-AUG-07         60        300        324          2
07-AUG-07         40        300        315          3
07-AUG-07         70        400        344          1
08-AUG-07         20        200        713          1
08-AUG-07         30        200        324          2
08-AUG-07         10        200        318          3
08-AUG-07         50        300        834          1
08-AUG-07         40        300        546          2
08-AUG-07         70        400        345          1

Now, we need to restrict the rows returned to only those with RANKING of 1 or 2. It is not possible to place analytic functions in a WHERE clause, so we need to slide the above SQL statement into an inline view:
SELECT
  C_DATE,

  LUP_PAGE,
  LUP_DIR,
  NUM_HITS

FROM
  (SELECT
    C_DATE,
    LUP_PAGE,
    LUP_DIR,
    NUM_HITS,

    DENSE_RANK() OVER (PARTITION BY C_DATE,LUP_DIR ORDER BY NUM_HITS DESC) RANKING
  FROM
    T1)
  WHERE
    RANKING<=2
ORDER BY
  C_DATE DESC,
  LUP_DIR,
  NUM_HITS DESC; C_DATE LUP_PAGE LUP_DIR NUM_HITS
--------- ---------- ---------- ----------
08-AUG-07         20        200        713
08-AUG-07         30        200        324
08-AUG-07         50        300        834
08-AUG-07         40        300        546
08-AUG-07         70        400        345
07-AUG-07         10        200        297
07-AUG-07         20        200        246
07-AUG-07         50        300        634
07-AUG-07         60        300        324
07-AUG-07         70        400        344

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc. Received on Thu Aug 09 2007 - 20:28:45 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US