Re: Multiple use of RANK analytic in a single query
Date: Tue, 15 Nov 2011 03:52:25 -0800 (PST)
Message-ID: <37de7f07-58f6-454d-997b-8b5fdc470592_at_m7g2000vbc.googlegroups.com>
On Nov 13, 7:53 pm, Geoff Muldoon <geoff.muld..._at_trap.gmail.com> wrote:
> Hi all,
>
> Using: 11.2.0.1 on RHEL5 64bit RAC cluster ...
>
> I'm a novice in using the RANK analytical function, struggling with the
> basic syntax of it let alone trying to figure out if what I'm trying to
> achieve is possible, maybe even using subqueries ...
>
> Data like this:
>
> TX ID DEPT LOCATION LOAD
> 1 99 A NY 12
> 2 99 A LA 10
> 3 99 B LA 05
> 4 77 B LA 15
> 5 77 C NY 12
> 6 77 D LA 11
>
> Desired result like this:
>
> TX ID DEPT DEPT_RANK LOCATION LOC_RANK LOAD
> 1 99 A 1 NY 2 12
> 2 99 A 1 LA 1 10
> 3 99 B 2 LA 1 05
> 4 77 B 1 LA 1 15
> 5 77 C 2 NY 2 12
> 6 77 D 3 LA 1 11
>
> DEPT_RANK for ID 99 is 1 for A because sum(LOAD) = 22 is the max
> LOC_RANK for ID 99 is 1 for LA because sum(LOAD) = 15 is the max
> ... etc
>
> Is this actually possible, and if it is could anyone please assist with
> an example of the proper use of the WITHIN and/or PARTITION BY parts of
> RANK to achieve it?
>
> TIA,
>
> Geoff M
Geoff,
A sample table with test data would be helpful. For your first post
in this thread:
CREATE TABLE T1 (
TX NUMBER,
ID NUMBER,
DEPT VARCHAR2(1),
LOCATION VARCHAR2(2),
LOAD NUMBER);
INSERT INTO T1 VALUES (1,99,'A','NY',12); INSERT INTO T1 VALUES (2,99,'A','LA',10); INSERT INTO T1 VALUES (3,99,'B','LA',05); INSERT INTO T1 VALUES (4,77,'B','LA',15); INSERT INTO T1 VALUES (5,77,'C','NY',12); INSERT INTO T1 VALUES (6,77,'D','LA',11);
I think that what needs to be done is to build the solution in
stages. First, let's calculate the SUM values for the rows - I took a
best guess at the partition clause using the raw data that you
provided and the expected output:
SELECT
TX,
ID,
DEPT,
SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
LOCATION,
SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
LOAD
FROM
T1
ORDER BY
TX;
TX ID D SUM_LOAD_ID LO SUM_LOAD_LOCATION LOAD
--- --- - ----------- -- ----------------- ----- 1 99 A 22 NY 12 12 2 99 A 22 LA 15 10 3 99 B 5 LA 15 5 4 77 B 15 LA 26 15 5 77 C 12 NY 12 12 6 77 D 11 LA 26 11
Hopefully, you agree with the calculated SUM values shown above. If
we then take the above SQL statement and slide it into an inline view,
we are able to rank the values in the SUM_LOAD_ID and
SUM_LOAD_LOCATION columns. First, let's try something that does not
work:
SELECT
TX,
ID,
DEPT,
RANK() OVER (PARTITION BY ID,DEPT ORDER BY SUM_LOAD_ID DESC)
DEPT_RANK,
LOCATION,
RANK() OVER (PARTITION BY ID,DEPT ORDER BY SUM_LOAD_LOCATION DESC)
LOC_RANK,
LOAD
FROM
(SELECT
TX,
ID,
DEPT,
SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
LOCATION,
SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
LOAD
FROM
T1)
ORDER BY
TX;
TX ID D DEPT_RANK LO LOC_RANK LOAD --- --- - ---------- -- ---------- -----
1 99 A 1 NY 2 12 2 99 A 1 LA 1 10 3 99 B 1 LA 1 5 4 77 B 1 LA 1 15 5 77 C 1 NY 1 12 6 77 D 1 LA 1 11
Quite clearly, the above did not work. We want to rank the
SUM_LOAD_ID and SUM_LOAD_LOCATION columns within each ID partition.
Let's try again:
SELECT
TX,
ID,
DEPT,
RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC) DEPT_RANK,
LOCATION,
RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC)
LOC_RANK,
LOAD
FROM
(SELECT
TX,
ID,
DEPT,
SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
LOCATION,
SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
LOAD
FROM
T1)
ORDER BY
TX;
TX ID D DEPT_RANK LO LOC_RANK LOAD --- --- - ---------- -- ---------- -----
1 99 A 1 NY 3 12 2 99 A 1 LA 1 10 3 99 B 3 LA 1 5 4 77 B 1 LA 1 15 5 77 C 2 NY 3 12 6 77 D 3 LA 1 11
The above is closer to the desired result - except that rows with
equal values in the SUM_LOAD_ID and SUM_LOAD_LOCATION columns cause a
ranking number to be skipped. If we change the RANK() function to a
ROW_NUMBER() function then no numbers will be skipped, but then rows
with equal values will be assigned a different rank. The DENSE_RANK()
analytic function should yield the desired results:
SELECT
TX,
ID,
DEPT,
DENSE_RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_ID DESC)
DEPT_RANK,
LOCATION,
DENSE_RANK() OVER (PARTITION BY ID ORDER BY SUM_LOAD_LOCATION DESC)
LOC_RANK,
LOAD
FROM
(SELECT
TX,
ID,
DEPT,
SUM(LOAD) OVER (PARTITION BY ID,DEPT) SUM_LOAD_ID,
LOCATION,
SUM(LOAD) OVER (PARTITION BY ID,LOCATION) SUM_LOAD_LOCATION,
LOAD
FROM
T1)
ORDER BY
TX;
TX ID D DEPT_RANK LO LOC_RANK LOAD --- --- - ---------- -- ---------- -----
1 99 A 1 NY 2 12 2 99 A 1 LA 1 10 3 99 B 2 LA 1 5 4 77 B 1 LA 1 15 5 77 C 2 NY 2 12 6 77 D 3 LA 1 11
Are you able to provide a sample table with data for your real-ish code? It is difficult to determine the original data values from the two output examples that you provided.
Charles Hooper
http://hoopercharles.wordpress.com/
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.
Received on Tue Nov 15 2011 - 05:52:25 CST