Home » SQL & PL/SQL » SQL & PL/SQL » Using analytical functions (Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bit Production)
Using analytical functions [message #353072] |
Fri, 10 October 2008 22:15 |
jgreenleaf
Messages: 9 Registered: September 2008
|
Junior Member |
|
|
Hello everyone,
I am trying to understand analytical functions but I'm having some trouble. A problem I have been posed with is to "determine the top three zip codes where most of the students live using an analytical function." I believe I need to utilize DENSE_RANK() but I'm not sure. Any suggestions or examples would be appreciated! I have attached the student table.
|
|
|
|
|
|
|
Re: Using analytical functions [message #353188 is a reply to message #353072] |
Sun, 12 October 2008 12:14 |
jgreenleaf
Messages: 9 Registered: September 2008
|
Junior Member |
|
|
Thanks Kevin!
Let me show you what I've done so far although incorrect maybe you can point me in the right direction.
What I am trying to accomplish is to find the top 3 zip codes where most students live. The output is supposed to yield 10 rows. There are 268 students in the table. I have verified that I am supposed to be using a dense_rank() function but something about my grouping is not right. Here are some queries I have tried:
SELECT *
FROM (SELECT Zip,
Student_Id,
Dense_rank() OVER(PARTITION BY Zip ORDER BY Student_Id) AS Rank
FROM Student)
WHERE Rank <= 3;
ZIP STUDENT_ID RANK
----- ---------- ----------
11561 357 1
11565 257 1
11565 259 2
11565 275 3
11576 170 1
11717 315 1
11743 345 1
11746 234 1
11746 279 2
11746 380 3
11758 359 1
...
-- It goes on for 237 rows.
I have also tried:
SELECT *
FROM (SELECT Zip,
Dense_rank() OVER(ORDER BY Zip) AS Rank
FROM Student)
WHERE Rank <= 3;
ZIP RANK
----- ----------
01247 1
02124 2
02155 3
What I would really like to accomplish is something like the following example:
SELECT deptno,
ename,
sal,
RANK() OVER (PARTITION BY deptno ORDER BY sal) "rank"
FROM emp
WHERE deptno = 30;
DEPTNO ENAME SAL rank
---------- ---------- ---------- ----------
30 JAMES 950 1
30 WARD 1250 2
30 MARTIN 1250 2
30 TURNER 1500 4
30 ALLEN 1600 5
30 BLAKE 2850 6
It seems like when I partition based on zip and then try to order it by something else it doesn't come out right. I also tried to partition and order by zip and it just ranks everything as 1. Any pointers would be greatly appreciated!
Thank you.
|
|
|
Re: Using analytical functions [message #353195 is a reply to message #353188] |
Sun, 12 October 2008 13:01 |
|
Michel Cadot
Messages: 68718 Registered: March 2007 Location: Saint-Maur, France, https...
|
Senior Member Account Moderator |
|
|
Quote: | What I am trying to accomplish is to find the top 3 zip codes where most students live. The output is supposed to yield 10 rows.
|
2 points:
- Why 10 rows? Top 3 seems to return 3 rows (unless you want those of same "rank" but this is not the same top 3)
- "most students" seems to mean you have to first count the number of students in each zip code before knowing which ones are in the top 3
Regards
Michel
[Updated on: Sun, 12 October 2008 13:04] Report message to a moderator
|
|
|
Re: Using analytical functions [message #353199 is a reply to message #353072] |
Sun, 12 October 2008 13:35 |
jgreenleaf
Messages: 9 Registered: September 2008
|
Junior Member |
|
|
Ya I know that 10 rows sounds weird but you are right Michel they are expecting a listing of zipcodes with the same rank which is why I've been trying to use the dense_rank function as it will assign duplicate values to the same rank.
I also think you are correct with your second point about counting the number of students with the same zipcode. I've been thinking about that a lot, trying to think of a way to group them. Each student_id is unique so it seems I can't group by that. It seems like I should be grouping by zipcode (zip) but haven't really figured out how to do that. Rules of GROUP BY are different when using analytical functions right?
Thanks again!
|
|
|
|
Re: Using analytical functions [message #353220 is a reply to message #353072] |
Sun, 12 October 2008 23:56 |
|
Kevin Meade
Messages: 2103 Registered: December 1999 Location: Connecticut USA
|
Senior Member |
|
|
First lets us define what kind of query we are talking about. You want to do what is known as a TOP-N query. The problem with a top-n query is that they are often ambiguous. Consider the following question:
Quote: | show me the top three paid employees
|
The following scenarios show how this question is ambiguous:
create table employee (empno number,ename varchar2(10),salary number(9,2));
insert into employee values (1,'joe',1000);
insert into employee values (2,'jon',1000);
insert into employee values (3,'jan',1000);
insert into employee values (4,'jim',1000);
What is the answer for the above data? Is it:
Quote: | 1) joe,jon,jan (or some other combination of three)
2) joe,jon,jan,jim (because all are in the top three salarys)
|
Consider this scenario
delete from employee;
insert into employee values (1,'joe',1000);
insert into employee values (2,'jon',1000);
insert into employee values (3,'jan',1000);
insert into employee values (4,'jim',1000);
insert intp employee values (11,'sue',2000);
insert into employee values (21,'sal',2000);
insert into employee values (31,'set',2000);
insert into employee values (41,'sow',2000);
What is the answer for the data above?
Quote: | 1) sow,set,sal (or some other combination)
2) why not joe,jan,jan (or some combination) (these emps all have salaries in the top three)
3) joe,jon,jan,jim,sue,sal,set,sow (same reason as #2)
|
Also most interesting is this scenario:
delete from employee;
insert into employee values (1,'joe',1000);
insert into employee values (2,'jon',2000);
insert into employee values (3,'jan',2000);
insert into employee values (4,'jim',2000);
Who are your top 3 paid employess?:
Quote: | 1) jim,jan,jon
2) jim,jan,jon,joe
|
So, to do a top-n query, you have to know exactly what you are looking for because interpretations abound and all are valid.
Try going to ASKTOMHOME and search for:
Quote: | dense_rank
topn
top n
|
Kevin
|
|
|
Goto Forum:
Current Time: Wed Dec 04 18:37:07 CST 2024
|