How to create an id variable indicating group? [message #20999] |
Fri, 05 July 2002 05:59 |
karen
Messages: 19 Registered: January 2002
|
Junior Member |
|
|
Hello,
I'm using the following query to find duplicate records in my data set (maybe it's not the best, but it works -- please feel free to suggest something better):
create table dupes as (select * from
(select count(*) as cnt, * from
unew group by consumid, dos, dx1) where cnt > 1);
To help me process my duplicates, I would like to create a new variable that has the same value for each set of duplicates. For instance, if there are two rows with consumid=12345, dos=1/1/01, and dx1=12, then I'd want both rows to have a new variable called dupeid, each with the value of 1 (let's say). Let's say the next set of duplicates have these values: consumid=12346, dos=1/2/01, and dx1=13. If there are three duplicates of these values, then each row in my dupes table with these values would have dupeid=2. Obviously I don't care what dupeid is equal to as long as it uniquely identifies the duplicate groups. Can anyone help me?
thanks for any assistance,
karen
|
|
|
Re: How to create an id variable indicating group? [message #21005 is a reply to message #20999] |
Fri, 05 July 2002 13:19 |
|
Mahesh Rajendran
Messages: 10707 Registered: March 2002 Location: oracleDocoVille
|
Senior Member Account Moderator |
|
|
Okay.
I assuming the table emp here with these duplicates.
SQL> select empno,ename,deptno from emp;
EMPNO ENAME DEPTNO
---------- ---------- ----------
7369 SMITH 20
7499 ALLEN 30
7521 WARD 30
7566 JONES 20
7654 MARTIN 30
7698 BLAKE 30
7782 CLARK 10
7788 SCOTT 20
7839 KING 10
7844 TURNER 30
7876 ADAMS 20
7900 JAMES 30
7902 FORD 20
7934 MILLER 10
7782 CLARK 10
7839 KING 10
7934 MILLER 10
7369 SMITH 20
7566 JONES 20
7788 SCOTT 20
7876 ADAMS 20
7902 FORD 20
7782 CLARK 10
7839 KING 10
7934 MILLER 10
7782 CLARK 10
7839 KING 10
7934 MILLER 10
28 rows selected.
Now, this SQL gets only the count of duplicate rows
along with a generated dupeid (which is based on
a Unique column from the same table).
SQL> ed
Wrote file afiedt.buf
1 select empno,count(empno),'du'||empno Dupeid from emp where rowid not in
2 (
3 select max(rowid) from emp
4 group by empno)
5* group by empno
SQL> /
EMPNO COUNT(EMPNO) DUPEID
---------- ------------ ------------------------------------------
7369 1 du7369
7566 1 du7566
7782 3 du7782
7788 1 du7788
7839 3 du7839
7876 1 du7876
7902 1 du7902
7934 3 du7934
8 rows selected.
NOTE
from the above output, empno 7934 is showing 3 counts
(duplicated).Actually in table 7934 is found 4 times.
so, it is duplicated 3 times. OK?
if you want to show as empno 7934 is found 4 times
(ie..including the original and duplicates)
use the following.
SQL> ed
Wrote file afiedt.buf
1 select empno,count(empno)+1,'du'||empno Dupeid from emp where rowid not in
2 (
3 select max(rowid) from emp
4 group by empno)
5* group by empno
SQL> /
EMPNO COUNT(EMPNO)+1 DUPEID
---------- -------------- ------------------------------------------
7369 2 du7369
7566 2 du7566
7782 4 du7782
7788 2 du7788
7839 4 du7839
7876 2 du7876
7902 2 du7902
7934 4 du7934
If any of above is not helping you, Let me know.
|
|
|