Re: Deleting all duplicate rows.
Date: 1996/11/10
Message-ID: <NEWTNews.847658943.20561.dafis_at_ppp.test.jccbi.gov>#1/1
In article <5650dq$al3_at_uwm.edu>, <gidwani_at_alpha1.csd.uwm.edu> writes:
> Path: news.ou.edu!news.nodak.edu!netnews1.nwnet.net!netnews.nwnet.net!arclight.uoregon.edu!nntp.primenet.com!uwm.edu!alpha1.csd.uwm.edu!gidwani
> From: gidwani_at_alpha1.csd.uwm.edu (Vinay K Gidwani)
> Newsgroups: comp.databases.oracle
> Subject: Deleting all duplicate rows.
> Date: 10 Nov 1996 16:38:18 GMT
> Organization: University of Wisconsin - Milwaukee, Computing Services
Division
> Lines: 32
> Distribution: world
> Message-ID: <5650dq$al3_at_uwm.edu>
> NNTP-Posting-Host: 129.89.169.1
> Keywords: Deleting all duplicate rows in a table
>
> I am trying to delete all duplicate rows in a table. The duplicates are to
> be determined on a few keys. The way I am thinking of implementing it
> now is
>
> Insert into temp_table as
> Select key1, key2, key3 from
> table
> group by key1, key2, key3
> having count(*) > 1
>
> Run a cursor through temp_table and delete all rows having
> keys key1, key2, key3 in table.
>
> This seems an overly complicated solution. Any suggestions on an
> easier way to do this.
>
> I have just started to work with Oracle and I would appreciate any
> suggestions on any good books. I am referring to George Koch
> Orcale THe Complete Reference which seems to be quite a
> comprehensive book.
>
> Thanks in Advance.
>
> Vinay.
>
>
>
>
>
>
> --
> email : gidwani_at_alpha1.csd.uwm.edu
>
You should probably create a UNIQUE Index over those key elements to prevent
duplicated from being added to the system. But first, look up the topic
"EXCEPTIONS" in you manuals and find the oracle script that build an
exceptions table. I can't remember the script, probably something like
PROCEXCP.SQL. After you create this table then...
If you add the line EXCEPTIONS INTO EXCEPTIONS at the bottom of the index build statement, then ALL rowid's for duplicate rows are put into the exceptions table.
Because ALL rows that have duplicate records are listed by row_id in the exceptions table, You need to be carefull. You only want to delete one of the duplicates from your table. I built a package that read the EXCPETIONS table and the did the following:
CURSOR FIND_DUPS IS SELECT KEY1, KEY2, KEY3, KEY4 FROM table_name WHERE ROW_ID IN (SELECT ROW_ID FROM EXCEPTIONS WHERE TABLE_NAME = 'table_name');
DUP_ROW = FIND_DUPS%ROW_TYPE;
CURSOR DUP_ROW IS
SELECT COUNT(*) FROM table_name A WHERE
- KEY1 = DUP_ROW.KEY1,
- KEY2 = DUP_ROW.KEY1,
- KEY3 = DUP_ROW.KEY1,
- KEY4 = DUP_ROW.KEY1; DUP_CNT = DUP_ROW%TYPE;
OPEN FIND_DUPS;
LOOP;
FETCH FIND_DUPS INTO DUP_ROW; EXIT WHEN FIND_DUP%NOTFOUND; OPEN DUP_ROW; FETCH DUP_ROW INTO DUP_CNT; IF DUP_CNT > 1 THEN DELETE FROM table_name WHERE ROWID IN ( SELECT ROW_ID FROM EXCEPTIONS WHERE TABLE_NAME = 'table_name'); END IF; CLOSE DUP_ROW;
END LOOP;
END procedure_name;
This is a delete by rowid, and will run fast.
This may seem more complicated, but this can be run anytime you think you may have duplicates. Just remember to truncate the exceptions table after you are finished.
Note the differences between ROWID AND ROW_ID. I can never remember which is the actual rowid system element associated with a given table and row. But if you describe the exceptions table, if the element is ROW_ID then my script above is correct. other wise just switch ROWID and ROW_ID in the above statement to detele records.
You can also 'select' the rows, 'select count(*)' the rows. Received on Sun Nov 10 1996 - 00:00:00 CET