RE: FIND DUPLICATE KEYS

From: John Hawkins <johnh_at_nnshh127.uucp>
Date: Wed, 14 Apr 1993 18:12:17 GMT
Message-ID: <C5HJwI.31L_at_news.rich.bnr.ca>


One can select the duplicate rows based on rowid (everything except the highest-rowid row in each duplicate set, i.e. the rows you would go on to delete). This variation seems to be pretty fast:

select last_name, rowid
from   EMP outer
where  rowid < (select max(rowid)
                from   EMP
                where  last_name = outer.last_name)
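
The same correlated subquery can be turned into the delete step the original poster asked about. A sketch only (it assumes the same EMP table and a single last_name key, and keeps the row with the highest rowid in each duplicate set):

delete from EMP e
where  rowid < (select max(rowid)
                from   EMP
                where  last_name = e.last_name)
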
John Hawkins

Original posting:

To: ggoodrum_at_sgihbtn.sierra.com (Glenn Goodrum)
Subject: Re: Identify duplicate keys
In article <1993Apr5.150607.22307_at_sierra.com>, you write:
|> In article <C4o9C6.8AC_at_da_vinci.it.uswc.uswest.com>, Jeffrey L. DeMent <jld_at_advtech.uswest.com> writes:
|> |> Trying to create an index on an Oracle V6 table -- but it's failing
|> |> because of duplicate keys. Is there a slick/fast/easy way to identify
|> |> the rows with duplicate keys (for subsequent deleting !) ?
|> |>
|> |> Thanks.
|>
|> There have been many replies suggesting SQL queries based on GROUP BY. This is
|> fine if you have only a thousand or so key values. However, if you have hundreds
|> of thousands, then the GROUP BY mechanism will require an enormous amount of
|> temporary sort space.
|>
|> Another way to accomplish the result without any temporary space is as follows:
|>
|> SELECT A.ROWID, A.KEY, B.ROWID
|> FROM [tablename] A, [tablename] B
|> WHERE A.key_1 = B.key_1
|> ...
|> AND A.key_n = B.key_n
|> AND A.ROWID != B.ROWID
|>
|> Note that this does depend on the existence of a non-unique index on at least one
|> key column in order to run in a reasonable amount of time.
|>
|> Glenn Goodrum
|> Sierra Geophysics
|> ggoodrum_at_sierra.com
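
For reference, the GROUP BY style mentioned above is the usual HAVING COUNT(*) > 1 query; a sketch against the same illustrative EMP/last_name example (substitute the real key columns):

select last_name, count(*)
from   EMP
group by last_name
having count(*) > 1

As Glenn notes, the sort behind the GROUP BY is where the temporary space goes once the number of distinct keys gets large, which is what the self-join above is designed to avoid.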

-- 
John Hawkins       || #define disclaimer(X) fprintf(stderr," X\n");
Northern Telecom   || Voice: 615-734-4468                 /* Able was I, ere */
200 Athens Way     || Fax:   615-734-4771                 /* I saw Elba      */
Nashville TN 37728 || Internet:  nvjhh01_at_NT.COM           /* Who?            */