Re: duplicate rows in big table

From: Steve McDaniels <steve.mcdaniels_at_sierra.com>
Date: Wed, 30 Aug 2000 14:57:41 -0700
Message-ID: <8ojvsv$ap$1@spiney.sierra.com>

On a 17M row table, I use this strategy:

create an exceptions table:

create table exceptions
(row_id rowid,
owner varchar2(30),
table_name varchar2(30),
constraint varchar2(30));

alter table T add constraint pk_T primary key (KEY_FIELD) using index tablespace ...
storage (...)
exceptions into EXCEPTIONS;

This will unsuccessfully try to implement the constraint and will put offending rowids in EXCEPTIONS.

Notice that this is for a primary key constraint. The same business works for ANY CONSTRAINT, so, alter table T add constraint ..... unique(field1,field2,...) exceptions into...
will produce a list of offending rowids.

This strategy works well when the number of offending rows is small. You must stil go through the EXCEPTIONS table to determine which of the two offending rows should be saved.

Then, delete from T where T.rowid in (select row_id from EXCPETIONS) or
create a simple PROCEDURE with a cursor which steps through EXCEPTIONS deleting rows from T.

good luck.

"James" <jialong.x.xie_at_boeing.com> wrote in message news:G04B63.4HD_at_news.boeing.com...
> I have a table, which has more than one million rows but a few duplicates.
> Is there any quick way to single out these duplicates? The normal way that
> compares rowid by rowid is too time consuming. Any help will be
appreciated.
>
> James
>
>
Received on Wed Aug 30 2000 - 16:57:41 CDT