Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Deleting duplicates from 11 million records

Re: Deleting duplicates from 11 million records

From: Robert Prendin <rprendin_at_magi.com>
Date: Fri, 05 Jun 1998 00:14:24 GMT
Message-ID: <357736df.26577907@news.istar.ca>


lock table <table_name> in exclusive mode nowait;

delete <table_name a> where rowid in (select max(rowid) from <sametable_name b>
where b.dup_col1 = a.dup_col1

and    b.dup_col2  =  a.dup_col2
and    b.dup_col3  =  a.dup_col3
and    b.dup_col4  =  a.dup_col4);

obviously you have to define which columns make up your duplicate records...

this will remove all duplicate records, if you have more than 2 identical records, you will need to run it more than once... If possible, do this during some down time and bring a large rollback segment on line and use it by issuing the following before the delete statement:

 set transaction use rollback segment <big_rollback_segment_name>;

AND just in case, take an export of the table as well..

Good Luck,

Robert Prendin

"Suresh Bhat" <suresh.bhat_at_mitchell-energy.com> wrote:

>I have to delete duplicates from a 3 column table that has 11 million
>records.
>The table has index on columns TAG_DATE and TAG_NAME. The data in the
>table is for the past 60 days.
>
>The criterion for duplicates is having same TAG_DATE with timestamp and
>TAG_NAME.
>
>I am trying to use the time honored delete duplicates SQL script given
>below, but
>I have to do it by date ranges and it is too slow which has always been the
>case
>with this script.
>
>set doc off wrap on pause off
>/*
>The commented select statement below was for testing to make sure
>that I do have duplicates.
>select tag_name, to_date(tag_date,'dd-mon-yy hh24:mi:ss'), tag_value
>*/
>delete
> from T outer
> where rowid > (select min(rowid)
> from T
> where tag_name = outer.tag_name
> and tag_date = outer.tag_date
> )
> and tag_date between to_date('01-may-98 00:00:00','dd-mon-yy
>hh24:mi:ss')
> and to_date('02-may-98
>00:00:00','dd-mon-yy hh24:mi:ss')
>/
>
>Is there another faster way to do this either via SQL or PL/SQL. I would
>appreciate
>posting of any scripts you may have.
>
>Thank you very much.
>
>
>suresh.bhat_at_mitchell-energy.com
>
>
>
Received on Thu Jun 04 1998 - 19:14:24 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US