RE: help with deleting duplicate records from very large table

From: Seley, Linda <LSeley_at_IQNavigator.com>
Date: Wed, 12 Sep 2001 16:09:44 -0700
Message-ID: <F001.0038C7F4.20010912162035@fatcity.com>

Suhen
-

Wouldn't it be more efficient to have something like:

delete
from invaudee i1

where rowid not in
 (select min(rowid)

from invaudee i2

where i1.unique_value = i2.unique_value);

You
have to do a full table scan of the table as part of the delete. With the query I proposed, if you have an index on invaudee then it should be used to do a range scan for the rowid. If you use the inner select you have it will do a full table scan for every row in invaudee. 
I just
tried it on one of my tables, here are my explain plans : 
for
delete from test_table t1 where rowid not in (select
min(rowid) from test_table
t2 group by
unique_value_1,unique_value_2)

<SPAN

class=157215522-12092001>EXPLAIN_PLAN                                                                                                             
OPT------------------------------------------------------------------------------------------------------------------------

------1.0 DELETE STATEMENT 1 Cost=

n/a

CHOOSE 2.1 DELETE TEST_TABLE
TEST_TABLE 3.1 FILTER
4.1 TABLE ACCESS FULL TEST_TABLE

TEST_TABLE      4.2 
FILTER        5.1 SORT GROUP 
BY          6.1 TABLE ACCESS

FULL TEST_TABLE TEST_TABLE

7 rows
selected.

for
delete from test_table t1 where rowid not in (select
min(rowid) from test_table
t2 where t2.unique_value_1 =
t1.unique_value_1)/

<SPAN

class=157215522-12092001>EXPLAIN_PLAN                                                                                                             
OPT------------------------------------------------------------------------------------------------------------------------

------1.0 DELETE STATEMENT 3 Cost=

n/a

CHOOSE 2.1 DELETE TEST_TABLE
TEST_TABLE 3.1 FILTER
4.1 TABLE ACCESS FULL TEST_TABLE
TEST_TABLE 4.2 SORT
AGGREGATE 5.1 INDEX RANGE SCAN
TEST_TABLE_N1 NON-UNIQUE TEST_TABLE_N1

6 rows
selected.

HTH

Linda

-----Original
Message-----From: Suhen Pather
[mailto:Suhen.Pather_at_strandbags.com.au]Sent: Tuesday, September 11, 2001 10:50 PMTo: Multiple recipients of list ORACLE-LSubject: help with deleting duplicate records from very large table

<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">List,
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I need to delete duplicate records from a very large table (60 millions records +).
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">There would be about 3 million duplicate entries.
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">What is the quickest way to do this?
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">The syntax that I am using is

<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">delete from
 invaudee
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">where rowid not in (select min(rowid) from invaudee
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">group by
 audit_number);
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">This is taking a long time to run. I cannot see any entries in v$transaction
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">for the delete.
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">There is no indexes on the INVAUDEE table.
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I created an index on the primary key column but it still takes forever to run.
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I do not have the space to CTAS.
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Or should I write the duplicates to an EXCEPTIONS table and perform the delete based on the entries

<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">in the EXCEPTIONS
 table.
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Any help would be greatly appreciated.
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Suhen
<SPAN

 style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> Received on Wed Sep 12 2001 - 18:09:44 CDT