Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: help with deleting duplicate records from very large table

RE: help with deleting duplicate records from very large table

From: Seley, Linda <LSeley_at_IQNavigator.com>
Date: Wed, 12 Sep 2001 16:09:44 -0700
Message-ID: <F001.0038C7F4.20010912162035@fatcity.com>

Suhen
-
<SPAN
class=157215522-12092001> 
<SPAN
class=157215522-12092001>Wouldn't it be more efficient to have something like:
<SPAN
class=157215522-12092001> 
delete
from invaudee i1
 
where rowid not in
<SPAN
class=157215522-12092001>    (select min(rowid)

<SPAN
class=157215522-12092001>    from invaudee i2

<SPAN
class=157215522-12092001>    where i1.unique_value = i2.unique_value);
<SPAN
class=157215522-12092001> 
You
have to do a full table scan of the table as part of the delete.  With the query I proposed, if you have an index on invaudee then it should be used to do a range scan for the rowid.  If you use the inner select you have it will do a full table scan for every row in invaudee. <SPAN
class=157215522-12092001> 
I just
tried it on one of my tables, here are my explain plans : <SPAN
class=157215522-12092001> 
for
delete from test_table t1    where rowid not in      (select
min(rowid)      from test_table
t2      group by
unique_value_1,unique_value_2)
<SPAN
class=157215522-12092001> 
<SPAN

class=157215522-12092001>EXPLAIN_PLAN                                                                                                             
OPT------------------------------------------------------------------------------------------------------------------------ 
------1.0 DELETE STATEMENT    1 Cost=
n/a                                                                                      
CHOOSE  2.1 DELETE  TEST_TABLE 
TEST_TABLE    3.1 FILTER     
4.1 TABLE ACCESS FULL TEST_TABLE 
TEST_TABLE      4.2 
FILTER        5.1 SORT GROUP 
BY          6.1 TABLE ACCESS 

FULL TEST_TABLE  TEST_TABLE
 
7 rows
selected.
<SPAN
class=157215522-12092001> 
for
delete from test_table t1    where rowid not in      (select
min(rowid)      from test_table
t2      where t2.unique_value_1 =
t1.unique_value_1)/
<SPAN
class=157215522-12092001> 
<SPAN
class=157215522-12092001>EXPLAIN_PLAN                                                                                                             
OPT------------------------------------------------------------------------------------------------------------------------ 
------1.0 DELETE STATEMENT    3 Cost=
n/a                                                                                      
CHOOSE  2.1 DELETE  TEST_TABLE 
TEST_TABLE    3.1 FILTER     
4.1 TABLE ACCESS FULL TEST_TABLE 
TEST_TABLE      4.2 SORT
AGGREGATE        5.1 INDEX RANGE SCAN
TEST_TABLE_N1 NON-UNIQUE TEST_TABLE_N1
 
6 rows
selected.
<SPAN
class=157215522-12092001> 
<SPAN
class=157215522-12092001>HTH
<SPAN
class=157215522-12092001> 
<SPAN
class=157215522-12092001>Linda
<SPAN
class=157215522-12092001> 
<SPAN
class=157215522-12092001>-----Original
Message-----From: Suhen Pather
[mailto:Suhen.Pather_at_strandbags.com.au]Sent: Tuesday, September 11, 2001 10:50 PMTo: Multiple recipients of list ORACLE-LSubject: help with deleting duplicate records from very large table   

<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">List,
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I need to delete duplicate records   from a very large table (60 millions records +).
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">There would be about 3 million   duplicate entries.
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">What is the quickest way to do   this?
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">The syntax that I am using is   

<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">delete<FONT   face=Arial> from
  invaudee
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">where<FONT   face=Arial> rowid not in (select min(rowid)   from invaudee
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">group<FONT   face=Arial> by
  audit_number);
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">This is taking a long time to run.   I cannot see any entries in v$transaction
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">for the   delete.
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">There is no   indexes on the INVAUDEE table.
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I created an index on the primary   key column but it still takes forever to run.
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">I do not have the space to   CTAS.
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Or should I write the duplicates   to an EXCEPTIONS table and perform the delete based on the entries   

<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">in<FONT   face=Arial> the EXCEPTIONS
  table.
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Any help would be greatly   appreciated.
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt"> 
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">Suhen
<SPAN

  style="FONT-FAMILY: Arial; FONT-SIZE: 10pt">  Received on Wed Sep 12 2001 - 18:09:44 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US