deleting with a join [message #385083] |
Fri, 06 February 2009 06:35 |
frog9697
Messages: 18 Registered: December 2008
|
Junior Member |
|
|
Hi there,
I was about to execute the following command:
delete from tableA where UniqueID not in (select UniqueID from tableB);
so basically removing all rows from tableA which don't match with an entry in table B. It strikes me that this is not a very good way to execute this command and that using some sort of join would be better. However, I'm getting a bit confused as to the syntax I need to do that. Can anyone help?
Thanks,
F
|
|
|
|
Re: deleting with a join [message #385089 is a reply to message #385083] |
Fri, 06 February 2009 06:42 |
trivendra
Messages: 211 Registered: October 2007 Location: Phoenix
|
Senior Member |
|
|
Yes as suggested by joicejohn
DELETE FROM tablea a
WHERE NOT EXISTS (SELECT NULL
FROM tableb b
WHERE a.uniqueid = b.uniqueid)
Thanks
Trivendra
|
|
|
|
Re: deleting with a join [message #385093 is a reply to message #385083] |
Fri, 06 February 2009 06:56 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
|
Senior Member |
|
|
It's not a bad way of doing it if the UniqueId column is indexed.
There are two other ways of doing it:
The first (assuming you're on 10g) is to use the DELETE option in the Update clause of a MERGE statement. This could well be faster than the standard delete for large tables.
The second way would be to delete from a view.
Unless you've got your referential integrity constraints set up correctly, you'll have to use the BYPASS_UJVC hint here. If you were using this in an update, you could get into trouble if your view returned the same rows more than once, but with a delete, it's not nearly so important.
Even so, you shouldn't use method 2 until you've read up on and understood what the hint does.
create table test_90 (col_1 number, col_2 number);
create table test_91 (col_1 number);
insert into test_90 values (1,10);
insert into test_90 values (2,10);
insert into test_90 values (3,10);
insert into test_91 values (1);
commit;
merge into test_90 t90
using (select col_1 from test_91) t91
on (t90.col_1 = t91.col_1)
when matched then update set col_2 = null delete where col_1 is not null;
select * from test_90;
rollback;
delete /*+ bypass_ujvc */
(select t90.*
from test_90 t90
,test_91 t91
where t90.col_1 =t91.col_1);
select * from test_90;
|
|
|
|
|
Re: deleting with a join [message #385114 is a reply to message #385083] |
Fri, 06 February 2009 08:02 |
frog9697
Messages: 18 Registered: December 2008
|
Junior Member |
|
|
Hi,
I had a quick go at executing both Trivendra and Michel' ssuggestions. It appears that both will take a very long time, i.e. several days! (despite a unique index on Uniqueid present on both tables). Both tables are very large (tableA is 180m rows and tableB 44m rows. Is there any way I can use a more efficient query.
|
|
|
|
Re: deleting with a join [message #385119 is a reply to message #385100] |
Fri, 06 February 2009 08:20 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
|
Senior Member |
|
|
Quote: | NEVER use this hint
|
I have to disagree. This hint is used in Oracle published material: Example
Tom Kyte talks about how to use it here
Now, he also says it's on his list of 'Evil, do not use' hints, but he also says Quote: | be careful, as long as the join columns in the source table are unique (they are, the group by does
that), this is "sort of safe", you might find the hint disappears in the future (i'd say unlikely)
|
It exists in Metalink (See Bug 5407528)
I wouldn't use it in production code, but in a non-production environment, I would and have used it several times.
|
|
|
|
|
Re: deleting with a join [message #385132 is a reply to message #385127] |
Fri, 06 February 2009 09:22 |
S.Rajaram
Messages: 1027 Registered: October 2006 Location: United Kingdom
|
Senior Member |
|
|
Operation Object Name Rows Bytes Cost Object Node In/Out PStart PStop
DELETE STATEMENT Optimizer Mode=ALL_ROWS 150 M 608284
DELETE schema.tableA
MERGE JOIN ANTI 150 M 2G 608284
INDEX FULL SCAN schema.uniqueid_indexA 194 M 2G 431349
SORT UNIQUE 44 M 293 M 176935
INDEX FAST FULL SCAN schema.uniqueid_indexB 44 M 293 M 23839
Well according to the optimiser it estimates it is going to delete 150M rows from the 194M row table which is nearly 75% of the data is getting deleted. Is that estimate correct. If not could you please execute the following query along with the results and post it over here.
select count(*) from table_a a, table_b b
where a.unique_id = b.unique_id (+)
and
b.unique_id is null;
Regards
Raj
[Updated on: Fri, 06 February 2009 09:24] Report message to a moderator
|
|
|
Re: deleting with a join [message #385135 is a reply to message #385132] |
Fri, 06 February 2009 09:31 |
frog9697
Messages: 18 Registered: December 2008
|
Junior Member |
|
|
Hi Raj,
Yes, that is correct. I expect that tableA will have 44M rows - thus making it the same size as table B - when the statement has completed.
Regards,
F
|
|
|
Re: deleting with a join [message #385138 is a reply to message #385135] |
Fri, 06 February 2009 09:42 |
JRowbottom
Messages: 5933 Registered: June 2006 Location: Sunny North Yorkshire, ho...
|
Senior Member |
|
|
It would almost certainly be quicker to do this:
1) Create a new table holding the data that you want to keep:CREATE TABLE tablea_temp as select * from tablea where uniqueid in (select uniqueid from tableb);
2) Truncate tableA, (possibly after disabling any FK constraints that point to it
3) Replace the data in tablea:INSERT /*+ append */ into tablea select * from tablea_temp;
4) Drop tablea_temp and re-enable any dropped constraints.
|
|
|
|
Re: deleting with a join [message #385144 is a reply to message #385135] |
Fri, 06 February 2009 10:11 |
S.Rajaram
Messages: 1027 Registered: October 2006 Location: United Kingdom
|
Senior Member |
|
|
If you have license for partitioning I will be tempted to use the exchange partition approach.
Steps will be similar what @JRowbottom as described
Step a) Create a temp partitioned table with single partition and insert the rows you are interested.
Step b) Truncate the source table
Step c) Exchange partition between the temp table and the main table.
Advantage you have is oracle don't need to re-insert 44Million two times. For more information about partition exchange refer the oracle reference manual.
Again as I mentioned earlier check whether you have the license for partitioning.
Hope this helps.
Regards
Raj
|
|
|
|