RE: Deletion from large table

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Tue, 23 Aug 2016 15:31:43 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D90150339439_at_exmbx05.thus.corp>



Andrew,

I think that might be a good idea for certain data patterns, index strategies and versions of Oracle (I think you've got the (+) on the wrong side of the join, by the way), but newer versions of the optimizer are capable of turning an outer join into an anti join, and with the quick and dirty test I ran I got the following execution paths (T1 is the big table, t2 is the small):

Original query: "not in (subquery)"



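------------------------------------------------------------------------------
| Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------
|   0 | DELETE STATEMENT      |      |       |       |   126 (100)|          |
|   1 |  DELETE               | T1   |       |       |            |          |
|*  2 |   HASH JOIN RIGHT ANTI|      | 47541 |  1114K|   126   (5)| 00:00:01 |
|   3 |    TABLE ACCESS FULL  | T2   |     6 |    54 |     2   (0)| 00:00:01 |
|   4 |    TABLE ACCESS FULL  | T1   | 63388 |   928K|   122   (4)| 00:00:01 |
------------------------------------------------------------------------------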

Your query: "in (outer join subquery)"



-------------------------------------------------------------------------------------
| Id  | Operation                | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | DELETE STATEMENT         |          |       |       |   259 (100)|          |
|   1 |  DELETE                  | T1       |       |       |            |          |
|*  2 |   HASH JOIN              |          |   154K|  3913K|   259   (9)| 00:00:02 |
|   3 |    VIEW                  | VW_NSO_1 |     6 |    66 |   134  (11)| 00:00:01 |
|   4 |     SORT UNIQUE          |          |     6 |   114 |   134  (11)| 00:00:01 |
|*  5 |      HASH JOIN RIGHT ANTI|          | 47541 |   882K|   126   (5)| 00:00:01 |
|   6 |       TABLE ACCESS FULL  | T2       |     6 |    54 |     2   (0)| 00:00:01 |
|   7 |       TABLE ACCESS FULL  | T1       | 63388 |   619K|   122   (4)| 00:00:01 |
|   8 |    TABLE ACCESS FULL     | T1       | 63388 |   928K|   122   (4)| 00:00:01 |
-------------------------------------------------------------------------------------

Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle
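
A minimal sketch of how plans like the two above could be generated for comparison. It assumes test tables named t1 (big) and t2 (small), each with a column called id; the column name is borrowed from the rest of the thread and is not confirmed for the test itself. EXPLAIN PLAN only records the plan and does not execute the delete.

```sql
-- Assumed test tables: t1 (big) and t2 (small), both with a column named id.
explain plan for
delete from t1
where  id not in (select id from t2);

-- Display the plan that was just explained (reads the default PLAN_TABLE).
select * from table(dbms_xplan.display);
```

The same two steps can be repeated with the outer-join form of the delete to compare the shapes of the two plans.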

From: Andrew Kerber [andrew.kerber_at_gmail.com]
Sent: 23 August 2016 15:57
To: Jonathan Lewis
Cc: JDunn_at_sefas.com; Chris Taylor; oracle-l_at_freelists.org
Subject: Re: Deletion from large table

I have generally had good performance with syntax like this:

delete from big_table where id in (select big_table.id from small_table, big_table where small_table.id=big_table.id (+) and small_table.id is null)
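
For reference, a sketch of the same idea with the (+) moved to the small_table side, which is what the reply at the top of this thread points out. The aliases b and s are added here for readability. With the marker on small_table, every big_table row is preserved by the join and the "is null" test keeps only those with no match.

```sql
-- Sketch only: outer join preserving big_table, then keep the unmatched rows.
delete from big_table
where  id in (
         select b.id
         from   big_table b, small_table s
         where  s.id (+) = b.id   -- (+) on small_table: keep every big_table row
         and    s.id is null      -- no matching row in small_table
       );
```

One difference from the "not exists" form is worth checking before relying on this: rows where big_table.id is null would never satisfy "id in (...)", whereas "not exists" would delete them.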

On Tue, Aug 23, 2016 at 9:38 AM, Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk> wrote:

Best access path does vary with circumstances.

If you're expecting lots of inserts while doing the deletes you may find that as the delete progresses the rate slows down and the volume of undo applied for read-consistency climbs. If you see that as a problem, finding an index that lets you walk the big table in reverse order of data arrival may (slightly counter-intuitively) improve performance.

Under any circumstances deleting by tablescan and deleting by index range scan behave differently with respect to index maintenance (this note on big updates also applies to big deletes: http://jonathanlewis.wordpress.com/2006/11/22/tuning-updates/ ).

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of John Dunn [JDunn_at_sefas.com]
Sent: 23 August 2016 14:39
To: Chris Taylor
Cc: oracle-l_at_freelists.org
Subject: RE: Deletion from large table

Unfortunately it’s a nightly thing….whilst updates are still going on….

John

From: Chris Taylor [mailto:christopherdtaylor1994_at_gmail.com]
Sent: 23 August 2016 14:38
To: John Dunn
Cc: oracle-l_at_freelists.org
Subject: Re: Deletion from large table

Is this a one time thing, or a regularly occurring thing? (A one time data cleanup versus a nightly routine)

If it's a one time data cleanup (or rarely needed), I'd recommend saving off the rows you want to keep into another table, truncating big_table, and then reloading the rows from the temporary table you created.
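
A rough sketch of that save, truncate and reload sequence. The staging table name big_table_keep is hypothetical, the join column id is taken from the original question, and the append hint is an optional extra rather than part of the suggestion. It assumes nothing else is modifying big_table while it runs, and that no foreign keys or other dependencies prevent the truncate.

```sql
-- 1. Save the rows to keep: those that DO have a match in small_table.
--    big_table_keep is a hypothetical staging table name.
create table big_table_keep as
select b.*
from   big_table b
where  exists (select 1 from small_table s where s.id = b.id);

-- 2. Empty the big table. TRUNCATE is DDL: it commits and cannot be rolled back.
truncate table big_table;

-- 3. Reload the saved rows (the append hint requests a direct-path insert).
insert /*+ append */ into big_table
select * from big_table_keep;

commit;

-- 4. Drop the staging table once the reload has been verified.
drop table big_table_keep;
```

Because the truncate cannot be undone, it seems prudent to check the row count in the staging table before step 2.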

Delete is one of the most expensive operations (if not THE most expensive) you can run in a database. I'm sure you're aware of that, but I wanted to mention it.

Chris

On Tue, Aug 23, 2016 at 5:17 AM, John Dunn <JDunn_at_sefas.com> wrote:

I need to delete large numbers of rows from a large table based upon whether a record exists in a small table.

I am currently using :

            delete from big_table b where not exists (select 1 from small_table s where s.id = b.id)

big_table may have up to 100,000 rows for the same id value. small_table will only have one row per id value

Is there a better way to code this?

John

--
Andrew W. Kerber

'If at first you dont succeed, dont take up skydiving.'

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Aug 23 2016 - 17:31:43 CEST
