RE: Deletion from large table

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Thu, 25 Aug 2016 07:32:07 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D90150339757_at_exmbx05.thus.corp>



The most appropriate choice depends so much on the logical and physical constraints. In the case of "get rid of a lot of very old history, but spread the job out over time" there can even be an argument for a simple repeated tablescan with "delete where rownum <= {X}; commit; wait a bit and repeat", because even though you keep tablescanning the stuff that no longer needs deleting, you're doing block cleanout on the blocks that changed on the last pass.
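
A minimal PL/SQL sketch of that loop (the chunk size, sleep time, and predicate are placeholder assumptions, not from the original mail):

    begin
      loop
        delete from big_table b
         where not exists (select 1 from small_table s where s.id = b.id)
           and rownum <= 10000;               -- {X}: chunk size is an assumption
        exit when sql%rowcount = 0;           -- nothing left to delete
        commit;
        dbms_lock.sleep(5);                   -- "wait a bit" (needs execute on dbms_lock)
      end loop;
      commit;
    end;
    /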

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: Martin Berger [martin.a.berger_at_gmail.com]
Sent: 25 August 2016 08:13
To: Jonathan Lewis
Cc: JDunn_at_sefas.com; Oracle-L Freelists; Andrew Kerber; Chris Taylor
Subject: RE: Deletion from large table

Yes,
The idea was to commit after each deletion. If required, wait some time or put it into a job to spread the load from a one-time punch into a smooth distribution.
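
A sketch of the job approach (the job name, chunk size, and interval are illustrative assumptions; dbms_scheduler is the real API):

    begin
      dbms_scheduler.create_job(
        job_name        => 'PURGE_BIG_TABLE',           -- hypothetical name
        job_type        => 'PLSQL_BLOCK',
        job_action      => 'begin
                              delete from big_table b
                               where not exists
                                     (select 1 from small_table s where s.id = b.id)
                                 and rownum <= 10000;
                              commit;
                            end;',
        repeat_interval => 'freq=minutely; interval=5', -- spread the load over time
        enabled         => true);
    end;
    /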

Martin


On 25.08.2016 at 09:09, "Jonathan Lewis" <jonathan_at_jlcomp.demon.co.uk> wrote:

Are you going to commit between each delete?

If not, there's the read-consistency problem - maybe addressable by modifying the code and using AS OF SCN. If you are committing between deletes, then that's exactly what dbms_parallel_execute can do for you.
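
In outline, a dbms_parallel_execute version might look like this (task name, chunk size, and degree are illustrative assumptions; the :start_id/:end_id binds are the documented convention for rowid chunking):

    begin
      dbms_parallel_execute.create_task('PURGE_BIG_TABLE');
      dbms_parallel_execute.create_chunks_by_rowid(
        task_name   => 'PURGE_BIG_TABLE',
        table_owner => user,
        table_name  => 'BIG_TABLE',
        by_row      => true,
        chunk_size  => 10000);
      dbms_parallel_execute.run_task(
        task_name      => 'PURGE_BIG_TABLE',
        sql_stmt       => 'delete from big_table b
                            where b.rowid between :start_id and :end_id
                              and not exists
                                  (select 1 from small_table s where s.id = b.id)',
        language_flag  => dbms_sql.native,
        parallel_level => 4);                 -- commits per chunk, restartable
      dbms_parallel_execute.drop_task('PURGE_BIG_TABLE');
    end;
    /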

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: Martin Berger [martin.a.berger_at_gmail.com]
Sent: 25 August 2016 08:04
To: Andrew Kerber
Cc: Jonathan Lewis; JDunn_at_sefas.com; Chris Taylor; oracle-l_at_freelists.org
Subject: Re: Deletion from large table

In case the single transaction is "too big" for whatever reason and has to be split into chunks (WHERE ROWNUM < 1000000), does anyone have experience of whether it makes sense to "cluster" the deletes by block?

    delete from big_table b
     where b.rowid in
           (select rid
              from (select b2.rowid rid
                      from big_table b2
                     where not exists
                           (select 1 from small_table s where s.id = b2.id)
                     order by b2.rowid)
             where rownum < 1000000)

The idea is to modify only a few table blocks per transaction, and as the freed space is clustered as well, new inserts will (hopefully) go into the same or adjacent blocks.

I don't recommend doing so; it's more a question about findings, side-effects, ...

Martin

2016-08-23 16:57 GMT+02:00 Andrew Kerber <andrew.kerber_at_gmail.com>:

I have generally had good performance with syntax like this:

delete from big_table where id in (select big_table.id from big_table, small_table where big_table.id = small_table.id (+) and small_table.id is null)

On Tue, Aug 23, 2016 at 9:38 AM, Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk> wrote:

Best access path does vary with circumstances.

If you're expecting lots of inserts while doing the deletes, you may find that as the delete progresses the rate slows down and the volume of undo applied for read-consistency climbs. If you see that as a problem, it may be that finding an index that lets you walk the big table in reverse order of data arrival will (slightly counter-intuitively) improve performance.
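
A sketch of what that might look like, assuming a hypothetical created_date column (with a supporting index) that reflects arrival order:

    delete from big_table b
     where b.rowid in
           (select rid
              from (select b2.rowid rid
                      from big_table b2
                     where not exists
                           (select 1 from small_table s where s.id = b2.id)
                     order by b2.created_date desc)    -- newest data first
             where rownum <= 100000)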

Under any circumstances deleting by tablescan and deleting by index range scan behave differently with respect to index maintenance (this note on big updates also applies to big deletes: https://jonathanlewis.wordpress.com/2006/11/22/tuning-updates/ ).

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of John Dunn [JDunn_at_sefas.com]
Sent: 23 August 2016 14:39
To: Chris Taylor
Cc: oracle-l_at_freelists.org
Subject: RE: Deletion from large table

Unfortunately it’s a nightly thing… whilst updates are still going on…

John

From: Chris Taylor [mailto:christopherdtaylor1994_at_gmail.com]
Sent: 23 August 2016 14:38
To: John Dunn
Cc: oracle-l_at_freelists.org
Subject: Re: Deletion from large table

Is this a one time thing, or a regularly occurring thing? (A one time data cleanup versus a nightly routine)

If it's a one-time data cleanup (or rarely needed), I'd recommend saving off the rows you want to keep into another table, truncating big_table, and reloading the rows from the temporary table you created.
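
A sketch of that approach (the keep-table name is an assumption, and indexes, constraints, grants, and triggers on big_table would need separate handling):

    create table big_table_keep as
      select b.*
        from big_table b
       where exists (select 1 from small_table s where s.id = b.id);

    truncate table big_table;

    insert /*+ append */ into big_table
      select * from big_table_keep;
    commit;

    drop table big_table_keep;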

Delete is one of the (if not THE) most expensive single operations you can run in a database (I'm sure you're aware of that, but I wanted to mention it).

Chris

On Tue, Aug 23, 2016 at 5:17 AM, John Dunn <JDunn_at_sefas.com> wrote:

I need to delete large numbers of rows from a large table based upon whether a record exists in a small table.

I am currently using :

            delete from big_table b where not exists (select 1 from small_table s where s.id = b.id)

big_table may have up to 100,000 rows for the same id value; small_table will only have one row per id value.

Is there a better way to code this?

John

--
Andrew W. Kerber

'If at first you don't succeed, don't take up skydiving.'

--

Martin Berger           martin.a.berger_at_gmail.com
                                  +43 660 2978929

_at_martinberx (https://twitter.com/martinberx) http://berxblog.blogspot.com
--
http://www.freelists.org/webpage/oracle-l
Received on Thu Aug 25 2016 - 09:32:07 CEST
