Re: Deleting Large Numbers of Rows w/o Filling up my redo logs
Date: Sun, 21 Dec 2008 10:35:53 -0800 (PST)
On Dec 20, 11:30 am, Pat <pat.ca..._at_service-now.com> wrote:
> I'm managing a server that spins about a million rows of logging
> information a day. This is up from about 200k rows a year ago, or
> about 400k rows last summer.
> Since we didn't want the log table to grow to absurd sizes, we like to
> only keep about 7 days worth of logs. So we've got a nightly job that
> basically does:
> begin transaction
> delete from log_table where timestamp < 7 days ago
> commit transaction
> Historically, this worked fine, although, for obvious reasons, the
> disks spun like rabid badgers while that job was running.
> Recently, as the logs have gotten larger though, the cleanup job has
> been putting so much stuff in the redo logs that the rest of the
> activity on the server is grinding to a halt.
> I changed the job recently to do something along the lines of:
> for i in reverse 7..30 loop
>   for j in reverse 1..24 loop
>     delete from log where "stamp" < sysdate - i - j/24;
>     commit;
>   end loop;
> end loop;
> So I'm deleting one hour's worth of log at a time, or maybe 50k rows
> instead of 1M.
> In test, that seemed to work better e.g. redo logs seemed healthy and
> I got lots of little commits instead of one mega commit.
> My questions though are:
> 1) I'm putting lots of small transactions into the redo logs instead
> of one mega transaction, but I'm still generating the same absolute
> amount of redo, aren't I? Am I just going to move the problem, so
> that the redo logs will be full of lots of little commits and the log
> writer won't be able to spool them off to the archive logs quickly?
> 2) Is there a "better" way to delete a big chunk of rows? If I could
> take the database out of production I know the usual tricks e.g. copy
> the rows I want to keep into a new table, then drop the old one, but
> this is a live table that I can't mess with that way.
> Basically, how are other people deleting large numbers of rows?
Once you get the log down to 7 days of data, your nightly task should only be removing about 1 million rows. Depending on the row length, that does not seem like it should be too much of an issue.

Delete statements generate undo, and undo changes in turn generate redo records, so a delete is considerably more expensive than an insert or your average update statement from a redo logging point of view. Compare the total volume of data being deleted (avg_row_len x number of rows) to the size of your online redo logs, and make sure the logs are large enough that they do not need to switch too frequently while the delete runs. Remember to add in the undo space overhead for both the table and index undo data.
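As a rough way to do that comparison, you can pull the table statistics and the online redo log sizes from the dictionary. This is only a sketch: it assumes your table is literally named LOG_TABLE, that its optimizer statistics are reasonably current (avg_row_len comes from the last ANALYZE/DBMS_STATS run), and that you have SELECT access to the V$ views.

```sql
-- Approximate volume of data the nightly delete will touch
-- (avg_row_len is in bytes; num_rows/avg_row_len come from statistics)
select avg_row_len * num_rows / 1024 / 1024 as approx_mb
  from user_tables
 where table_name = 'LOG_TABLE';   -- assumed table name

-- Size of each online redo log group, for comparison
select group#, bytes / 1024 / 1024 as log_mb
  from v$log
 order by group#;
```

If the delete volume (plus the undo overhead mentioned above) is several times the size of a log group, you can expect frequent log switches while the job runs.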
Some large transactions can gain a performance benefit from being broken up into smaller transactions, since this can mean the transaction itself requires fewer reads of undo information to construct consistent views of the data. I would, however, start testing with a large transaction size, committing perhaps every 250,000 rows, then try every 100,000 rows, and so on, to see whether the change has a positive effect.
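A common way to structure that kind of batched delete is a loop that removes a fixed number of rows per pass and commits between passes. The following is only a sketch, assuming the same hypothetical LOG_TABLE and "stamp" column from your post; adjust the batch size to whatever your testing shows works best.

```sql
-- Batched delete: remove up to 250,000 rows per pass, commit each pass
begin
  loop
    delete from log_table
     where "stamp" < sysdate - 7
       and rownum <= 250000;       -- batch size to tune in testing
    exit when sql%rowcount = 0;    -- stop when nothing left to delete
    commit;
  end loop;
  commit;
end;
/
```

This keeps each transaction a predictable size regardless of how far behind the cleanup has fallen, which is easier on undo than one open-ended delete.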
Whether you delete a million rows in one transaction or in ten should not make much difference to the total amount of redo generated. Committing after every row, on the other hand, would actually increase the redo generated, since all of those commits must be logged as well.
The more I think about the issue, the more I think you should take a careful look at your undo statistics while this job is running.
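V$UNDOSTAT is a reasonable place to start looking: it records undo activity in ten-minute buckets, so you can see how many undo blocks and transactions the cleanup window consumes compared with the rest of the day. A sketch of the sort of query I mean (column choice here is illustrative):

```sql
-- Undo consumption per 10-minute interval; look at the rows
-- covering the cleanup job's window
select begin_time,
       end_time,
       undoblks,      -- undo blocks consumed in the interval
       txncount,      -- transactions executed in the interval
       maxquerylen    -- longest running query (seconds), for retention sizing
  from v$undostat
 order by begin_time;
```

If the job's intervals show undo block consumption far above the daily baseline, that supports sizing the undo tablespace and batch size around it.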
HTH -- Mark D Powell -- Received on Sun Dec 21 2008 - 12:35:53 CST