Home » SQL & PL/SQL » SQL & PL/SQL » Deadlock on delete from same table
Deadlock on delete from same table [message #214981] |
Thu, 18 January 2007 17:41  |
smartin
Messages: 1803 Registered: March 2005 Location: Jacksonville, Florida
|
Senior Member |
|
|
I'm troubleshooting a deadlock situation.
There are two jobs that both delete records from the same table, for example: delete from x where rownum <= 10000. Both of these jobs are scheduled to run at the top of every hour, and both delete with the same exact statement (including same rownum limit).
This table in question has no indexes, no constraints, no updates. It is a standalone table. Also, it is a live table with records added to it one at a time by countless other sessions at the same time that deletes can fire.
I'm thinking these two deletes must be interfering with each other, and indeed one of them shows up in the trace file for the deadlock. I'm trying to prove this by simulating a test case, but am failing to reproduce it.
Can someone show me how to simulate a deadlock by deleting from the same table with two sessions?
Attached is my (apparently poor) attempt.
It should also be noted that this table in questions, which has rows constantly added to it, is populated via a trigger. So presumably the deadlock could be caused not by the two deletes conflicting with each other, but by one of the deletes conflicting with the poupulation of the table. I have not yet included the triggers in my simulation attempt.
[Updated on: Thu, 18 January 2007 17:43] Report message to a moderator
|
|
|
Re: Deadlock on delete from same table [message #214990 is a reply to message #214981] |
Thu, 18 January 2007 20:47   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
It seems unlikely. Both jobs are going to read the table in the same sequence. So, assuming Job1 does not commit before job2 starts, they will try to delete the same rows.
Each job will try to lock Row 1 first, one of them will get the lock, the other will wait. This means that job1 will be free to lock the other 49999 rows. Once job1 commits, job2 should be free to delete the next 50000.
The only way this could go wrong is if they selected in a different order. This is not impossible of course, just highly unlikely.
Post the deadlock trace.
Or, just serialise the jobs with DBMS_LOCK:
DECLARE
lHandle VARCHAR2(128);
lStatus INTEGER;
BEGIN
dbms_lock.allocate_unique('MY_LOCK_NAME', lHandle);
lStatus := dbms_lock.request(lHandle, release_on_commit=>TRUE);
IF lStatus <> 0 THEN
raise_application_error (-20000, 'DBMS_LOCK.REQUEST returned with error status ' || lStatus);
END IF;
do_delete();
commit;
END;
The advantage of this over LOCK TABLE IN EXCLUSIVE MODE is that it allows other non-delete processes to continue to function. Of course, if it is these other processes causing the deadlock, you're back to square 1.
Ross Leishman
|
|
|
Re: Deadlock on delete from same table [message #215009 is a reply to message #214990] |
Thu, 18 January 2007 22:28   |
smartin
Messages: 1803 Registered: March 2005 Location: Jacksonville, Florida
|
Senior Member |
|
|
Hey Ross, thanks for the thoughts.
At the moment I'm not looking at how to prevent the problem (or in a position to do so just yet). I'm in a situation where it would best to identify and prove exactly what the issue is.
> It seems unlikely. Both jobs are going to read the table in the same sequence.
That was my thought as well. But I was wondering if the inserts that are happening in separate transactions at the same time, and filling in gaps in the blocks, would get in there and cause the order to be different depending on the exact sequence.
I still haven't come up with a definitive argument/reason for myself as to why, hence my attempt to construct a test case.
Is delete from table where rownum <= x necessarily in a guaranteed order? I would think not, as select * from table where rownum <= x is not, unless there is an explicit order by.
A roadblock I'm facing is that I don't see anything else anywhere in the code that has even a chance of causing a deadlock. Nothing ever selects or updates the table. These two jobs are the only thing that delete from it.
The only thing that inserts into it is a trigger on a source table (this table is kind of like an audit table). There is no PK or any other constraint or index. It is a standard heap table.
As far as the deletes, their intention was to drain the table completely. However, it is one of these delete it in a bunch of chunks situations, where they loop through deleting so many rows at a time until the table is empty.
One other bit of info is that this is happening in an 8i database. I'm not in a position yet (but hopefully will be by the end of tomorrow) to actually log into the database in question (or any other 8i db) and perhaps run a test case. So my testing attempts have been on my 10.2 pc db (which, come to think of it, I should do my 10g test in a different tablespace, one that doesn't use ASSM just to rule that difference out).
Another roadblock is that I have not been given the full deadlock trace file, just a snippet from it. I'm not sure if I'll be allowed to see the full file or not. However, the delete function does have a log routine of some kind which I hopefully will get to check out the results of tomorrow (I'm writing this at 11pm my time).
Any and all thoughts / guesses / suggestions / scenarios / test cases / references are certainly welcome.
|
|
|
Re: Deadlock on delete from same table [message #215022 is a reply to message #215009] |
Fri, 19 January 2007 00:21   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
The reason I wanted to see the deadlock trace is because I don't think you are experiencing a "normal" deadlock. ie. Not based on exclusive row locks.
I suspect you have an Enqueue deadlock. If you look at the deadlock trace for a row-exclusive deadlock, both the holder and the waiter are of the same type (RX, or SSX, or something). Enqueue deadlocks usually manifest as the holder having a different type of lock (SX?). Don't quote me on these lock types.
My DBA tried to explain enqueues to me, and I was left scratching my head a bit, so God knows how you're going to go if you don't already understand them.
The way I rationalised his explanation in my brain (which may have no correlation to real-life) is that Enqueues are like tickets at the deli - you can't get served without a ticket. Similarly, Oracle won't update a block unless you have a ticket (enqueue). But the more you update, the more tickets you need, and they are only released upon commit.
If several high-DML processes are running concurrently, and you run out of Enqueues. Job1 puts a wait on one of Job2's enqueues, and Job2 does the same to Job1. Deadlock.
I know of no way to avoid this, other than getting the DBA to increase the pool of enqueues.
Ross Leishman
|
|
|
|
Re: Deadlock on delete from same table [message #215220 is a reply to message #215176] |
Fri, 19 January 2007 21:24   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Thank-you - a practical demonstration of exactly how poorly I understood the problem. 
Point is, there are many different types of enqueues. I might be talking about one type and suffering from a problem with another type.
I cannot give you my blood oath that the problem I had was an enqueue deadlock - I am relying on my interpretation of the diagnosis from the DBA.
Actually I think the deadlock was somewhat more subtle than I indicated. It may have been row-enqueue, rather than enqueue-enqueue as I implied above.
But here's what I can absolutely rock-solid, mother's-grave, guarantee you: the key was in the lock-type profile of the deadlock. The combination of lock types was unique! I tried to replicate it with row updates, foreign key updates, lock table in XXX mode, select for update, and every combination of the above; NOTHING produced the same signiture as the deadlock that happened in the application. My proof of this is that the DBA actually helped me (it was a battle)! Have you EVER met a DBA who would give serious consideration to a deadlock other than to say it was "an application problem"? Exactly! Only when faced with a sheaf of printouts of every conceivable application deadlock was he convinced that mine was different. He increased the enqueues (whatever that is!?) and the problem went away.
Ross Leishman
|
|
|
|
Re: Deadlock on delete from same table [message #215223 is a reply to message #215221] |
Fri, 19 January 2007 22:08   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
Quote: | I'll stop beating this dead horse with this post.
I just visited asktom.oracle.com & did a keyword search on "enqueue deadlock" & got the following response:
No questions found matching search criteria.
|
How condescending!
In the same vain, I did a search - in Google. Top hit was this.
I clearly don't know everything, and neither does AskTom. I wonder if we're not the only ones?
This link suffers from exactly the same lock-profile I was talking about: X blockers and S waiters.
Ross Leishman
|
|
|
Re: Deadlock on delete from same table [message #215231 is a reply to message #214981] |
Fri, 19 January 2007 23:38   |
smartin
Messages: 1803 Registered: March 2005 Location: Jacksonville, Florida
|
Senior Member |
|
|
As an update on my deadlock issue, I was not able to gain access to the deadlock trace file. It had apparently been deleted very shortly after generation 
Also, I haven't made any further progress on my full blown simulation. However, I did learn that it is quite easy to achieve a deadlock by deleting from the same table.
My example is at work, but just create a table with some rows, delete where rownum <= something, do not commit or rollback, then start an autonomous transaction and issue a similar delete to the table. Kaboom.
I also learned in reading that ITL issues were more pronounced (but also easier to simulate) in earlier versions, such as 8i, than they are now in 10g. In fact, you can't even specify maxtrans on a table now (well you can, but alert log says it is ignored and dba_tables shows 255).
Oh and I also learned that at the time the problem happened, there was an unusual process running on the DB that generated roughly 100 (yes, hundred) times the normal activity.
|
|
|
Re: Deadlock on delete from same table [message #215310 is a reply to message #215231] |
Sat, 20 January 2007 19:43   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
smartin wrote on Sat, 20 January 2007 16:38 | My example is at work, but just create a table with some rows, delete where rownum <= something, do not commit or rollback, then start an autonomous transaction and issue a similar delete to the table. Kaboom.
|
Huh! I would have bet against that. Deadlock means both sessions are waiting, but session 1 has completed its transactions - it is just uncommitted. There's probably an explanation for it in Autonomous Transactions. I should read-up on them, but I just can't find a need for them in Data Warehousing.
You are spot on with the ITL problem - it was in 7.3 that it happened to me.
Ross Leishman
|
|
|
Re: Deadlock on delete from same table [message #247517 is a reply to message #215310] |
Tue, 26 June 2007 03:34   |
rleishman
Messages: 3728 Registered: October 2005 Location: Melbourne, Australia
|
Senior Member |
|
|
I know it's a bit late, but one of my fav. bloggers has just created an excellent post on non-standard deadlocks that I believe explains the ITL problem I inadequately described above.
See here.
Ross Leishman
|
|
|
|
Goto Forum:
Current Time: Mon Jul 21 09:45:59 CDT 2025
|