Re: INSERT statement restarts internally

From: William Robertson <william_at_williamrobertson.net>
Date: Wed, 5 Aug 2020 10:19:34 +0100
Message-Id: <0EB76676-7CB8-4CC3-A425-1B272A075842_at_williamrobertson.net>



Well, that’s the kind of smoking gun I am looking for. The statement involved half a dozen tables, mostly partitioned, some subpartitioned, so I was hoping for a quick way to track it down, but I suppose the timestamps in dba_objects and dba_tab_stats_history might be a start.

No automatic/interval partitioning, no triggers, I’d be surprised if there were force-invalidate stats operations (six of them) but you never know, and this insert was serial and conventional.

Thanks for the links - I’d come across the one from Randolf Geist one before, and it describes the situation exactly. I see he wasn’t sure at the time of writing whether fine-grained cursor invalidation might prevent it in 12.2. It doesn’t. I’ll have to read the Kun Sun post a couple more times to get my head around the scenario with undo. Does it seem likely in an insert?

We also get regular ‘Object no longer exists’ errors that are similarly hard to track down. I wonder if that is a related issue.

William

On 5 Aug 2020, at 00:42, Sayan Malakshinov <xt.and.r_at_gmail.com> wrote:

Hi William,

Were there any DDL operations on that table? Drop/add? Automatic interval partitioning? Row-level 'before insert' triggers? Was it parallel insert? Undo space allocation? Some notes:
 https://oracle-randolf.blogspot.com/2016/01/dml-operations-on-partitioned-tables.html <https://oracle-randolf.blogspot.com/2016/01/dml-operations-on-partitioned-tables.html> http://ksun-oracle.blogspot.com/2011/05/update-restart-and-new-undo-extent.html <http://ksun-oracle.blogspot.com/2011/05/update-restart-and-new-undo-extent.html>

On Wed, Aug 5, 2020 at 1:29 AM <dimensional.dba_at_comcast.net <mailto:dimensional.dba_at_comcast.net>> wrote: While it is running, what is it waiting on?

-----Original Message-----

From: oracle-l-bounce_at_freelists.org <mailto:oracle-l-bounce_at_freelists.org> <oracle-l-bounce_at_freelists.org <mailto:oracle-l-bounce_at_freelists.org>> On Behalf Of William Robertson Sent: Tuesday, August 4, 2020 3:19 PM
To: oracle-l <oracle-l_at_freelists.org <mailto:oracle-l_at_freelists.org>> Subject: INSERT statement restarts internally

Hi all,

I’m trying to get to the bottom of an issue where a SQL statement (in this case an insert) that normally runs in 15 minutes suddenly takes two hours. Checking the session, it’s not blocked, the plan hasn’t changed and seems fine, all row estimates are good, there is no more data than usual, backups aren’t running (though I can’t rule out other background resource hogs, as I have somewhat limited access). Then I notice that the start time shown in SQL Monitor has changed, and going back to the session I see there have been six sql_exec_ids and sql_exec_starts for the sid/serial#/sql_id, even though the procedure only calls it once. Some of them are a couple of minutes apart, some are after 40 minutes. (With hindsight I should have looked at v$sql.object_status and also checked whether the session was rolling back, which I’m guessing it was.)

The last time we saw this, it coincided with a job that gathered stats on all partitions marked as stale, passing no_invalidate = false. We fixed that by changing it to true and the issue seemed to go away. This time though, this stats job wasn’t running and neither was anything else I could see that would have affected the same tables and partitions - all our large tables are list-partitioned by business date and many are subpartitioned by business line etc, partly to isolate batch processes. Looking at dba_active_sess_history I can see it’s happened before but sporadically.

I’m a bit stumped about what to do about it. Would this be logged anywhere (similar to a deadlock report)? I couldn’t see anything likely in v$diag_trace_file_contents but maybe I’m looking for the wrong thing. Is it the case that another session must have hard-invalidated the cursor, and I just have to find the smoking gun, or is there some other scenario like an internal failure? This is 12.2.0.1.

Thanks,

William

--

http://www.freelists.org/webpage/oracle-l Received on Wed Aug 05 2020 - 11:19:34 CEST

Original text of this message