Re: blocking enq-ss wait events

From: Lok P <loknath.73_at_gmail.com>
Date: Thu, 6 May 2021 23:44:09 +0530
Message-ID: <CAKna9Vanc-7ckSaDqsAdR9MrFcq=UTq_Cn8GxRbZ0T3DuxO_Qw_at_mail.gmail.com>



Thanks a lot. Here is what happened; I am still struggling to understand how these events are logically related. We killed one long-running transaction because it had been reading from UNDO for a long time (~8+ hrs). That triggered a rollback, which kept going, and we saw the wait event "wait for a undo record" from multiple SYS sessions (most probably SMON doing the cleanup with multiple parallel slaves). Up to that point everything was fine: other application queries and sessions were running normally, and the rollback was not blocking anyone.
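
One way to check the guess that those SYS sessions are SMON's parallel slaves is the GV$PX_SESSION lookup Chris describes in the quoted reply below. A minimal sketch, assuming SMON's session can be identified by the "(SMON)" suffix in its PROGRAM string and that its slaves are registered in GV$PX_SESSION with SMON as coordinator (neither is confirmed in this thread):

```sql
-- Sketch only. Assumptions: SMON's session shows "(SMON)" in GV$SESSION.PROGRAM,
-- and its recovery slaves appear in GV$PX_SESSION with SMON as the coordinator.
SELECT px.inst_id,
       px.sid,
       px.serial#,
       px.degree,
       px.req_degree
FROM   gv$session s
       JOIN gv$px_session px
         ON  px.qcinst_id = s.inst_id   -- instance where the coordinator runs
         AND px.qcsid     = s.sid       -- <sid of SMON>
         AND px.qcserial# = s.serial#   -- <serial# of SMON>
WHERE  s.program LIKE '%(SMON)%'
ORDER  BY px.inst_id, px.sid;
```

The number of rows returned is the number of parallel sessions attached to SMON. No rows would mean either that the rollback is running serially or that the slaves are not tracked through this view.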

While the above was going on, the infra team carried out another planned activity for which the database had to be rebooted. They did that and brought the database back online. After this we started seeing the same "wait for a undo record" wait event, and we assumed SMON was resuming its rollback/cleanup and that it should not impact other application queries. But then "enq: SS - contention" suddenly appeared for multiple application sessions; the blocking session was waiting on "sort segment request", and those application queries were simply stuck.

I am trying to understand why, even though the database was shut down and started up seamlessly, SMON reinitiated the old rollback/cleanup. And why, after the DB reboot, did the application session get stuck on "sort segment request", causing other sessions to hang on "enq: SS - contention"? How is that related to the big rollback?

Regards
Lok

On Thu, May 6, 2021 at 7:02 PM Chris Taylor < christopherdtaylor1994_at_gmail.com> wrote:

> No, killing the recovery process isn't a great idea, as I'm fairly certain
> SMON is involved, which is a critical component of the database operation.
> Kill it, you kill the instance.
>
> Are your CPUs on the server very busy, with very low idle?
>
> Find the SID & SERIAL# for the SMON process and query GV$PX_SESSION for
> QCSID = <sid of SMON> and QCSERIAL# = <serial# of SMON> to see how many
> parallel server processes it's using.
>
> If you kill it (or have to restart the database), be aware that the startup
> of the database will "pause" while SMON cleans up that dead transaction
> before it opens the database (most likely). But SMON should do that in
> parallel, and it might be your best bet to get the database back to normal
> operation. It might take a while for SMON to finish, though, and thus delay
> the opening of the database, which you can monitor by tailing the alert log
> file if you do kill & restart the database instance.
>
> Chris
>
> On Thu, May 6, 2021 at 1:14 AM Lok P <loknath.73_at_gmail.com> wrote:
>
>> Thank you. It seems to match our symptoms (big rollback causing
>> enq-ss contention).
>>
>> We were thinking about killing the system processes which are trying to
>> perform the rollback, as we don't need that job anymore. Is that safe?
>> Or do we have to either wait for it to finish, letting new transactions
>> move on by increasing pga_aggregate_target as suggested in the note,
>> or else drop and recreate the temp tablespace?
>>
>> On Thu, May 6, 2021 at 10:13 AM Chris Taylor <
>> christopherdtaylor1994_at_gmail.com> wrote:
>>
>>> I had to look up ENQ-SS on Oracle support. That's sort segment
>>> contention, usually encountered when SMON is really busy cleaning up a dead
>>> / killed transaction.
>>>
>>> Oracle support has a few notes on this, but the most applicable seem to
>>> stop at 11.1.
>>>
>>> There is one note that says to try increasing PGA_AGGREGATE_TARGET:
>>> SS Sort Segment Enqueue: 'enq: SS - contention' (Doc ID 2601825.1)
>>>
>>> Chris
>>>
>>> On Wed, May 5, 2021 at 11:56 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Hi All, need some help. We killed one of the long-running sessions,
>>>> which had been running for ~8+ hrs, but after killing it we saw a lot of
>>>> "wait for a undo record" wait events. We ignored those, thinking they
>>>> would continue for some time because of the rollback. But now we are
>>>> suddenly seeing the "enq: SS - contention" wait event in addition to
>>>> "wait for a undo record", and it is blocking all other application
>>>> queries. So we are wondering how we should mitigate this issue.
>>>>
>>>> The version is 11.2.0.4 on Exadata.
>>>>
>>>> Regards
>>>> Lok
>>>>
>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Thu May 06 2021 - 20:14:09 CEST
