Re: blocking enq-ss wait events
Date: Thu, 6 May 2021 23:44:09 +0530
Message-ID: <CAKna9Vanc-7ckSaDqsAdR9MrFcq=UTq_Cn8GxRbZ0T3DuxO_Qw_at_mail.gmail.com>
Thanks a lot. Here is what happened; I am still struggling to understand
how these events are logically related. We killed one long-running
transaction because it had been reading from UNDO for a long time (~8+ hrs).
That triggered a rollback that kept going, and we saw the wait event
"wait for a undo record" from multiple SYS sessions (most probably SMON
doing the cleanup with multiple parallel slaves). Up to that point we were
fine: other application queries and sessions were running normally, and the
rollback was not blocking anyone.
While that was going on, the infra team carried out another planned activity
that required the database to be rebooted. They did that and brought the
database back online. After this we started seeing the same "wait for a
undo record" wait event and assumed that SMON was resuming its
rollback/cleanup, which should not impact other application queries. But
then "enq: SS - contention" suddenly appeared for multiple application
sessions; the blocking session was waiting on "sort segment request", and
those application queries were simply stuck.
I am trying to understand why, even though the database shut down and
started up cleanly, SMON started the old rollback/cleanup again. And
why, after the DB reboot, was an application session stuck on "sort segment
request", causing other sessions to hang on "enq: SS - contention"? How is
that related to the big rollback?
Regards
Lok
On Thu, May 6, 2021 at 7:02 PM Chris Taylor <
christopherdtaylor1994_at_gmail.com> wrote:
> No, killing the recovery process isn't a great idea, as I'm fairly certain
> SMON is involved which is a critical component of the database operation.
> Kill it, you kill the instance.
>
> Are your CPUs on the server very busy, with very low idle?
>
> Find the SID & SERIAL# for the SMON process and query GV$PX_SESSION for
> QCSID=<sid of SMON> and QCSERIAL# = <serial# of SMON> to see how many
> parallel server processes it's using.
>
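A minimal sketch of the check described above, written in Python against a
generic DB-API connection (for example one opened with python-oracledb, which
is an assumption; the thread does not say which client is in use). The function
name smon_px_slaves and the lookup of SMON through the PROGRAM column of
GV$SESSION are illustrative choices, not something taken from the thread or
from the Oracle note. It finds the SID and SERIAL# of SMON, then lists the
GV$PX_SESSION rows whose QCSID and QCSERIAL# match.

```python
# Sketch only. Assumes a DB-API connection with SELECT access to the GV$ views.

# Assumption: SMON shows up as a background session whose PROGRAM ends in "(SMON)".
SMON_SESSION_SQL = """
SELECT s.inst_id, s.sid, s.serial#
  FROM gv$session s
 WHERE s.type = 'BACKGROUND'
   AND s.program LIKE '%(SMON)%'
"""

# Parallel server sessions whose query coordinator is the given SID/SERIAL#.
PX_SLAVES_SQL = """
SELECT px.inst_id, px.sid, px.serial#
  FROM gv$px_session px
 WHERE px.qcsid = :qcsid
   AND px.qcserial# = :qcserial
"""


def smon_px_slaves(conn):
    """Map each SMON session (inst_id, sid, serial#) to its parallel slave rows."""
    cur = conn.cursor()
    cur.execute(SMON_SESSION_SQL)
    smon_sessions = cur.fetchall()

    slaves = {}
    for inst_id, sid, serial in smon_sessions:
        # Named binds are passed as a dict, one query per SMON session.
        cur.execute(PX_SLAVES_SQL, {"qcsid": sid, "qcserial": serial})
        slaves[(inst_id, sid, serial)] = cur.fetchall()
    return slaves
```

The number of rows returned for each SMON session is the number of parallel
server processes it is using at that moment, so len() of each value gives the
count asked about above.
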
> If you kill (or have to restart) the database, be aware that the startup of
> the database will "pause" while SMON cleans up that dead transaction before
> it opens the database (most likely) but SMON should do that in parallel and
> might be your best bet to get the database back to normal operating
> procedure. Though it might take a while for SMON to finish and thus delay
> the opening of the database which you can monitor by tailing the alert log
> file if you do kill & restart the database instance.
>
> Chris
>
>
> On Thu, May 6, 2021 at 1:14 AM Lok P <loknath.73_at_gmail.com> wrote:
>
>> Thank you. It seems to match our symptoms (big rollback causing
>> enq-SS contention).
>>
>> We were thinking about killing the system processes that are trying to
>> perform the rollback, as we don't need that job anymore. Is that safe?
>> Or do we have to either wait for it to finish, letting new transactions
>> move on by increasing pga_aggregate_target as suggested in the note, or
>> else drop and recreate the temp tablespace?
>>
>> On Thu, May 6, 2021 at 10:13 AM Chris Taylor <
>> christopherdtaylor1994_at_gmail.com> wrote:
>>
>>> I had to look up ENQ-SS on Oracle support. That's sort segment
>>> contention, usually encountered when SMON is really busy cleaning up a dead
>>> / killed transaction.
>>>
>>> Oracle support has a few notes on this, but the most applicable seem to
>>> stop at 11.1.
>>>
>>> There is one note that says to try increasing PGA_AGGREGATE_TARGET:
>>> SS Sort Segment Enqueue: 'enq: SS - contention' (Doc ID 2601825.1)
>>>
>>> Chris
>>>
>>> On Wed, May 5, 2021 at 11:56 PM Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Hi All, we need some help. We killed one of the long-running sessions,
>>>> which had been running for ~8+ hrs. After we killed it we saw a lot of
>>>> "wait for a undo record" wait events, but we ignored them, assuming they
>>>> would continue for some time while the rollback ran. But now we are
>>>> suddenly seeing the "enq-SS contention" wait event in addition to "wait
>>>> for a undo record", and that is blocking all other application queries.
>>>> How should we mitigate this issue?
>>>>
>>>> The version is 11.2.0.4 on Exadata.
>>>>
>>>> Regards
>>>> Lok
>>>>
>>>
-- http://www.freelists.org/webpage/oracle-l
Received on Thu May 06 2021 - 20:14:09 CEST