Re: Looking for ideas on blocked sessions updating partitioned table with CLOB (solved)

From: Tanel Poder <tanel_at_tanelpoder.com>
Date: Tue, 18 Sep 2018 11:06:17 -0400
Message-ID: <CAMHX9JLWaEi4oT9-CmtpVnWC-WJVzAuFYf5v8cwWtZFN7wsYhw_at_mail.gmail.com>



Good troubleshooting... And that is an important starting point - don't troubleshoot what users say they *think* is happening, but just ask how do they experience the problem and then follow the chain of measurable evidence from there and go where-ever it leads you to, even if it's way different from what the friendly DBA thinks that is happening. My usual sequence goes from high level to microscopic: V$SESSION + wait events -> V$SESSTAT for extra hints & evidence -> strace / pstack. And can't even always trust Oracle's own instrumentation (sometimes it's wrong or missing), so taking the screwdriver and opening up the process up from outside (pstack, strace, dtrace/perf probe) is sometimes needed.

Btw, I wrote the ash_wait_chains scripts exactly for the "follow the chain of evidence" reasons. Have a script automatically walk through the blocker/waiter tree and report what everyone on a chain is doing, so you'd see the complex wait topology & what the ultimate blocker is doing sooner... :-)

For anyone interested, here's a "buffer busy wait" chain example: -
https://blog.tanelpoder.com/2013/11/06/diagnosing-buffer-busy-waits-with-the-ash_wait_chains-sql-script-v0-2/

And I just published videos of my ashtop.sql & ash_wait_chains.sql hacking sessions (close to 2 hours of free-form hacking :-) - https://www.youtube.com/tanelpoder

But, back to the original topic - I *still *sometimes get distracted and end up researching that fun exotic latch or mutex wait event that someone tells is the problem in their DB, only to then realize that it only took 4% of total response time and totally disappeared after they fixed the real problem of excessive CPU usage and oversubscription due to a SQL plan gone bad. I think I should hang out with Cary Millsap more!

Tanel.

On Tue, Sep 18, 2018 at 10:24 AM Chris Taylor < christopherdtaylor1994_at_gmail.com> wrote:

> So, I need to own up to something.
>
> My whole approach to this problem was wrong. I had been told that the
> problem with this table was the CLOB and the number of sessions updating
> the table at the same time. So, my whole approach was based on that
> information being TRUE.
>
> I was blinded to the 'real' problem and wasn't actively looking for an
> alternative answer until I posted this thread. Due to the discussion here,
> I took a few steps back and actually asked myself "why would this be row
> locks/blocks?" and WHY don't I see internal locking on the clob if the clob
> is the problem?
>
> At that point I was in 'question everything' mode and started really
> looking at the Event that the blocking session was sitting in (instead of
> just assuming it was a problem with design).
>
>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Sep 18 2018 - 17:06:17 CEST

Original text of this message