Re: Sudden occurrence of SHARED_POOL LATCH waits - DB up since 4/3/2018

From: Chris Taylor <christopherdtaylor1994_at_gmail.com>
Date: Thu, 31 May 2018 13:41:01 -0500
Message-ID: <CAP79kiQFd=S1TJ8ALbtpSbf8LDu-i1fVMU6QQFBJYEzBXMAcSQ_at_mail.gmail.com>



Nah, it DOES seem exceedingly large, but you should see the transaction rates in this monster.
And it probably is too large, but I haven't had time to dig into what a more reasonable size would be without negatively impacting customers.

Chris

On Thu, May 31, 2018 at 12:39 PM, Dominic Brooks <dombrooks_at_hotmail.com> wrote:

> Am I out of touch or is that an exceedingly large shared pool?
>
> Sent from my iPhone
>
> On 31 May 2018, at 18:05, Chris Taylor <christopherdtaylor1994_at_gmail.com>
> wrote:
>
> version 12.1
> Linux x86-64
>
> Event                Waits     Total Wait Time (sec)  Wait Avg(ms)  % DB time  Wait Class
> latch: shared pool   402,269   103.4K                 256.95        29.5       Concurrency
> DB CPU                         100.3K                               28.6
> latch free           362,927   42.2K                  116.26        12.0       Other
> Background:
> We've got a database with 4TB of RAM and a 260GB shared pool. Everything
> was running normally until 2 nights ago when a regularly scheduled job that
> does TONS of DDL suddenly blew up the shared pool.
>
> Trying to query x$ksmsp for shared pool sizes is also causing huge amounts
> of latch waits.
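>
> (A much lighter-weight sanity check, if you only need the rough free/used
> breakdown rather than chunk-level detail, is v$sgastat -- just a sketch,
> nothing site-specific in it:)
>
> select * from (
>        select pool, name, round(bytes/1024/1024) mb
>        from   v$sgastat
>        where  pool = 'shared pool'
>        order  by bytes desc
>        )
> where  rownum <= 20;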
>
> When these jobs aren't running, the db is working normally. Starting just
> one of the jobs brings back the shared pool latching problem immediately.
>
> I'm thinking the shared pool is heavily fragmented, but I'm looking for
> additional queries or views I can use to diagnose shared pool problems.
> We're not hitting any shared pool memory allocation errors.
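>
> (Two lower-impact views that might help narrow it down -- a hedged sketch,
> both are standard on 12.1: v$latch_children shows which shared pool child
> latches are taking the sleeps, and v$shared_pool_advice shows what the
> library cache thinks of the current size:)
>
> select child#, gets, misses, sleeps, wait_time
> from   v$latch_children
> where  name = 'shared pool'
> order  by sleeps desc;
>
> select shared_pool_size_for_estimate, shared_pool_size_factor,
>        estd_lc_time_saved_factor, estd_lc_load_time
> from   v$shared_pool_advice;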
>
> This query hangs and causes huge contention:
>
> select ksmchcls "ChnkClass",
> sum(ksmchsiz) "SumChunkTypeMem",
> Max(ksmchsiz) "LargstChkofThisTyp",
> count(1) "NumOfChksThisType",
> round((sum(ksmchsiz)/tot_sp_mem.totspmem),2)*100||'%' "PctTotSPMem"
> from x$ksmsp,
> (select sum(ksmchsiz) TotSPMem from x$ksmsp) tot_sp_mem
> group by ksmchcls, tot_sp_mem.TotSPMem
> order by 5;
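>
> (Side note: the inline view there walks x$ksmsp a second time under the same
> latches. A single-pass version using ratio_to_report should be a bit kinder --
> still not something to run casually against a 260GB pool, just a sketch:)
>
> select ksmchcls "ChnkClass",
>        sum(ksmchsiz) "SumChunkTypeMem",
>        max(ksmchsiz) "LargstChkofThisTyp",
>        count(*) "NumOfChksThisType",
>        round(ratio_to_report(sum(ksmchsiz)) over (), 2) * 100 || '%' "PctTotSPMem"
> from   x$ksmsp
> group  by ksmchcls
> order  by 5;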
>
>
> This query runs fine and seems to indicate there is plenty of reserve
> space:
>
> select free_space, round(avg_free_size,2) as avg_free_size,free_count,max_free_size,used_space,
> round(avg_used_size,2) as avg_used_size, used_count, max_used_size,
> requests, request_misses, last_miss_size,
> request_failures, last_failure_size, aborted_request_threshold,
> aborted_requests
> from v$shared_pool_reserved
> /
>
> FREE_SPACE                 13435398168
> AVG_FREE_SIZE              236538.7
> FREE_COUNT                 16023
> MAX_FREE_SIZE              27316040
> USED_SPACE                 319665968
> AVG_USED_SIZE              5627.92
> USED_COUNT                 40777
> MAX_USED_SIZE              1048552
> REQUESTS                   10903867
> REQUEST_MISSES             0
> LAST_MISS_SIZE             0
> REQUEST_FAILURES           0
> LAST_FAILURE_SIZE          0
> ABORTED_REQUEST_THRESHOLD  2147483647
> ABORTED_REQUESTS           0
>
> The Jobs in question do PARTITION EXCHANGES for many, many sites with temp
> tables that are built as part of the job.
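>
> (For anyone not familiar with the pattern, each cycle is roughly the DDL
> below -- table and partition names are made up for illustration; the point is
> that every create/exchange/drop invalidates cursors and churns library cache
> memory:)
>
> -- hypothetical shape of one site's refresh
> create table stage_site_x as select * from site_data where 1 = 0;
> -- ... load stage_site_x ...
> alter table site_data exchange partition p_site_x
>       with table stage_site_x including indexes without validation;
> drop table stage_site_x purge;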
>
> The DB has been up since 04/03/2018 07:12:23; the last 2 days are the first
> time this has been seen, and it's repeatable by restarting any one of the
> jobs (there are 8 total).
>
> Thinking about flushing the shared pool (or restarting the db off hours) as a
> test to see if it resolves the issue (like throwing a grenade into an anthill
> resolves a problem with ants... not the best approach, but it might be
> effective).
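>
> (If we go the grenade route, it's just:)
>
> alter system flush shared_pool;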
>
> Any thoughts/comments/insults?
>
> Thanks,
> Chris
>
>
>

--
http://www.freelists.org/webpage/oracle-l