RE: Crazy dynamic sampling?

From: Dominic Brooks <dombrooks_at_hotmail.com>
Date: Tue, 17 Apr 2018 11:23:46 +0000
Message-ID: <DB6P190MB0501EAB9C3C7179B8E7D3A7CA1B70_at_DB6P190MB0501.EURP190.PROD.OUTLOOK.COM>



I stopped optimizer trace file after a few minutes minutes. It was executing a dynamic sampling query from 10:05:02 until I killed it at 10:08:13.

The recursive sampling sql is noticeable for having a large case statement and a large number of where clauses with 36 <> clauses and at least one ‘OR’. The <> clauses are all related to the subpartition columns

The query says “where subpartition_column_value = ‘ABC”.

Interestingly...maybe ... if I run the recursive sampling query, it takes ages in hard parse itself... if I turn optimizer_dynamic_sampling = 0, it runs in no time. Same with the main SQL.



From: Dominic Brooks <dombrooks_at_hotmail.com> Sent: Tuesday, April 17, 2018 11:24:35 AM To: Stefan Knecht
Cc: Mark W. Farnham; oracle-l-freelists
Subject: RE: Crazy dynamic sampling?

Dynamic sampling level is 4 at param level. There aren’t really any other significant non-default parameters but I can’t post this from work.



From: Stefan Knecht <knecht.stefan_at_gmail.com> Sent: Tuesday, April 17, 2018 11:03:00 AM To: Dominic Brooks
Cc: Mark W. Farnham; oracle-l-freelists
Subject: Re: Crazy dynamic sampling?

What's the dynamic sampling level used?

Could you show us the data in v$sql_optimizer_env for one of the problematic queries?

On Tue, Apr 17, 2018 at 4:49 PM, Dominic Brooks <dombrooks_at_hotmail.com<mailto:dombrooks_at_hotmail.com>> wrote: Yes – it’s not a parallel query.

When there are other sessions blocked – they are just different executions of the same query waiting for this first one to quit with the crazy hard parsing.



From: Mark W. Farnham <mwf_at_rsiz.com<mailto:mwf_at_rsiz.com>> Sent: Tuesday, April 17, 2018 10:45:56 AM To: dombrooks_at_hotmail.com<mailto:dombrooks_at_hotmail.com>; 'Stefan Knecht'; 'oracle-l-freelists'

Subject: RE: Crazy dynamic sampling?

After it finally finishes parsing, did the plan end up serial?

There is a specific bug where the children don’t get the same plan as the coordinator, and they try again and again until some limit or timeout is reached and then you get the one original plan running serially. (I was puzzled why the hash wasn’t just passed instead of re-parsing, but that’s another issue unexplored.)

Unfortunately all my notes on this were behind a secure area (can’t take pictures or forward the email, even if you could hack it you agreed not to try) and the customer throws all that away every 90 days and flatlines your PC hard drive to re-use and discards your host linux server files at the end of the project so I can’t look back and type you a specific number or combination of numbers.

Pantloads of cursor: pin S wait on X event were a feature of this bug. That was however 12.1.x. I’m not sure whether the patch to fix this or turning off adaptive plans or both were required to make the bad behavior go away. I don’t know if this can happen on 11.x.

Graham’s colleague Mihajlo Tekic probably has accurate notes on this because he helped solve it and had seen it before and Oracle was magically allowed to transport their notes wherever they liked. Twas brillig.

Possibly this behavior is already well-known, I just hadn’t seen it before.

mwf

From: oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org> [mailto:oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org>] On Behalf Of Dominic Brooks Sent: Tuesday, April 17, 2018 5:14 AM
To: Stefan Knecht; oracle-l-freelists
Subject: RE: Crazy dynamic sampling?

It’s not the subsequent mutexes that I’m bothered about. More about why the session at the head of the queue is hard parsing for minutes or hours.



From: Stefan Knecht <knecht.stefan_at_gmail.com<mailto:knecht.stefan_at_gmail.com>> Sent: Tuesday, April 17, 2018 10:09:35 AM To: Dominic Brooks; oracle-l-freelists
Subject: Re: Crazy dynamic sampling?

Not seen that on 11.2.0.4.

What mutex and location exactly are the sessions waiting on? You can hit x$mutex_sleep from the p2 in v$session for the cursor: pin S wait on X event. Some more details on how to do thatare here: WAITEVENT: "cursor: pin S wait on X" Reference Note (Doc ID 1298015.1)

Then I'd search MoS for bugs related to that mutex location.

On Tue, Apr 17, 2018 at 3:55 PM, Dominic Brooks <dombrooks_at_hotmail.com<mailto:dombrooks_at_hotmail.com>> wrote: Good question – sorry for missing that out.

11.2.0.4



From: Stefan Knecht <knecht.stefan_at_gmail.com<mailto:knecht.stefan_at_gmail.com>> Sent: Tuesday, April 17, 2018 9:54:57 AM To: Dominic Brooks; oracle-l-freelists
Subject: Re: Crazy dynamic sampling?

Are you on 12.1.0.2 ?

On Tue, Apr 17, 2018 at 3:49 PM, Dominic Brooks <dombrooks_at_hotmail.com<mailto:dombrooks_at_hotmail.com>> wrote:

In addition to invisible indexes not being so invisible, I hash partitioned some global PK indexes to reduce insert hot block contention.

These two things – invisible local indexes with some unusable subpartitions and hash partitioned PK indexes - are the only things which changed.

I’ve now got a few handfuls of statements which, in addition to some of them doing table expansion thanks to the not-so-invisible invisible indexes, are doing crazy long hard parsing – like 30 to 40 minutes. And that causes knock-on effects to other sessions in the same workflow for the same sql / objects with “cursor: pin S wait on X” and “library cache lock”

The likely culprit is dynamic sampling. I can see the sessions doing lots of IO against different table partitions as part of the hard parse.

Otherwise the health of the database is no less healthy than it was before and the SQL is the same SQL as it was before.

Whilst I wait for an optimizer trace file and some other dumps/traces, any wild theories as to why just hash partitioning a global PK might cause this?



From: Dominic Brooks <dombrooks_at_hotmail.com<mailto:dombrooks_at_hotmail.com>> Sent: Tuesday, April 10, 2018 11:09:03 AM To: oracle-l_at_freelists.org<mailto:oracle-l_at_freelists.org> Subject: Re: Invisible indexes and table expansion

Probably just covered by bug 16544878.

Sent from my iPhone

> On 10 Apr 2018, at 10:50, Dominic Brooks <dombrooks_at_hotmail.com<mailto:dombrooks_at_hotmail.com>> wrote:
>
> Just working through a problem, gathering evidence and trying to reproduce. And it’s difficult for me to get an optimiser trace file in the environment.
>
> In an 11.2.0.4 environment, I’ve got a whole bunch of queries now using table expansion - VW_TE_2.
>
> Recent change was to add three local indexes, each with a subset of partitions as usable - something which table expansion was designed to help with.
>
> But they are invisible indexes.
>
> So my theory is the invisibility is limited - ie they are visible enough to cause table expansion - but then can’t be used.
>
> Anyone looked into something similar?
>
> Cheers
> Dominic
>
> Sent from my iPhone

--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net<https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fzztat.net%2F&data=02%7C01%7C%7C091398c85ae648c000a508d5a440f1cc%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595521191902854&sdata=6op%2BmoRIfXgUZH4LUYGneptseAyK6kBWLatE1nVWtco%3D&reserved=0> | _at_zztat_oracle | fb.me/zztat<https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffb.me%2Fzztat&data=02%7C01%7C%7C091398c85ae648c000a508d5a440f1cc%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595521191902854&sdata=dFGSO9jq4Zyd2ALOYh8%2FAR%2B1uYVpKgkuLEnY4zQsO24%3D&reserved=0> | zztat.net/blog/<https://nam04.safelinks.protection.outlook.com/?url=http%3A%2F%2Fzztat.net%2Fblog%2F&data=02%7C01%7C%7C091398c85ae648c000a508d5a440f1cc%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595521191902854&sdata=2WrgIAbUdg87CtABPysLyHVrF6wxD4vEm2w5%2BSMawpg%3D&reserved=0>



--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net<https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fzztat.net%2F&data=02%7C01%7C%7Cea4467b67ff043211c7708d5a442fd23%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595529972057862&sdata=cLWgDecacRkSu0Yae1%2BXLu%2BQuGNFTblL6waQ3R3R9Ec%3D&reserved=0> | _at_zztat_oracle | fb.me/zztat<https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffb.me%2Fzztat&data=02%7C01%7C%7Cea4467b67ff043211c7708d5a442fd23%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595529972057862&sdata=8IdqBT9NYqjMolvuIwr%2FGchUMk8KPtB2IuQ8MCe8AJo%3D&reserved=0> | zztat.net/blog/<https://eur02.safelinks.protection.outlook.com/?url=http%3A%2F%2Fzztat.net%2Fblog%2F&data=02%7C01%7C%7Cea4467b67ff043211c7708d5a442fd23%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595529972057862&sdata=Wk4UCjTI5iu0t6s%2BgvZC%2F9wudcSRAdfDx8zHHVaz8KA%3D&reserved=0>



--
//
zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
Visit us at zztat.net<https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fzztat.net%2F&data=02%7C01%7C%7C444f16bbbfcc430d269308d5a44a733e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595562018048505&sdata=AnB0O8U2P%2FT4%2FP4xGvr%2BIQ4Bqi3qqQtclVuo2Xcfc5Y%3D&reserved=0> | _at_zztat_oracle | fb.me/zztat<https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Ffb.me%2Fzztat&data=02%7C01%7C%7C444f16bbbfcc430d269308d5a44a733e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595562018048505&sdata=9IB20JqVnRI6yJ0RUHMJN5wtQfd7foOAL4DUjVWHRLc%3D&reserved=0> | zztat.net/blog/<https://nam03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fzztat.net%2Fblog%2F&data=02%7C01%7C%7C444f16bbbfcc430d269308d5a44a733e%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636595562018048505&sdata=X5uPVL%2Fil%2B27eBoCDXUbap9hoDX1qqkaJF3amjlO9vY%3D&reserved=0>


--
http://www.freelists.org/webpage/oracle-l
Received on Tue Apr 17 2018 - 13:23:46 CEST

Original text of this message