Re: kkslce [KKSCHLPIN2] mutex contention on hard parse of PQ with binds

From: Stefan Knecht <knecht.stefan_at_gmail.com>
Date: Tue, 15 Sep 2015 18:29:49 +0700
Message-ID: <CAP50yQ8hKKpqcUgJtSDNi_HUcxi1aX4HMrxD0pv-bW5qtC-Bxw_at_mail.gmail.com>



I can reproduce it at will in sql*plus.

I am not seeing any rows in v$pq_tqstat after the query runs.

We're seeing many queries that suffer from the same issue - however the one I used to build a test case had the most significant waits and is the most simple of them. This is the query:

SELECT AV.ATTRIB_VALUE FROM RETVEN_ATTRIB_VALUE AV INNER JOIN ATTRIB_DIM_OWNER ADO ON AV.ATTRIB_DIM_OWNER_KEY = ADO.ATTRIB_DIM_OWNER_KEY INNER JOIN ATTRIB AX ON ADO.ATTRIB_KEY = AX.ATTRIB_KEY WHERE AV.RETAILER_VENDOR_KEY = :B2 AND AX.NAME = UPPER(:B1 ); I added hints to ensure it runs in parallel (full and parallel hints).

On Tue, Sep 15, 2015 at 5:25 PM, Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk
> wrote:

>
>
> What's the environment from which you are calling the SQL ? Is it a basic
> SQL*Plus session, or some type of front-end that might be causing the
> double parse with binds anyway ? I forget which mid-tier, or driver it was
> but there used to be common questions about seeing statement with bind
> variables being parsed twice, once with no values for the bind variable,
> then a second time because the values were only sent on a second call.
>
> Can you find any consistency in the queries which show this behaviour
> (apart from the partitioning) ? If you check the v$pq_tqstats view after
> running the queries do you see a "ranger" operation as the first operation
> ? (This might be visible as a S->P distribution into TQ10000 if all you can
> get at is execution plans).
>
> Have you tried adding hints to any of the queries that persistently show
> this problem simply to check if the restriction on the optimization phase
> bypasses the problem ?
>
>
>
>
> Regards
> Jonathan Lewis
> http://jonathanlewis.wordpress.com
> _at_jloracle
> ------------------------------
> *From:* Stefan Knecht [knecht.stefan_at_gmail.com]
> *Sent:* 15 September 2015 10:50
> *To:* mohamed.houri_at_gmail.com
> *Cc:* Jonathan Lewis; oracle-l-freelists
> *Subject:* Re: kkslce [KKSCHLPIN2] mutex contention on hard parse of PQ
> with binds
>
> So the key difference is that it's creating one more child, if we're using
> bind variables. Whereas the query with literals only has child 0.
>
> And that induces an overhead of anywhere between 1 and 3 seconds. Wow.
>
> On Tue, Sep 15, 2015 at 4:48 PM, Stefan Knecht <knecht.stefan_at_gmail.com>
> wrote:
>
>> Here's another test, without binds:
>>
>> SQL> select sql_id, child_number, executions, invalidations, parse_calls
>> from v$sql where sql_id='avgm0shx7hnmv';
>>
>> SQL_ID CHILD_NUMBER EXECUTIONS INVALIDATIONS PARSE_CALLS
>> ------------- ------------ ---------- ------------- -----------
>> avgm0shx7hnmv 0 1 0 9
>>
>>
>> We still see several parse calls. But zero "cursor: pin S wait on X"
>> waits in any of the tracefiles.
>>
>>
>>
>> On Tue, Sep 15, 2015 at 4:38 PM, Stefan Knecht <knecht.stefan_at_gmail.com>
>> wrote:
>>
>>> Mohamed
>>>
>>> This is precisely happening during a hard parse only. There's nothing to
>>> share - except between the PX slaves. The query is deliberately being hard
>>> parsed, and a new cursor is created hence v$sql_shared_cursor, etc won't be
>>> helpful in this case.
>>>
>>> Stefan
>>>
>>>
>>> On Tue, Sep 15, 2015 at 4:33 PM, Mohamed Houri <mohamed.houri_at_gmail.com>
>>> wrote:
>>>
>>>> *"cursor: pin S wait on X"* is a symptom of a session wanting a cursor
>>>> on *S*hared mode but another session is holding the same cursor in e
>>>> *X*lusive mode.
>>>>
>>>> Typically the cursor holding the eXclusive lock is hard parsing. It
>>>> might also be due to Extended Cursor Sharing kicking in and optimizing an
>>>> execution plan for each execution.This is why, as far as you are using bind
>>>> variables, I would have started by checking the reason for which your
>>>> cursor is not shared (using *nonshared.sql* script of Tanel Poder for
>>>> example)
>>>>
>>>> Best regards
>>>> Mohamed
>>>>
>>>> 2015-09-15 10:37 GMT+02:00 Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
>>>> :
>>>>
>>>>>
>>>>>
>>>>> Are the 4 out of 8 isolated to one of the two slave sets, or are they
>>>>> scattered randomly across the two slave sets. (I would suspect the former).
>>>>>
>>>>> It's probably expected behaviour (at least in some circumstances) but
>>>>> an extreme example of slow parse times. I have seen a couple of cases in
>>>>> 11g where PX slaves actually managed to get different execution plans for
>>>>> the same query at the same time because the slaves re-optimized the query -
>>>>> if this is possible, then the slaves could be in a position where
>>>>> they are attempting to re-optimize the query, but are still waiting for the
>>>>> QC to finish its parse call.
>>>>>
>>>>> Gathering related information:
>>>>>
>>>>> From v$sql - with one call to execute a new version of the statement,
>>>>> how many parse_calls does the parallel statement report.
>>>>> If you enable 10053 for a serial and a parallel version, does the
>>>>> parallel trace show much more work than the serial ? Does the 12c serial
>>>>> show much more work than the 11g serial?
>>>>> If you enable 10046 trace can you see the 4 PX slaves doing work that
>>>>> looks like re-optimization after they get the child mutex
>>>>>
>>>>> Speculative question
>>>>> Do the queries involve partitioned tables ?
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Regards
>>>>> Jonathan Lewis
>>>>> http://jonathanlewis.wordpress.com
>>>>> _at_jloracle
>>>>> ------------------------------
>>>>> *From:* oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org]
>>>>> on behalf of Stefan Knecht [knecht.stefan_at_gmail.com]
>>>>> *Sent:* 15 September 2015 09:12
>>>>> *To:* oracle-l-freelists
>>>>> *Subject:* kkslce [KKSCHLPIN2] mutex contention on hard parse of PQ
>>>>> with binds
>>>>>
>>>>> Hi all
>>>>>
>>>>> Fairly odd event. After an upgrade to 12.1.0.2 we've been seeing very
>>>>> high "cursor: pin S wait on X" waits. But here's the catch - they only
>>>>> happen under very special (and seemingly odd) circumstances:
>>>>>
>>>>> - Query runs in parallel
>>>>> - Query uses bind variables
>>>>> - Query is hard parsed
>>>>>
>>>>> If we run the same query in serial, or with literal values, or we
>>>>> re-use a cursor, no waits are seen.
>>>>>
>>>>> The waits are huge (up to 0.3 seconds) and are seen on the PX slaves
>>>>> only. Say we're using 8 slaves, 4 of them will show these waits. 4 won't.
>>>>> And only if we're using binds. No binds, no waits.
>>>>>
>>>>> Does anyone have an explanation for what could be happening?
>>>>>
>>>>> I've already tried disabling some of the new fancy adaptive features,
>>>>> and most bind related features (peeking, px bind peeking, bind value
>>>>> captures, etc). Heck, it even happens with opt features set to 11.2.
>>>>>
>>>>> On a very similar database, that's still on 11.2.0.3, we're also
>>>>> seeing the same behavior. The key difference being that the waits are very
>>>>> short - less than 0.01 seconds. But they're still there, and I don't see
>>>>> why they would be in the first place. What causes Oracle to pin that mutex,
>>>>> if a bind variable is used in the query?
>>>>>
>>>>> A bug has been filed. So we'll see what comes out of that. It would
>>>>> seem odd that we'd be the first ones to hit this, it's not exactly exotic
>>>>> to run a parallel query with binds.
>>>>>
>>>>> Stefan
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Houri Mohamed
>>>>
>>>> Oracle DBA-Developer-Performance & Tuning
>>>>
>>>> Member of Oraworld-team <http://www.oraworld-team.com/>
>>>>
>>>> Visit My - Blog <http://www.hourim.wordpress.com/>
>>>>
>>>> Let's Connect - <http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>*Linkedin
>>>> Profile <http://fr.linkedin.com/pub/mohamed-houri/11/329/857/>*
>>>>
>>>> My Twitter <https://twitter.com/MohamedHouri> - MohamedHouri
>>>> <https://twitter.com/MohamedHouri>
>>>>
>>>>
>>>
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Sep 15 2015 - 13:29:49 CEST

Original text of this message