Re: Interesting performance challege (persistent child cursor mis-match)

From: Frits Hoogland <frits.hoogland_at_gmail.com>
Date: Sun, 3 May 2020 00:48:32 +0200
Message-Id: <0EC7A210-151A-46E0-BE99-57B72F30EB4C_at_gmail.com>



And I apologise for not getting you an answer, but trying to help in laying out the fundamental structure in a logical way.

A (parent) cursor is identified by the hash value for oracle internally. The hash value is obtained from the sqltext. If you change anything in the sqltext that is parsed, it will lead to a different hash value, and thus another parent. Another parent means a totally different parent object, although the SQL might be doing exactly the same.

If you add in the outline hints, I am confident you get a different hash value, and thus a different parent.

Another representation of the parent besides the hash value is the SQL_ID.

Now in order to actually do something, Oracle needs to create something that details the specifics of the parsing client and create an executable form of the SQL so it can be used in the execution phase. The whole of specifics and an executable form of the SQL, is what is the a child. A plan hash value is a hash value of the executable form of the SQL, so an aid in identifying a child, because the executable form of the SQL is what is the plan.

If you add the outline hints, you get a completely different SQL text, and thus a new parent, and a new child. Obviously, what you expect is that the outline hints of the good plan will lead to the good plan again. I don’t know why that doesn’t happen.

Wouldn’t it be an idea to harmonise the language settings of all the clients so it can’t reject clients from a good plan? I would urge to do the test of the client language settings and see if you can replay getting different children for different language settings. If so, harmonising the language settings should in all cases prevent at least some cases of multiple children with execution plans.

Could it be the case the plan you want is simply not doable for oracle for a certain language (for whatever reason).

But again, if you make sure the language settings are the same, you prevent from current issues and future issues regarding flipping plans because of language differences.

Frits Hoogland

http://fritshoogland.wordpress.com <http://fritshoogland.wordpress.com/> frits.hoogland_at_gmail.com <mailto:frits.hoogland_at_gmail.com> Mobile: +31 6 14180860

> On 3 May 2020, at 00:11, Chris Taylor <christopherdtaylor1994_at_gmail.com> wrote:
>
> I apologize for not being clear.
>
> The PHVs are children of cursor 0 -- I don't know what the correct term is here.
>
> So Cursor #0 (PHV #1) is generated by a client using UTF-8.
> Cursor #1 (PHV #2) is generated by a client using WE8MSWIN1252
>
> The reason listed as the mismatch was LANGUAGE_MISMATCH.
>
> I found an Oracle Doc with NLS_LANG differences that the cursors are not shareable.
>
> However, now the water is even muddier.
>
> Oracle still wasn't using the correct PHV even when I had one client connected and flushing the sql_id from all the nodes.
>
> So, I completely disabled the sql_profile and the baselines and purged the SQL_ID (again). Now Oracle is using the correct PHV on its own.
> If I take the OUTLINE hint from the display cursor for the SQL_ID (& the good PHV) and execute it in a 10046 trace , I get the POOR plan using the SAME outline data from the good plan.
>
> I have no idea what's up with that.
>
> Here's the 2 PHVs of interest and their timings:
>
> PLAN_HASH_VALUE EXECUTIONS_TOTAL AVG_ET_SECS
> --------------- ---------------- -----------
> 3208801841 1.2318E+10 .00021384 ** This is the good one
> 2040784329 103444982 .01711316 -- This is a child cursor of the above with a different exec plan
>
> This is the OUTLINE hint from the good phv and if I pop that hint into the SQL, it gives me the bad plan above. I don't get it.
>
> Again, Oracle is now picking the correct plan with no profile and no baselines but will NOT use it when a profile exists for it .
>
> It's like something is "wrong" with the extracted outline data for that PHV (and thus the sql_profile is garbage as well as that info is stored in the profile)
>
> Outline Data
> -------------
>
> /*+
> BEGIN_OUTLINE_DATA
> IGNORE_OPTIM_EMBEDDED_HINTS
> OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
> DB_VERSION('12.1.0.2')
> OPT_PARAM('optimizer_dynamic_sampling' 0)
> OPT_PARAM('_optimizer_use_feedback' 'false')
> OPT_PARAM('_px_adaptive_dist_method' 'off')
> OPT_PARAM('_optimizer_strans_adaptive_pruning' 'false')
> OPT_PARAM('_optimizer_gather_feedback' 'false')
> OPT_PARAM('_optimizer_nlj_hj_adaptive_join' 'false')
> OPT_PARAM('_fix_control' '20243268:1 16732417:1 26664361:7')
> ALL_ROWS
> OUTLINE_LEAF(_at_"SEL$1DABF674")
> PUSH_PRED(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6" 28)
> OUTLINE_LEAF(_at_"SEL$2E6E727F")
> PUSH_PRED(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9" 11 10 9)
> OUTLINE_LEAF(_at_"SEL$C1A0C3CD")
> DECORRELATE(_at_"SEL$98BE7483")
> OUTLINE_LEAF(_at_"SEL$13")
> OUTLINE_LEAF(_at_"SEL$DD6B2BE9")
> MERGE(_at_"SEL$14")
> OUTLINE(_at_"SEL$98BE7483")
> DECORRELATE(_at_"SEL$CC45FCA3")
> OUTLINE(_at_"SEL$C1A0C3CD")
> DECORRELATE(_at_"SEL$98BE7483")
> OUTLINE(_at_"SEL$CCA5E655")
> MERGE(_at_"SEL$10")
> OUTLINE(_at_"SEL$790019EA")
> MERGE(_at_"SEL$4FE9DA99")
> OUTLINE(_at_"SEL$15")
> OUTLINE(_at_"SEL$14")
> OUTLINE(_at_"SEL$CC45FCA3")
> OUTLINE(_at_"SEL$11")
> OUTLINE(_at_"SEL$10")
> OUTLINE(_at_"SEL$45011088")
> ANSI_REARCH(_at_"SEL$12")
> OUTLINE(_at_"SEL$4FE9DA99")
> MERGE(_at_"SEL$0C03B516")
> OUTLINE(_at_"SEL$8ED7E3D6")
> OUTLINE(_at_"SEL$12")
> OUTLINE(_at_"SEL$BE1345CD")
> ANSI_REARCH(_at_"SEL$3C7AEB86")
> OUTLINE(_at_"SEL$0C03B516")
> MERGE(_at_"SEL$FCF0FA70")
> OUTLINE(_at_"SEL$3")
> OUTLINE(_at_"SEL$3C7AEB86")
> ANSI_REARCH(_at_"SEL$9")
> OUTLINE(_at_"SEL$5ADDCBD7")
> ANSI_REARCH(_at_"SEL$123A543A")
> OUTLINE(_at_"SEL$FCF0FA70")
> MERGE(_at_"SEL$130B1C8C")
> OUTLINE(_at_"SEL$9")
> OUTLINE(_at_"SEL$123A543A")
> ANSI_REARCH(_at_"SEL$8")
> OUTLINE(_at_"SEL$6CECFF5D")
> ANSI_REARCH(_at_"SEL$BBA4C8A8")
> OUTLINE(_at_"SEL$130B1C8C")
> MERGE(_at_"SEL$D7512574")
> OUTLINE(_at_"SEL$8")
> OUTLINE(_at_"SEL$BBA4C8A8")
> ANSI_REARCH(_at_"SEL$7")
> OUTLINE(_at_"SEL$6C34F147")
> ANSI_REARCH(_at_"SEL$0DBA457E")
> OUTLINE(_at_"SEL$D7512574")
> MERGE(_at_"SEL$0B841A50")
> OUTLINE(_at_"SEL$7")
> OUTLINE(_at_"SEL$0DBA457E")
> ANSI_REARCH(_at_"SEL$6")
> OUTLINE(_at_"SEL$F68A4B96")
> ANSI_REARCH(_at_"SEL$7D6EB0FC")
> OUTLINE(_at_"SEL$0B841A50")
> MERGE(_at_"SEL$82FA1C20")
> OUTLINE(_at_"SEL$6")
> OUTLINE(_at_"SEL$7D6EB0FC")
> ANSI_REARCH(_at_"SEL$5")
> OUTLINE(_at_"SEL$D38AA093")
> ANSI_REARCH(_at_"SEL$88675C8F")
> OUTLINE(_at_"SEL$82FA1C20")
> MERGE(_at_"SEL$58A6D7F6")
> OUTLINE(_at_"SEL$5")
> OUTLINE(_at_"SEL$88675C8F")
> ANSI_REARCH(_at_"SEL$4")
> OUTLINE(_at_"SEL$0D1A3F46")
> ANSI_REARCH(_at_"SEL$8ED7E3D6")
> OUTLINE(_at_"SEL$58A6D7F6")
> MERGE(_at_"SEL$1")
> OUTLINE(_at_"SEL$4")
> OUTLINE(_at_"SEL$2")
> OUTLINE(_at_"SEL$1")
> NO_ACCESS(_at_"SEL$DD6B2BE9" "from$_subquery$_024"_at_"SEL$14")
> NO_ACCESS(_at_"SEL$13" "from$_subquery$_023"_at_"SEL$13")
> INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "CE"_at_"SEL$2" ("COVERED_ENTITY"."COVERED_ENTITY_ID"))
> INDEX(_at_"SEL$C1A0C3CD" "PM"_at_"SEL$1" ("PERSON_MAP"."PERSON_ID" "PERSON_MAP"."HOSPITAL_ID" "PERSON_MAP"."ID" "PERSON_MAP"."TABLE_OID"))
> INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1" ("EVENT"."ID" "EVENT"."TABLE_OID" "EVENT"."ELIGIBLE" "EVENT"."EFFECTIVE_ON"))
> BATCH_TABLE_ACCESS_BY_ROWID(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1")
> INDEX(_at_"SEL$C1A0C3CD" "ETE"_at_"SEL$4" "EVENT_TO_EPISODE_PKX")
> INDEX(_at_"SEL$C1A0C3CD" "EOC"_at_"SEL$5" ("EPISODE_OF_CARE"."EPISODE_OF_CARE_ID" "EPISODE_OF_CARE"."HOSPITAL_ID"))
> INDEX(_at_"SEL$C1A0C3CD" "EOCP"_at_"SEL$6" ("EPISODE_OF_CARE_PHYSICIAN"."EPISODE_OF_CARE_ID" "EPISODE_OF_CARE_PHYSICIAN"."PDB_PHYSICIAN_LOOKUP_ID"))
> INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "PL"_at_"SEL$7" ("PHYSICIAN_LOOKUP"."PHYSICIAN_LOOKUP_ID"))
> NO_ACCESS(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6")
> INDEX(_at_"SEL$C1A0C3CD" "PACM"_at_"SEL$8" "PHYSICIAN_APPROVED_CE2_IDX1")
> NO_ACCESS(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9")
> LEADING(_at_"SEL$C1A0C3CD" "CE"_at_"SEL$2" "PM"@"SEL$1" "E"@"SEL$1" "ETE"@"SEL$4" "EOC"@"SEL$5" "EOCP"@"SEL$6" "PL"@"SEL$7"
> "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6" "PACM"_at_"SEL$8" "ERX_PRESCRIBER"@"SEL$9")
> USE_NL(_at_"SEL$C1A0C3CD" "PM"_at_"SEL$1")
> USE_NL(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1")
> USE_NL(_at_"SEL$C1A0C3CD" "ETE"_at_"SEL$4")
> USE_NL(_at_"SEL$C1A0C3CD" "EOC"_at_"SEL$5")
> USE_NL(_at_"SEL$C1A0C3CD" "EOCP"_at_"SEL$6")
> USE_NL(_at_"SEL$C1A0C3CD" "PL"_at_"SEL$7")
> USE_NL(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6")
> USE_NL(_at_"SEL$C1A0C3CD" "PACM"_at_"SEL$8")
> USE_NL(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9")
> USE_HASH_AGGREGATION(_at_"SEL$C1A0C3CD")
> INDEX_RS_ASC(_at_"SEL$1DABF674" "ERX"_at_"SEL$3" ("E_PRESCRIPTION"."E_PRESCRIPTION_ID"))
> INDEX_RS_ASC(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10" ("PDB_MAP"."ID" "PDB_MAP"."TABLE_OID" "PDB_MAP"."PDB_PHYSICIAN_LOOKUP_ID"))
> BATCH_TABLE_ACCESS_BY_ROWID(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10")
> INDEX(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10" ("PHYSICIAN_LOOKUP"."PHYSICIAN_LOOKUP_ID"))
> LEADING(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10" "MAP_PL"@"SEL$10")
> USE_NL(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10")
> NLJ_BATCHING(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10")
> END_OUTLINE_DATA
> */
>
>
>
>
>
>
>
>
>
>
> On Sat, May 2, 2020 at 5:25 PM Frits Hoogland <frits.hoogland_at_gmail.com <mailto:frits.hoogland_at_gmail.com>> wrote:
> Hi Chris, sorry if my response sounds like nitpicking, but I try to understand what you are saying.
>
> The query text gets hashed and based on the hash it gets a parent handle allocated.
> Any query with the same text will end up at this parent handle.
> SQL cursor parent handles do not execute, a child cursor is needed to execute, and has an execution plan with it.
>
> You said:

>> Secondary connections to this node by clients with WE8MSWIN1252 are getting a child cursor with "LANGUAGE_MISMATCH".
>> 
>> Normally this SQL_ID only ever has 2 plans and no child cursors but due to the time of day we were working and batch jobs running, we had multiple servers parsing this SQL_ID after purging it.  (nightly jobs)

>
> A SQL_ID is another representation of the hash value (I believe Tanel Poder found this).
> I do not understand how a parent cursor / hash value / SQL_ID has 2 plans and no child cursors.
> The only way for identical SQL, thus a parent handle, thus a SQL_ID to execute alias do something is when it has a child handle (which is what I assume you mean with ‘child cursor’).
> The only way for identical SQL to have 2 plans is to have 2 children.
> Of course a parent can have a child at one time, which gets purged and then later has another child with another execution plan.
>
> The language for an oracle client might be picked up from the OS language setting, so if you got different operating systems the languages might be different. But that is an hypothesis.
>
> You could validate your language assumption by connecting from the different servers using an interactive client tool and execute ’select * from nls_session_parameters’ to see the settings, and then let them execute something identical and see if they get their own children.
>
> The sharing criteria from v$sql_shared_cursor are properties of a child cursor to prevent from using a child cursor that could lead to differences in any form.
> One “trick” I use to get another child is to set a session parameter to a different value (I use db_file_multiblock_read_count for example).
> The way it works is it looks up the parent handle, then obtains the child (pointer) list in the parent’s heap 0, and scans the child list from the most recent child to the oldest.
> If it finds a suitable/compatible child, it will use it, if it doesn’t, it will start creating a new one.
>
> For every child it’s investigating, it looks at all the properties that could mean the child is not the most optimal child to use. That is what v$sql_shared_cursor shows.
>
>
>
> Frits Hoogland
>
> http://fritshoogland.wordpress.com <http://fritshoogland.wordpress.com/>
> frits.hoogland_at_gmail.com <mailto:frits.hoogland_at_gmail.com>
> Mobile: +31 6 14180860
>
>> On 2 May 2020, at 19:12, Chris Taylor <christopherdtaylor1994_at_gmail.com <mailto:christopherdtaylor1994_at_gmail.com>> wrote:
>> 
>> Env:
>> 12.1.0.2 x86-64 
>> Exadata / OS: Oracle Linux  / 4-Node RAC
>> Latest Patches
>> 
>> Background:
>> This week we added an index for a very important ETL process to drive execution time down.
>> Subsequently we had a performance change in data ingestion process (100k records per hour) for a SQL_ID.
>> 
>> Ultimately, we put in 2 SQL Profiles and 2 SQL Baselines as the plan for the query will rotate between 2 acceptable plans and varies by day as there is a nightly job that rebuilds a temp table and repoints a synonym.  (PL/SQL now exists to rotate the 2 SQL profiles based on the synonym).
>> 
>> The INTERESTING part is that no matter what, we were still getting suboptimal child plans/cursors that were driving me a bit bananas.  Even after flushing/invalidating all the plans for that particular SQL id.
>> 
>> This morning, I wanted to take a fresh look at it and approach it from a cursor level and understand what was different. 
>> 
>> Which led me to GV$SQL_SHARED_CURSOR and mismatch information. (Which I was familiar with already but due to the "fire" of this performance problem hadn't had time to adequately research).
>> 
>> The MISMATCH for this cursor was on LANGUAGE_MISMATCH.  I was like, "Hey, that's very interesting.  Why is that?"
>> 
>> Turns out we have 2 separate (or more) client servers that go through the same code, and execute the same SQL.
>> 
>> The primary connecting server doesn't set NLS_LANG or OS LANG variable at all.  And those sessions come in as WE8MSWIN1252.
>> 
>> The other connecting servers have LANG=en_US.UTF-8
>> 
>> In this particular case the SQL_ID was being parsed first by clients connecting with the en_US.UTF-8 variable set.
>> 
>> Secondary connections to this node by clients with WE8MSWIN1252 are getting a child cursor with "LANGUAGE_MISMATCH".
>> 
>> Normally this SQL_ID only ever has 2 plans and no child cursors but due to the time of day we were working and batch jobs running, we had multiple servers parsing this SQL_ID after purging it.  (nightly jobs)
>> 
>> I theorize that UTF-8 plans are unsharable with clients who connect as WE8MSWIN1252 .... but SQLs that are first parsed by WE8MSWIN1252 connected sessions can be re-used by clients that come in using UTF-8 .
>> 
>> Does that sound like a reasonable theory?
>> 
>> I'm currently waiting for for the last 3 remaining batch jobs to finish their work so that I can flush the SQL_ID again and I expect that will return us to our normal behavior of only seeing 1 of 2 plans every day.
>> 
>> (Now, if this particular pot would hurry up and boil, I'd be able to tell for sure....)
>> 
>> Chris
>> 
>> 
>> 

>
--
http://www.freelists.org/webpage/oracle-l
Received on Sun May 03 2020 - 00:48:32 CEST

Original text of this message