Re: Interesting performance challege (persistent child cursor mis-match)
Date: Sat, 2 May 2020 18:11:07 -0400
Message-ID: <CAP79kiRSSJpm7cRcP1ujRK9PwxcYyYA1Yqfh+HD9dsYOX4E39w_at_mail.gmail.com>
I apologize for not being clear.
The PHVs are children of cursor 0 -- I don't know what the correct term is here.
So Cursor #0 (PHV #1) is generated by a client using UTF-8.
Cursor #1 (PHV #2) is generated by a client using WE8MSWIN1252
The reason listed as the mismatch was LANGUAGE_MISMATCH.
I found an Oracle Doc with NLS_LANG differences that the cursors are not shareable.
However, now the water is even muddier.
Oracle still wasn't using the correct PHV even when I had one client connected and flushing the sql_id from all the nodes.
So, I completely disabled the sql_profile and the baselines and purged the SQL_ID (again). Now Oracle is using the correct PHV on its own. If I take the OUTLINE hint from the display cursor for the SQL_ID (& the good PHV) and execute it in a 10046 trace , I get the POOR plan using the SAME outline data from the good plan.
I have no idea what's up with that.
Here's the 2 PHVs of interest and their timings:
PLAN_HASH_VALUE EXECUTIONS_TOTAL AVG_ET_SECS
--------------- ---------------- ----------- 3208801841 1.2318E+10 .00021384 ** This is the good one 2040784329 103444982 .01711316 -- This is a child cursor ofthe above with a different exec plan
This is the OUTLINE hint from the good phv and if I pop that hint into the SQL, it gives me the bad plan above. I don't get it.
Again, Oracle is now picking the correct plan with no profile and no baselines but will NOT use it when a profile exists for it .
It's like something is "wrong" with the extracted outline data for that PHV (and thus the sql_profile is garbage as well as that info is stored in the profile)
Outline Data
/*+
BEGIN_OUTLINE_DATA IGNORE_OPTIM_EMBEDDED_HINTS OPTIMIZER_FEATURES_ENABLE('12.1.0.2') DB_VERSION('12.1.0.2') OPT_PARAM('optimizer_dynamic_sampling' 0) OPT_PARAM('_optimizer_use_feedback' 'false') OPT_PARAM('_px_adaptive_dist_method' 'off') OPT_PARAM('_optimizer_strans_adaptive_pruning' 'false') OPT_PARAM('_optimizer_gather_feedback' 'false') OPT_PARAM('_optimizer_nlj_hj_adaptive_join' 'false') OPT_PARAM('_fix_control' '20243268:1 16732417:1 26664361:7') ALL_ROWS OUTLINE_LEAF(_at_"SEL$1DABF674") PUSH_PRED(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6" 28) OUTLINE_LEAF(_at_"SEL$2E6E727F") PUSH_PRED(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9" 11 10 9) OUTLINE_LEAF(_at_"SEL$C1A0C3CD") DECORRELATE(_at_"SEL$98BE7483") OUTLINE_LEAF(_at_"SEL$13") OUTLINE_LEAF(_at_"SEL$DD6B2BE9") MERGE(_at_"SEL$14") OUTLINE(_at_"SEL$98BE7483") DECORRELATE(_at_"SEL$CC45FCA3") OUTLINE(_at_"SEL$C1A0C3CD") DECORRELATE(_at_"SEL$98BE7483") OUTLINE(_at_"SEL$CCA5E655") MERGE(_at_"SEL$10") OUTLINE(_at_"SEL$790019EA") MERGE(_at_"SEL$4FE9DA99") OUTLINE(_at_"SEL$15") OUTLINE(_at_"SEL$14") OUTLINE(_at_"SEL$CC45FCA3") OUTLINE(_at_"SEL$11") OUTLINE(_at_"SEL$10") OUTLINE(_at_"SEL$45011088") ANSI_REARCH(_at_"SEL$12") OUTLINE(_at_"SEL$4FE9DA99") MERGE(_at_"SEL$0C03B516") OUTLINE(_at_"SEL$8ED7E3D6") OUTLINE(_at_"SEL$12") OUTLINE(_at_"SEL$BE1345CD") ANSI_REARCH(_at_"SEL$3C7AEB86") OUTLINE(_at_"SEL$0C03B516") MERGE(_at_"SEL$FCF0FA70") OUTLINE(_at_"SEL$3") OUTLINE(_at_"SEL$3C7AEB86") ANSI_REARCH(_at_"SEL$9") OUTLINE(_at_"SEL$5ADDCBD7") ANSI_REARCH(_at_"SEL$123A543A") OUTLINE(_at_"SEL$FCF0FA70") MERGE(_at_"SEL$130B1C8C") OUTLINE(_at_"SEL$9") OUTLINE(_at_"SEL$123A543A") ANSI_REARCH(_at_"SEL$8") OUTLINE(_at_"SEL$6CECFF5D") ANSI_REARCH(_at_"SEL$BBA4C8A8") OUTLINE(_at_"SEL$130B1C8C") MERGE(_at_"SEL$D7512574") OUTLINE(_at_"SEL$8") OUTLINE(_at_"SEL$BBA4C8A8") ANSI_REARCH(_at_"SEL$7") OUTLINE(_at_"SEL$6C34F147") ANSI_REARCH(_at_"SEL$0DBA457E") OUTLINE(_at_"SEL$D7512574") MERGE(_at_"SEL$0B841A50") OUTLINE(_at_"SEL$7") OUTLINE(_at_"SEL$0DBA457E") ANSI_REARCH(_at_"SEL$6") OUTLINE(_at_"SEL$F68A4B96") ANSI_REARCH(_at_"SEL$7D6EB0FC") OUTLINE(_at_"SEL$0B841A50") MERGE(_at_"SEL$82FA1C20") OUTLINE(_at_"SEL$6") OUTLINE(_at_"SEL$7D6EB0FC") ANSI_REARCH(_at_"SEL$5") OUTLINE(_at_"SEL$D38AA093") ANSI_REARCH(_at_"SEL$88675C8F") OUTLINE(_at_"SEL$82FA1C20") MERGE(_at_"SEL$58A6D7F6") OUTLINE(_at_"SEL$5") OUTLINE(_at_"SEL$88675C8F") ANSI_REARCH(_at_"SEL$4") OUTLINE(_at_"SEL$0D1A3F46") ANSI_REARCH(_at_"SEL$8ED7E3D6") OUTLINE(_at_"SEL$58A6D7F6") MERGE(_at_"SEL$1") OUTLINE(_at_"SEL$4") OUTLINE(_at_"SEL$2") OUTLINE(_at_"SEL$1") NO_ACCESS(_at_"SEL$DD6B2BE9" "from$_subquery$_024"_at_"SEL$14") NO_ACCESS(_at_"SEL$13" "from$_subquery$_023"_at_"SEL$13") INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "CE"_at_"SEL$2" ("COVERED_ENTITY"."COVERED_ENTITY_ID")) INDEX(_at_"SEL$C1A0C3CD" "PM"_at_"SEL$1" ("PERSON_MAP"."PERSON_ID"("EPISODE_OF_CARE_PHYSICIAN"."EPISODE_OF_CARE_ID"
"PERSON_MAP"."HOSPITAL_ID" "PERSON_MAP"."ID" "PERSON_MAP"."TABLE_OID"))
INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1" ("EVENT"."ID"
"EVENT"."TABLE_OID" "EVENT"."ELIGIBLE" "EVENT"."EFFECTIVE_ON"))
BATCH_TABLE_ACCESS_BY_ROWID(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1") INDEX(_at_"SEL$C1A0C3CD" "ETE"_at_"SEL$4" "EVENT_TO_EPISODE_PKX") INDEX(_at_"SEL$C1A0C3CD" "EOC"_at_"SEL$5" ("EPISODE_OF_CARE"."EPISODE_OF_CARE_ID" "EPISODE_OF_CARE"."HOSPITAL_ID")) INDEX(_at_"SEL$C1A0C3CD" "EOCP"_at_"SEL$6"
"EPISODE_OF_CARE_PHYSICIAN"."PDB_PHYSICIAN_LOOKUP_ID"))
INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "PL"_at_"SEL$7" ("PHYSICIAN_LOOKUP"."PHYSICIAN_LOOKUP_ID"))
NO_ACCESS(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6") INDEX(_at_"SEL$C1A0C3CD" "PACM"_at_"SEL$8" "PHYSICIAN_APPROVED_CE2_IDX1") NO_ACCESS(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9") LEADING(_at_"SEL$C1A0C3CD" "CE"_at_"SEL$2" "PM"@"SEL$1" "E"@"SEL$1"
"ETE"_at_"SEL$4" "EOC"_at_"SEL$5" "EOCP"@"SEL$6" "PL"@"SEL$7"
"VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6" "PACM"_at_"SEL$8"
"ERX_PRESCRIBER"_at_"SEL$9")
USE_NL(_at_"SEL$C1A0C3CD" "PM"_at_"SEL$1") USE_NL(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1") USE_NL(_at_"SEL$C1A0C3CD" "ETE"_at_"SEL$4") USE_NL(_at_"SEL$C1A0C3CD" "EOC"_at_"SEL$5") USE_NL(_at_"SEL$C1A0C3CD" "EOCP"_at_"SEL$6") USE_NL(_at_"SEL$C1A0C3CD" "PL"_at_"SEL$7") USE_NL(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6") USE_NL(_at_"SEL$C1A0C3CD" "PACM"_at_"SEL$8") USE_NL(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9") USE_HASH_AGGREGATION(_at_"SEL$C1A0C3CD") INDEX_RS_ASC(_at_"SEL$1DABF674" "ERX"_at_"SEL$3" ("E_PRESCRIPTION"."E_PRESCRIPTION_ID")) INDEX_RS_ASC(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10" ("PDB_MAP"."ID"
"PDB_MAP"."TABLE_OID" "PDB_MAP"."PDB_PHYSICIAN_LOOKUP_ID"))
BATCH_TABLE_ACCESS_BY_ROWID(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10") INDEX(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10" ("PHYSICIAN_LOOKUP"."PHYSICIAN_LOOKUP_ID")) LEADING(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10" "MAP_PL"@"SEL$10") USE_NL(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10") NLJ_BATCHING(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10") END_OUTLINE_DATA
*/
On Sat, May 2, 2020 at 5:25 PM Frits Hoogland <frits.hoogland_at_gmail.com> wrote:
> Hi Chris, sorry if my response sounds like nitpicking, but I try to > understand what you are saying. > > The query text gets hashed and based on the hash it gets a parent handle > allocated. > Any query with the same text will end up at this parent handle. > SQL cursor parent handles do not execute, a child cursor is needed to > execute, and has an execution plan with it. > > You said: > > Secondary connections to this node by clients with WE8MSWIN1252 are > getting a child cursor with "LANGUAGE_MISMATCH". > > Normally this SQL_ID only ever has 2 plans and no child cursors but due to > the time of day we were working and batch jobs running, we had multiple > servers parsing this SQL_ID after purging it. (nightly jobs) > > A SQL_ID is another representation of the hash value (I believe Tanel > Poder found this). > I do not understand how a parent cursor / hash value / SQL_ID has 2 plans > and no child cursors. > The only way for identical SQL, thus a parent handle, thus a SQL_ID to > execute alias do something is when it has a child handle (which is what I > assume you mean with ‘child cursor’). > The only way for identical SQL to have 2 plans is to have 2 children. > Of course a parent can have a child at one time, which gets purged and > then later has another child with another execution plan. > > The language for an oracle client might be picked up from the OS language > setting, so if you got different operating systems the languages might be > different. But that is an hypothesis. > > You could validate your language assumption by connecting from the > different servers using an interactive client tool and execute ’select * > from nls_session_parameters’ to see the settings, and then let them execute > something identical and see if they get their own children. > > The sharing criteria from v$sql_shared_cursor are properties of a child > cursor to prevent from using a child cursor that could lead to differences > in any form. > One “trick” I use to get another child is to set a session parameter to a > different value (I use db_file_multiblock_read_count for example). > The way it works is it looks up the parent handle, then obtains the child > (pointer) list in the parent’s heap 0, and scans the child list from the > most recent child to the oldest. > If it finds a suitable/compatible child, it will use it, if it doesn’t, it > will start creating a new one. > > For every child it’s investigating, it looks at all the properties that > could mean the child is not the most optimal child to use. That is what > v$sql_shared_cursor shows. > > > > Frits Hoogland > > http://fritshoogland.wordpress.com > frits.hoogland_at_gmail.com > Mobile: +31 6 14180860 > > On 2 May 2020, at 19:12, Chris Taylor <christopherdtaylor1994_at_gmail.com> > wrote: > > Env: > 12.1.0.2 x86-64 > Exadata / OS: Oracle Linux / 4-Node RAC > Latest Patches > > Background: > This week we added an index for a very important ETL process to drive > execution time down. > Subsequently we had a performance change in data ingestion process (100k > records per hour) for a SQL_ID. > > Ultimately, we put in 2 SQL Profiles and 2 SQL Baselines as the plan for > the query will rotate between 2 acceptable plans and varies by day as there > is a nightly job that rebuilds a temp table and repoints a synonym. > (PL/SQL now exists to rotate the 2 SQL profiles based on the synonym). > > The INTERESTING part is that no matter what, we were still getting > suboptimal child plans/cursors that were driving me a bit bananas. Even > after flushing/invalidating all the plans for that particular SQL id. > > This morning, I wanted to take a fresh look at it and approach it from a > cursor level and understand what was different. > > Which led me to GV$SQL_SHARED_CURSOR and mismatch information. (Which I > was familiar with already but due to the "fire" of this performance problem > hadn't had time to adequately research). > > The MISMATCH for this cursor was on LANGUAGE_MISMATCH. I was like, "Hey, > that's very interesting. Why is that?" > > Turns out we have 2 separate (or more) client servers that go through the > same code, and execute the same SQL. > > The primary connecting server doesn't set NLS_LANG or OS LANG variable at > all. And those sessions come in as WE8MSWIN1252. > > The other connecting servers have LANG=en_US.UTF-8 > > In this particular case the SQL_ID was being parsed first by clients > connecting with the en_US.UTF-8 variable set. > > Secondary connections to this node by clients with WE8MSWIN1252 are > getting a child cursor with "LANGUAGE_MISMATCH". > > Normally this SQL_ID only ever has 2 plans and no child cursors but due to > the time of day we were working and batch jobs running, we had multiple > servers parsing this SQL_ID after purging it. (nightly jobs) > > I *theorize *that UTF-8 plans are unsharable with clients who connect as > WE8MSWIN1252 .... but SQLs that are *first *parsed by WE8MSWIN1252 > connected sessions *can be *re-used by clients that come in using UTF-8 . > > *Does that sound like a reasonable theory?* > > I'm currently waiting for for the last 3 remaining batch jobs to finish > their work so that I can flush the SQL_ID again and I expect that will > return us to our normal behavior of only seeing 1 of 2 plans every day. > > (Now, if this particular pot would hurry up and boil, I'd be able to tell > for sure....) > > Chris > > > > >
-- http://www.freelists.org/webpage/oracle-lReceived on Sun May 03 2020 - 00:11:07 CEST