Re: Interesting performance challege (persistent child cursor mis-match)

From: Chris Taylor <christopherdtaylor1994_at_gmail.com>
Date: Sat, 2 May 2020 18:11:07 -0400
Message-ID: <CAP79kiRSSJpm7cRcP1ujRK9PwxcYyYA1Yqfh+HD9dsYOX4E39w_at_mail.gmail.com>



I apologize for not being clear.

The PHVs are children of cursor 0 -- I don't know what the correct term is here.

So Cursor #0 (PHV #1) is generated by a client using UTF-8.

      Cursor #1 (PHV #2) is generated by a client using WE8MSWIN1252

The reason listed as the mismatch was LANGUAGE_MISMATCH.

I found an Oracle Doc with NLS_LANG differences that the cursors are not shareable.

However, now the water is even muddier.

Oracle still wasn't using the correct PHV even when I had one client connected and flushing the sql_id from all the nodes.

So, I completely disabled the sql_profile and the baselines and purged the SQL_ID (again). Now Oracle is using the correct PHV on its own. If I take the OUTLINE hint from the display cursor for the SQL_ID (& the good PHV) and execute it in a 10046 trace , I get the POOR plan using the SAME outline data from the good plan.

I have no idea what's up with that.

Here's the 2 PHVs of interest and their timings:

PLAN_HASH_VALUE EXECUTIONS_TOTAL AVG_ET_SECS

--------------- ---------------- -----------
     3208801841       1.2318E+10   .00021384   ** This is the good one
     2040784329        103444982   .01711316   -- This is a child cursor of
the above with a different exec plan

This is the OUTLINE hint from the good phv and if I pop that hint into the SQL, it gives me the bad plan above. I don't get it.

Again, Oracle is now picking the correct plan with no profile and no baselines but will NOT use it when a profile exists for it .

It's like something is "wrong" with the extracted outline data for that PHV (and thus the sql_profile is garbage as well as that info is stored in the profile)

Outline Data


  /*+

      BEGIN_OUTLINE_DATA
      IGNORE_OPTIM_EMBEDDED_HINTS
      OPTIMIZER_FEATURES_ENABLE('12.1.0.2')
      DB_VERSION('12.1.0.2')
      OPT_PARAM('optimizer_dynamic_sampling' 0)
      OPT_PARAM('_optimizer_use_feedback' 'false')
      OPT_PARAM('_px_adaptive_dist_method' 'off')
      OPT_PARAM('_optimizer_strans_adaptive_pruning' 'false')
      OPT_PARAM('_optimizer_gather_feedback' 'false')
      OPT_PARAM('_optimizer_nlj_hj_adaptive_join' 'false')
      OPT_PARAM('_fix_control' '20243268:1 16732417:1 26664361:7')
      ALL_ROWS
      OUTLINE_LEAF(_at_"SEL$1DABF674")
      PUSH_PRED(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6" 28)
      OUTLINE_LEAF(_at_"SEL$2E6E727F")
      PUSH_PRED(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9" 11 10 9)
      OUTLINE_LEAF(_at_"SEL$C1A0C3CD")
      DECORRELATE(_at_"SEL$98BE7483")
      OUTLINE_LEAF(_at_"SEL$13")
      OUTLINE_LEAF(_at_"SEL$DD6B2BE9")
      MERGE(_at_"SEL$14")
      OUTLINE(_at_"SEL$98BE7483")
      DECORRELATE(_at_"SEL$CC45FCA3")
      OUTLINE(_at_"SEL$C1A0C3CD")
      DECORRELATE(_at_"SEL$98BE7483")
      OUTLINE(_at_"SEL$CCA5E655")
      MERGE(_at_"SEL$10")
      OUTLINE(_at_"SEL$790019EA")
      MERGE(_at_"SEL$4FE9DA99")
      OUTLINE(_at_"SEL$15")
      OUTLINE(_at_"SEL$14")
      OUTLINE(_at_"SEL$CC45FCA3")
      OUTLINE(_at_"SEL$11")
      OUTLINE(_at_"SEL$10")
      OUTLINE(_at_"SEL$45011088")
      ANSI_REARCH(_at_"SEL$12")
      OUTLINE(_at_"SEL$4FE9DA99")
      MERGE(_at_"SEL$0C03B516")
      OUTLINE(_at_"SEL$8ED7E3D6")
      OUTLINE(_at_"SEL$12")
      OUTLINE(_at_"SEL$BE1345CD")
      ANSI_REARCH(_at_"SEL$3C7AEB86")
      OUTLINE(_at_"SEL$0C03B516")
      MERGE(_at_"SEL$FCF0FA70")
      OUTLINE(_at_"SEL$3")
      OUTLINE(_at_"SEL$3C7AEB86")
      ANSI_REARCH(_at_"SEL$9")
      OUTLINE(_at_"SEL$5ADDCBD7")
      ANSI_REARCH(_at_"SEL$123A543A")
      OUTLINE(_at_"SEL$FCF0FA70")
      MERGE(_at_"SEL$130B1C8C")
      OUTLINE(_at_"SEL$9")
      OUTLINE(_at_"SEL$123A543A")
      ANSI_REARCH(_at_"SEL$8")
      OUTLINE(_at_"SEL$6CECFF5D")
      ANSI_REARCH(_at_"SEL$BBA4C8A8")
      OUTLINE(_at_"SEL$130B1C8C")
      MERGE(_at_"SEL$D7512574")
      OUTLINE(_at_"SEL$8")
      OUTLINE(_at_"SEL$BBA4C8A8")
      ANSI_REARCH(_at_"SEL$7")
      OUTLINE(_at_"SEL$6C34F147")
      ANSI_REARCH(_at_"SEL$0DBA457E")
      OUTLINE(_at_"SEL$D7512574")
      MERGE(_at_"SEL$0B841A50")
      OUTLINE(_at_"SEL$7")
      OUTLINE(_at_"SEL$0DBA457E")
      ANSI_REARCH(_at_"SEL$6")
      OUTLINE(_at_"SEL$F68A4B96")
      ANSI_REARCH(_at_"SEL$7D6EB0FC")
      OUTLINE(_at_"SEL$0B841A50")
      MERGE(_at_"SEL$82FA1C20")
      OUTLINE(_at_"SEL$6")
      OUTLINE(_at_"SEL$7D6EB0FC")
      ANSI_REARCH(_at_"SEL$5")
      OUTLINE(_at_"SEL$D38AA093")
      ANSI_REARCH(_at_"SEL$88675C8F")
      OUTLINE(_at_"SEL$82FA1C20")
      MERGE(_at_"SEL$58A6D7F6")
      OUTLINE(_at_"SEL$5")
      OUTLINE(_at_"SEL$88675C8F")
      ANSI_REARCH(_at_"SEL$4")
      OUTLINE(_at_"SEL$0D1A3F46")
      ANSI_REARCH(_at_"SEL$8ED7E3D6")
      OUTLINE(_at_"SEL$58A6D7F6")
      MERGE(_at_"SEL$1")
      OUTLINE(_at_"SEL$4")
      OUTLINE(_at_"SEL$2")
      OUTLINE(_at_"SEL$1")
      NO_ACCESS(_at_"SEL$DD6B2BE9" "from$_subquery$_024"_at_"SEL$14")
      NO_ACCESS(_at_"SEL$13" "from$_subquery$_023"_at_"SEL$13")
      INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "CE"_at_"SEL$2"
("COVERED_ENTITY"."COVERED_ENTITY_ID"))
      INDEX(_at_"SEL$C1A0C3CD" "PM"_at_"SEL$1" ("PERSON_MAP"."PERSON_ID"

"PERSON_MAP"."HOSPITAL_ID" "PERSON_MAP"."ID" "PERSON_MAP"."TABLE_OID"))
INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1" ("EVENT"."ID"
"EVENT"."TABLE_OID" "EVENT"."ELIGIBLE" "EVENT"."EFFECTIVE_ON"))
BATCH_TABLE_ACCESS_BY_ROWID(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1") INDEX(_at_"SEL$C1A0C3CD" "ETE"_at_"SEL$4" "EVENT_TO_EPISODE_PKX") INDEX(_at_"SEL$C1A0C3CD" "EOC"_at_"SEL$5" ("EPISODE_OF_CARE"."EPISODE_OF_CARE_ID" "EPISODE_OF_CARE"."HOSPITAL_ID")) INDEX(_at_"SEL$C1A0C3CD" "EOCP"_at_"SEL$6"
("EPISODE_OF_CARE_PHYSICIAN"."EPISODE_OF_CARE_ID"
"EPISODE_OF_CARE_PHYSICIAN"."PDB_PHYSICIAN_LOOKUP_ID"))
      INDEX_RS_ASC(_at_"SEL$C1A0C3CD" "PL"_at_"SEL$7" ("PHYSICIAN_LOOKUP"."PHYSICIAN_LOOKUP_ID"))
      NO_ACCESS(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6")
      INDEX(_at_"SEL$C1A0C3CD" "PACM"_at_"SEL$8" "PHYSICIAN_APPROVED_CE2_IDX1")
      NO_ACCESS(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9")
      LEADING(_at_"SEL$C1A0C3CD" "CE"_at_"SEL$2" "PM"@"SEL$1" "E"@"SEL$1"

"ETE"_at_"SEL$4" "EOC"_at_"SEL$5" "EOCP"@"SEL$6" "PL"@"SEL$7"
"VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6" "PACM"_at_"SEL$8"
"ERX_PRESCRIBER"_at_"SEL$9")
USE_NL(_at_"SEL$C1A0C3CD" "PM"_at_"SEL$1") USE_NL(_at_"SEL$C1A0C3CD" "E"_at_"SEL$1") USE_NL(_at_"SEL$C1A0C3CD" "ETE"_at_"SEL$4") USE_NL(_at_"SEL$C1A0C3CD" "EOC"_at_"SEL$5") USE_NL(_at_"SEL$C1A0C3CD" "EOCP"_at_"SEL$6") USE_NL(_at_"SEL$C1A0C3CD" "PL"_at_"SEL$7") USE_NL(_at_"SEL$C1A0C3CD" "VW_LAT_8ED7E3D6"_at_"SEL$8ED7E3D6") USE_NL(_at_"SEL$C1A0C3CD" "PACM"_at_"SEL$8") USE_NL(_at_"SEL$C1A0C3CD" "ERX_PRESCRIBER"_at_"SEL$9") USE_HASH_AGGREGATION(_at_"SEL$C1A0C3CD") INDEX_RS_ASC(_at_"SEL$1DABF674" "ERX"_at_"SEL$3" ("E_PRESCRIPTION"."E_PRESCRIPTION_ID")) INDEX_RS_ASC(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10" ("PDB_MAP"."ID"
"PDB_MAP"."TABLE_OID" "PDB_MAP"."PDB_PHYSICIAN_LOOKUP_ID"))
BATCH_TABLE_ACCESS_BY_ROWID(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10") INDEX(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10" ("PHYSICIAN_LOOKUP"."PHYSICIAN_LOOKUP_ID")) LEADING(_at_"SEL$2E6E727F" "PDB_MAP"_at_"SEL$10" "MAP_PL"@"SEL$10") USE_NL(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10") NLJ_BATCHING(_at_"SEL$2E6E727F" "MAP_PL"_at_"SEL$10") END_OUTLINE_DATA

  */

On Sat, May 2, 2020 at 5:25 PM Frits Hoogland <frits.hoogland_at_gmail.com> wrote:

> Hi Chris, sorry if my response sounds like nitpicking, but I try to
> understand what you are saying.
>
> The query text gets hashed and based on the hash it gets a parent handle
> allocated.
> Any query with the same text will end up at this parent handle.
> SQL cursor parent handles do not execute, a child cursor is needed to
> execute, and has an execution plan with it.
>
> You said:
>
> Secondary connections to this node by clients with WE8MSWIN1252 are
> getting a child cursor with "LANGUAGE_MISMATCH".
>
> Normally this SQL_ID only ever has 2 plans and no child cursors but due to
> the time of day we were working and batch jobs running, we had multiple
> servers parsing this SQL_ID after purging it.  (nightly jobs)
>
> A SQL_ID is another representation of the hash value (I believe Tanel
> Poder found this).
> I do not understand how a parent cursor / hash value / SQL_ID has 2 plans
> and no child cursors.
> The only way for identical SQL, thus a parent handle, thus a SQL_ID to
> execute alias do something is when it has a child handle (which is what I
> assume you mean with ‘child cursor’).
> The only way for identical SQL to have 2 plans is to have 2 children.
> Of course a parent can have a child at one time, which gets purged and
> then later has another child with another execution plan.
>
> The language for an oracle client might be picked up from the OS language
> setting, so if you got different operating systems the languages might be
> different. But that is an hypothesis.
>
> You could validate your language assumption by connecting from the
> different servers using an interactive client tool and execute ’select *
> from nls_session_parameters’ to see the settings, and then let them execute
> something identical and see if they get their own children.
>
> The sharing criteria from v$sql_shared_cursor are properties of a child
> cursor to prevent from using a child cursor that could lead to differences
> in any form.
> One “trick” I use to get another child is to set a session parameter to a
> different value (I use db_file_multiblock_read_count for example).
> The way it works is it looks up the parent handle, then obtains the child
> (pointer) list in the parent’s heap 0, and scans the child list from the
> most recent child to the oldest.
> If it finds a suitable/compatible child, it will use it, if it doesn’t, it
> will start creating a new one.
>
> For every child it’s investigating, it looks at all the properties that
> could mean the child is not the most optimal child to use. That is what
> v$sql_shared_cursor shows.
>
>
>
> Frits Hoogland
>
> http://fritshoogland.wordpress.com
> frits.hoogland_at_gmail.com
> Mobile: +31 6 14180860
>
> On 2 May 2020, at 19:12, Chris Taylor <christopherdtaylor1994_at_gmail.com>
> wrote:
>
> Env:
> 12.1.0.2 x86-64
> Exadata / OS: Oracle Linux  / 4-Node RAC
> Latest Patches
>
> Background:
> This week we added an index for a very important ETL process to drive
> execution time down.
> Subsequently we had a performance change in data ingestion process (100k
> records per hour) for a SQL_ID.
>
> Ultimately, we put in 2 SQL Profiles and 2 SQL Baselines as the plan for
> the query will rotate between 2 acceptable plans and varies by day as there
> is a nightly job that rebuilds a temp table and repoints a synonym.
> (PL/SQL now exists to rotate the 2 SQL profiles based on the synonym).
>
> The INTERESTING part is that no matter what, we were still getting
> suboptimal child plans/cursors that were driving me a bit bananas.  Even
> after flushing/invalidating all the plans for that particular SQL id.
>
> This morning, I wanted to take a fresh look at it and approach it from a
> cursor level and understand what was different.
>
> Which led me to GV$SQL_SHARED_CURSOR and mismatch information. (Which I
> was familiar with already but due to the "fire" of this performance problem
> hadn't had time to adequately research).
>
> The MISMATCH for this cursor was on LANGUAGE_MISMATCH.  I was like, "Hey,
> that's very interesting.  Why is that?"
>
> Turns out we have 2 separate (or more) client servers that go through the
> same code, and execute the same SQL.
>
> The primary connecting server doesn't set NLS_LANG or OS LANG variable at
> all.  And those sessions come in as WE8MSWIN1252.
>
> The other connecting servers have LANG=en_US.UTF-8
>
> In this particular case the SQL_ID was being parsed first by clients
> connecting with the en_US.UTF-8 variable set.
>
> Secondary connections to this node by clients with WE8MSWIN1252 are
> getting a child cursor with "LANGUAGE_MISMATCH".
>
> Normally this SQL_ID only ever has 2 plans and no child cursors but due to
> the time of day we were working and batch jobs running, we had multiple
> servers parsing this SQL_ID after purging it.  (nightly jobs)
>
> I *theorize *that UTF-8 plans are unsharable with clients who connect as
> WE8MSWIN1252 .... but SQLs that are *first *parsed by WE8MSWIN1252
> connected sessions *can be *re-used by clients that come in using UTF-8 .
>
> *Does that sound like a reasonable theory?*
>
> I'm currently waiting for for the last 3 remaining batch jobs to finish
> their work so that I can flush the SQL_ID again and I expect that will
> return us to our normal behavior of only seeing 1 of 2 plans every day.
>
> (Now, if this particular pot would hurry up and boil, I'd be able to tell
> for sure....)
>
> Chris
>
>
>
>
>


--
http://www.freelists.org/webpage/oracle-l
Received on Sun May 03 2020 - 00:11:07 CEST

Original text of this message