Re: Tuning Self-referencing Inserts

From: Hemant K Chitale <hemantkchitale_at_gmail.com>
Date: Mon, 3 Dec 2018 11:26:20 +0800
Message-ID: <CAMNBsZscRtbBu5wFv6C0TqL-sEU1QSLD+s0vta8=PbBERcsHgw_at_mail.gmail.com>



You have two "temporary" tables PS_PSA_ACCTG_TA34 and PS_PSA_ACCTDSTGL4 What we did at a previous project is modify the batch job to gather_stats immediately after such temporary tables are populated *before* they are used in the SELECT portion of an INSERT.

Hemant K Chitale

On Mon, Dec 3, 2018 at 11:08 AM MacGregor, Ian A. < dmarc-noreply_at_freelists.org> wrote:

> I'm afraid I posted some confusing information..
>
>
> When I tried to obtain the plan for the run of a few days ago, the sql_id
> was already gone from the cache. This is why I posted last month's,
> where the behavior was the same.
>
>
> Each time the Insert runs it is assigned a new sql_id, because PeopeSoft
> uses its own process_id as part the where clause. The same statement run
> by the same Oracle session may have a different Peoplesoft process_id
> associated with it. Running this particular PeopleSoft task the insert
> statement is run three diifferent times with three different sql_id's by
> the same Oracle session Each time the run time and the number of buffer
> gets increases. The plan, however, is the same each time the statement is
> run based on the plans hash value.
>
>
> The timing of 11,000 seconds was for the plan posted; i.e both were
> gathered from a month ago. The 1,509,287,736 buffer gets were
> associated with that insert. and not with the whole instance. This was
> the second worse case for that paerticular PeopleSoft run. The next time
> the insert was called it took over 1.8 billion gets to run the same
> insert statement with the same plan and inserting the same number of rows.
> The 1.5 billion are for the second insert execution and the 1.8 billion
> are for the third.
>
>
> I should not have just stuck with last month's information because there
> was an assumption I was trying to conflate the two runs. I apologize tart
> my post caused such confusion.
>
> =====================================================================
>
>
>
> PeopleSoft inserts and deletes rows into what it calls temporary tables
> which are regular table segments. The number of rows in these tables can
> obviously have an impact on the time it takes to run a statement. There
> are several times during the month in which the same statement using the
> same plan runs without a problem. The problem only occurs at month's end.
> The table being inserted into is a "temporary" table. These temporary
> tables my also be truncated. In examining the PeopleSoft trace I saw was
> no indication of truncation of this table although other tables were
> truncated. However some process either deletes the rows or truncates the
> table because the number of rows is currently 0.
>
>
> ========================================================================
>
>
> I posted the library cache pin query as a general question as I was
> surprised to see it all when querying v$session waits. I don't believe it
> is associated with a I could find no associated truncate statement. I was
> wondering if this could be RAC related.
>
>
>
> I hope this clears any confusion. I renew my request for help.
>
>
>
>
>
>
>
>
>
>
>
>
>
> Ian A. MacGregor
> SLAC National Accelerator Laboratory
> Computing Division
> To offer the best IT service at the lab and be the IT provider of choice.
> ------------------------------
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on
> behalf of Andy Sayer <andysayer_at_gmail.com>
> *Sent:* Sunday, December 2, 2018 3:15:00 PM
> *To:* Tim Gorman
> *Cc:* ORACLE-L
> *Subject:* Re: Tuning Self-referencing Inserts
>
>
> The table is very small about 64K.
>
>
> What's more important is how much of the table is being read and how
> often.
>
> Here is an explain plan from a month ago.
>
>
> An explain plan from a month ago could be very misleading, you seem to
> have access to the relevant privileges so you should be looking at the real
> execution plan in memory using
>
> select * from table(dbms_xplan.display_cursor(sql_id=><the sql_id>));
>
> The actual explain plan tells us the query is more complicated than a
> simple select * from table.
>
> This should be taken with a pinch of salt as it's an explain plan (so
> unreliable) from a month ago (so the stats would have changed)
> We can see that most of the cost in the query is in the index range scan
> on line 10, it's not often you see an index range scan cost over 1,000 and
> it usually suggests you are using a filter with high index selectivity
> (reading lots of the index) - this is quite a frequent mistake I see, just
> because an index is being used, it doesn't mean it's being used well. The
> predicates section of the plan will tell you exactly what's going on, you
> should always include it (and the notes) when sharing the explain/execution
> plan.
> We can see that there are a few scalar subqueries being used, this tends
> to be an indicator of rushed SQL that can probably be rewritten to allow
> the CBO full power to execute the statement effectively.
> The more rows you have that drive these scalar subqueries, the more times
> they get run, the more work that needs to be done.
>
> It took just over 11,000 seconds to insert 25,000 rows
>
>
> Yeah, that is slow! It's certainly 10,000 times slower than just inserting
> 25,000 rows.
>
> There were 1,509,2 87,736 buffer gets associated with the instance.
>
>
> That as a statement is not helpful, we care about this statement not the
> instance as a whole (with no information of how long this was over)
>
> There is something definitely wrong. This is a 2-node RAC system.
>
>
> The likely truth is that the SQL is bad and it has a hugely inefficient
> execution plan as a result.
>
> FYI the insert is called more than once. It runs pretty well the first
> time then gets dramatically worse.
>
>
> That's no surprise, those scalar subqueries are reading from that table,
> for every row in that table that get's found from one filter you seem to
> read a lot of another index.
>
> The above is the second worse case from about a month ago. There doesn't
> seem to be anything wrong with SQL
>
>
> I would strongly disagree. Yes, you can probably make the statement more
> efficient with more appropriate indexes, but rewriting your scalar
> subqueries to better handle the volume would do wonders.
>
> For the latest run, I decided to do periodic probes of v$session wait.
>
>
> It sounds like v$session_event is what you're after, it stores aggregated
> information from all waits that the session experienced in its lifetime.
>
> Each time I did I saw the library cache pin wait all with "waited short
> time". V$session did not show any blocking session. Neither did the
> query tailored to find blockers for this wait. I did a system state dump
> ...
> is not in a wait:
> ...
>
> Actually we can see that these waits were experienced a long time ago,
> since then the session has just been on CPU (which is probably due to the
> loopy reads of a lot of index which is highly cached).
>
> ...Here are the RAC stats from the run done about a month ago. This
> covers a 10 hour period
> ...
>
>
>
> When you are looking at things at the instance level, it is very difficult
> to see the wood from the trees. Scoping is everything when it comes to
> performance investigations - just like you wouldn't do an AWR report over
> 24 hours when you care about a process that only runs at midday for 5
> minutes, you shouldn't use a full instance AWR report for something running
> in one session.
> Instead, you can manually query v$active_session_history with filters to
> properly drill down. You can even join from v$active_session_history to
> v$sql_plan to see exactly what step of the plan was being executed most of
> the time.
>
> ...
> The db link wait is not associated with this process.
>
> That is the problem with using an unscoped report, there's plenty of
> irrelevant details that you'd need to ignore. Although, the top event is
> CPU by a long way.
>
> In summary:
> Rewrite the query to not use so many scalar subqueries
> Work out why those index range scans are so highly selective
>
> Hope this helps,
> Andy
>
>
> On Sun, 2 Dec 2018 at 18:17, Tim Gorman <tim.evdbt_at_gmail.com> wrote:
>
> Ian,
>
>
> I wrote a PL/SQL package called ASH_XPLAN
> <http://evdbt.com/download/presentation-real-time-sql-tuning-delivered-at-hotsos-symposium-2015-and-utoug-training-days-2015/>
> that combines DBMS_XPLAN with information from V$ACTIVE_SESSION_HISTORY to
> accumulate the ASH time spent in each step of the execution plan of a
> presently-running SQL statement. By running ASH_XPLAN in a separate
> SQL*Plus session repeatedly, I can see which way the wind blows in a very
> short time, without having to wait for the statement to complete. Of
> course, please don't consider accumulated ASH time to be exactly equal to
> elapsed time, as explained by Graham, Uri, and JB
> <http://www.rmoug.org/wp-content/uploads/QEW_Presentations/2012/ASH%20Architecture%20and%20Advanced%20Usage.pdf>
> .
>
>
> I'm aware that the DBMS_SQLTUNE and SQL Monitor functionality (thanks Tim
> Hall!)
> <https://oracle-base.com/articles/11g/real-time-sql-monitoring-11gr1> in
> views like V$SQL_PLAN_STATISTICS does this as well, but I found it fun and
> educational to solve the problem myself. I was lonely for a nice PL/SQL
> project that week. :)
>
>
> Either way, using ASH_XPLAN or DBMS_SQLTUNE, I strongly recommend
> displaying the actual elapsed time for each step of the plan in an actual
> execution to determine where the most time is being spent.
>
>
> Nothing really matters in optimization except elapsed time, everything
> else is explanatory information.
>
>
> Elapsed time (a.k.a. "life") is too short for guessing.
>
>
> Hope this helps...
>
>
> -Tim
>
>
>
>
> On 12/1/18 22:17, MacGregor, Ian A. (Redacted sender ian for DMARC) wrote:
>
> The table is very small about 64K. Here is an explain plan from a
> month ago.
>
>
> -------------------------------------------------------------------------------------------------------
>
>
> | Id | Operation | Name | Rows |
> Bytes | Cost (%CPU)| Time |
>
>
> -------------------------------------------------------------------------------------------------------
>
>
> | 0 | INSERT STATEMENT | | |
> | 2999 (100)| |
>
> | 1 | LOAD TABLE CONVENTIONAL | | |
> | | |
>
> | 2 | HASH UNIQUE | | 1 |
> 477 | 2999 (1)| 00:00:36 |
>
> | 3 | NESTED LOOPS | | 1 |
> 477 | 2994 (1)| 00:00:36 |
>
> | 4 | NESTED LOOPS | | 192 |
> 477 | 2994 (1)| 00:00:36 |
>
> |* 5 | HASH JOIN | | 2 |
> 254 | 6 (0)| 00:00:01 |
>
> | 6 | NESTED LOOPS | | 8 |
> 608 | 3 (0)| 00:00:01 |
>
> |* 7 | TABLE ACCESS FULL | PS_PSA_RULES_HDR | 8 |
> 504 | 3 (0)| 00:00:01 |
>
> |* 8 | INDEX UNIQUE SCAN | PS_PSA_RULES_LVL | 1 |
> 13 | 0 (0)| |
>
> |* 9 | TABLE ACCESS FULL | PS_PSA_RULES_LN | 32 |
> 1632 | 3 (0)| 00:00:01 |
>
>
> |* 10 | INDEX RANGE SCAN | PS_PSA_ACCTG_TA34 | 96 |
> | 1473 (1)| 00:00:18 |
>
> |* 11 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 |
> 20 | 2 (0)| 00:00:01 |
>
> |* 12 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 |
> 14 | 2 (0)| 00:00:01 |
>
> | 13 | SORT AGGREGATE | | 1 |
> 19 | | |
>
> | 14 | TABLE ACCESS BY INDEX ROWID| PS_PSA_ACCTDSTGL4 | 1 |
> 19 | 3 (0)| 00:00:01 |
>
> |* 15 | INDEX RANGE SCAN | PS_PSA_ACCTDSTGL4 | 1 |
> | 2 (0)| 00:00:01 |
>
> |* 16 | TABLE ACCESS BY INDEX ROWID | PS_PSA_ACCTG_TA34 | 1 |
> 350 | 1494 (1)| 00:00:18 |
>
> -------------------------------------------------------------------------------------------------------
>
>
> It took just over 11,000 seconds to insert 25,000 rows There were 1,509,2
> 87,736 buffer gets associated with the instance. There is something
> definitely wrong. This is a 2-node RAC system. FYI the
> insert is called more than once. It runs pretty well the first time then
> gets dramatically worse. The above is the second worse case from about a
> month ago. There doesn't seem to be anything wrong with SQL
>
>
> For the latest run, I decided to do periodic probes of v$session wait.
> Each time I did I saw the library cache pin wait all with "waited short
> time". V$session did not show any blocking session. Neither did
> the query tailored to find blockers for this wait. I did a system state
> dump
>
>
>
> -------------------------------------------------------------------------------
>
> Oracle session identified by:
>
> {
>
> instance: 1 (fsprd.fsprd1)
>
> os id: 28554
>
> process id: 176, oracle_at_erp-fprd-oracle01
>
> session id: 406
>
> session serial #: 251
>
> }
>
> is not in a wait:
>
> {
>
> last wait: 120 min 20 sec ago
>
> blocking: 0 sessions
>
> current sql: INSERT INTO PS_PSA_ACCTDSTGL4 (PROCESS_INSTANCE,
> BUSINESS_UNIT_PC, PROJECT_ID, ACTIVITY_ID, RESOURCE_ID, LINE_SEQ_NBR,
> DEBIT_CREDIT, DST_USE, INTER_ORG_LEVEL, CONVERSION_RATE, ORG_TO_BOOK,
> COMBO_STATUS, PROJECT_FLAG, IU_ANCHOR_FLG, CONTRACT_NUM, CONTRACT_LINE_NUM,
> CONTRACT_PPD_SEQ, ACCT_PLAN_ID, EVENT_NUM, ACCOUNT, AL
>
> short stack:
> ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-qerixtFetch()+547<-subex1()+259<-subsr3()+183<-evaopn3()+2533<-expepr()+576<-evaiand()+51<-expeal()+23<-qerixtFetch()+800<-qerjotRowProc()+359<-qerhjInnerProbeHashTable()+491<-kdstf11001010000km()+617<-kdsttgr()+103608<-qertbFetch()+2455<-rwsfcd()+103<-qerhjFetch()+621<-qerjotFetch()+2025<-qerjotFetch()+2025<-qerghFetch()+315<-rwsfcd()+103<-qerltcFetch()+1223<-insexe()+691<-opiexe()+5632<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk(
>
> wait history:
>
> 1. event: 'library cache pin'
>
> time waited: 0.000158 sec
>
> wait id: 7433154 p1: 'handle
> address'=0x9c48940d0
>
> p2: 'pin address'=0x5eec1bd08
>
> p3:
> '100*mode+namespace'=0x41cdd00010002
>
> * time between wait #1 and #2: 0.000046 sec
>
> 2. event: 'library cache lock'
>
> time waited: 0.000228 sec
>
> wait id: 7433153 p1: 'handle
> address'=0x9c48940d0
>
> p2: 'lock
> address'=0x5eeef6eb0
>
> p3:
> '100*mode+namespace'=0x41cdd00010002
>
> * time between wait #2 and #3: 0.000516 sec
>
> 3. event: 'library cache pin'
>
> time waited: 0.000250 sec
>
> wait id: 7433152 p1: 'handle
> address'=0x9a2fdaf10
>
> p2: 'pin address'=0x99a42e960
>
> p3:
> '100*mode+namespace'=0x2160300010002
>
> }
>
>
>
>
> ==========================================================================================
>
>
> I thought the lock/pin was against an index, but the 100 indicates it is
> against the table. However despite their showing up when I probed
> v$session_wait they are not shown to be a significant wait. I was
> surprised to see them however, because I associated such locks/pins with
> DDL not DML.
>
>
> The last point is that this is a 2 node RAC system, the database was
> cloned to a single instance database and the program ran much more
> efficiently. The longest time for an insert statement was less than an
> hour.
>
>
> Here are the RAC stats from the run done about a month ago. This covers
> a 10 hour period
>
>
> The top RAC associated wait was ranked 10th
>
>
>
> - *• *
>
> *Event*
>
> *Waits*
>
> *Total Wait Time (sec)*
>
> *Wait Avg(ms)*
>
> *% DB time*
>
> *Wait Class*
>
> DB CPU
>
>
>
> 38K
>
>
>
> 96.6
>
>
>
> SQL*Net message from dblink
>
> 3,038
>
> 497.6
>
> 164
>
> 1.3
>
> Network
>
> db file sequential read
>
> 1,233,886
>
> 474.4
>
> 0
>
> 1.2
>
> User I/O
>
> control file sequential read
>
> 149,191
>
> 96.6
>
> 1
>
> .2
>
> System I/O
>
> direct path read
>
> 71,377
>
> 89.5
>
> 1
>
> .2
>
> User I/O
>
> SQL*Net more data to dblink
>
> 45,096
>
> 51.1
>
> 1
>
> .1
>
> Network
>
> log switch/archive
>
> 21
>
> 43.8
>
> 2086
>
> .1
>
> Other
>
> recovery area: computing obsolete files
>
> 10
>
> 22.3
>
> 2234
>
> .1
>
> Other
>
> log file sync
>
> 33,439
>
> 19.2
>
> 1
>
> .0
>
> Commit
>
> gc current block 2-way
>
> 42,984
>
> 16
>
> 0
>
> .0
>
> Cluster
>
>
> The db link wait is not associated with this process.
>
>
>
>
>
> Ian A. MacGregor
> SLAC National Accelerator Laboratory
> Computing Division
> To offer the best IT service at the lab and be the IT provider of choice.
> ------------------------------
> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>
> <oracle-l-bounce_at_freelists.org> on behalf of Andy Sayer
> <andysayer_at_gmail.com> <andysayer_at_gmail.com>
> *Sent:* Friday, November 30, 2018 2:56:28 PM
> *To:* dmarc-noreply_at_freelists.org
> *Cc:* ORACLE-L
> *Subject:* Re: Tuning Self-referencing Inserts
>
> I think we need to take a few steps back.
>
> It’s hard to explain why such a statement would cause noticeable issues if
> the table is really not that large. A simple insert into <target> statement
> using a select from <target table>, won’t take a silly amount of time -
> it’s just the time to read the table blocks using a full tablescan then the
> time to update the indexes which might be some overhead but no different to
> any other insert statement of the same volume.
>
> Perhaps this this insert is within some loop? Perhaps the self-reference
> is written so that a silly execution plan is being used. It would be great
> if you could share the execution plan you are seeing and the SQL being
> executed. Is there PL/SQL involved?
>
> One other thing that springs to mind that could have an impact is DML
> error logging (it sounds like a situation where unique keys could give you
> errors?)
>
> As for library cache pins, this is unusual for an insert/select statement.
> Things like running a PL/SQL procedure which another session is trying to
> recompile while another session is executing it will bring around this
> behaviour. Perhaps there is a function being called by your statement.
> Again, what exactly are you seeing? This is a block so you should be able
> to investigate what the blocking session is doing by checking v$session.
>
> Hope this helps,
> Andy
>
> On Fri, 30 Nov 2018 at 18:32, MacGregor, Ian A. <
> dmarc-noreply_at_freelists.org> wrote:
>
> The problem is with a PeopleSoft statement which is based on a select
> statement which references the table being inserted. The problem stems
> from having to read and build the indexes of the table which is not that
> large. The buffer gets are extremely high.
>
>
> Is there generic answer to this problem. Would it help to rebuild with
> a higher percent free in an attempt to have few rows per block so as to
> lessen contention.
>
>
> Also for library cache pins I don't understand how to reads the p3raw
> value. I think the problem primarily lies with the maintenance m of the
> unique index bit I am. not 100% confident.
>
>
> Ian A. MacGregor
> SLAC National Accelerator Laboratory
> Computing Division
> To offer the best IT service at the lab and be the IT provider of choice.
>
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Dec 03 2018 - 04:26:20 CET

Original text of this message