Re: Tuning Self-referencing Inserts

From: Tim Gorman <tim.evdbt_at_gmail.com>
Date: Sun, 2 Dec 2018 10:16:38 -0800
Message-ID: <7ebfea08-8a1b-fcbb-d359-fcb1a2ac3232_at_gmail.com>



Ian,

I wrote a PL/SQL package called ASH_XPLAN <http://evdbt.com/download/presentation-real-time-sql-tuning-delivered-at-hotsos-symposium-2015-and-utoug-training-days-2015/> that combines DBMS_XPLAN with information from V$ACTIVE_SESSION_HISTORY to accumulate the ASH time spent in each step of the execution plan of a presently-running SQL statement. By running ASH_XPLAN in a separate SQL*Plus session repeatedly, I can see which way the wind blows in a very short time, without having to wait for the statement to complete.  Of course, please don't consider accumulated ASH time to be exactly equal to elapsed time, as explained by Graham, Uri, and JB <http://www.rmoug.org/wp-content/uploads/QEW_Presentations/2012/ASH%20Architecture%20and%20Advanced%20Usage.pdf>.
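
If anyone wants to roll their own along the same lines, the core of it is just counting ASH samples per plan line for a given SQL_ID, something like this sketch (the &sql_id substitution variable is a placeholder, and each sample only approximates one second of DB time):

  -- rough sketch: ASH samples (~1 sec of DB time each) per plan step
  select sql_plan_line_id,
         sql_plan_operation,
         count(*) as ash_samples,
         sum(case when session_state = 'ON CPU' then 1 else 0 end) as cpu_samples
  from   gv$active_session_history
  where  sql_id = '&sql_id'
  group  by sql_plan_line_id, sql_plan_operation
  order  by ash_samples desc;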

I'm aware that the DBMS_SQLTUNE and SQL Monitor functionality (thanks Tim Hall!)
<https://oracle-base.com/articles/11g/real-time-sql-monitoring-11gr1> and views like V$SQL_PLAN_STATISTICS do this as well, but I found it fun and educational to solve the problem myself.  I was lonely for a nice PL/SQL project that week.  :)

Either way, using ASH_XPLAN or DBMS_SQLTUNE, I strongly recommend displaying the actual elapsed time for each step of the plan in an actual execution to determine where the most time is being spent.
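
For example, either of the following shows per-step actuals for a statement that has run or is still running (the SQL Monitor report requires the Tuning Pack, the row-source statistics require STATISTICS_LEVEL=ALL or the gather_plan_statistics hint, and &sql_id is a placeholder):

  -- row-source statistics for the last execution of a cursor
  select * from table(dbms_xplan.display_cursor('&sql_id', null, 'ALLSTATS LAST'));

  -- real-time SQL Monitor report, text format
  select dbms_sqltune.report_sql_monitor(sql_id => '&sql_id', type => 'TEXT')
  from   dual;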

Nothing really matters in optimization except elapsed time; everything else is explanatory information.

Elapsed time (a.k.a. "life") is too short for guessing.

Hope this helps...

-Tim

On 12/1/18 22:17, MacGregor, Ian A. (Redacted sender ian for DMARC) wrote:
>
> The table is very small, about 64K. Here is an explain plan from a
> month ago.
>
> --------------------------------------------------------------------------------------------------------
> | Id  | Operation                            | Name              | Rows  | Bytes | Cost (%CPU)| Time     |
> --------------------------------------------------------------------------------------------------------
> |   0 | INSERT STATEMENT                     |                   |       |       |  2999 (100)|          |
> |   1 |  LOAD TABLE CONVENTIONAL             |                   |       |       |            |          |
> |   2 |   HASH UNIQUE                        |                   |     1 |   477 |  2999   (1)| 00:00:36 |
> |   3 |    NESTED LOOPS                      |                   |     1 |   477 |  2994   (1)| 00:00:36 |
> |   4 |     NESTED LOOPS                     |                   |   192 |   477 |  2994   (1)| 00:00:36 |
> |*  5 |      HASH JOIN                       |                   |     2 |   254 |     6   (0)| 00:00:01 |
> |   6 |       NESTED LOOPS                   |                   |     8 |   608 |     3   (0)| 00:00:01 |
> |*  7 |        TABLE ACCESS FULL             | PS_PSA_RULES_HDR  |     8 |   504 |     3   (0)| 00:00:01 |
> |*  8 |        INDEX UNIQUE SCAN             | PS_PSA_RULES_LVL  |     1 |    13 |     0   (0)|          |
> |*  9 |       TABLE ACCESS FULL              | PS_PSA_RULES_LN   |    32 |  1632 |     3   (0)| 00:00:01 |
> |* 10 |      INDEX RANGE SCAN                | PS_PSA_ACCTG_TA34 |    96 |       |  1473   (1)| 00:00:18 |
> |* 11 |       INDEX RANGE SCAN               | PS_PSA_ACCTDSTGL4 |     1 |    20 |     2   (0)| 00:00:01 |
> |* 12 |       INDEX RANGE SCAN               | PS_PSA_ACCTDSTGL4 |     1 |    14 |     2   (0)| 00:00:01 |
> |  13 |       SORT AGGREGATE                 |                   |     1 |    19 |            |          |
> |  14 |        TABLE ACCESS BY INDEX ROWID   | PS_PSA_ACCTDSTGL4 |     1 |    19 |     3   (0)| 00:00:01 |
> |* 15 |         INDEX RANGE SCAN             | PS_PSA_ACCTDSTGL4 |     1 |       |     2   (0)| 00:00:01 |
> |* 16 |     TABLE ACCESS BY INDEX ROWID      | PS_PSA_ACCTG_TA34 |     1 |   350 |  1494   (1)| 00:00:18 |
> --------------------------------------------------------------------------------------------------------
>
>
> It took just over 11,000 seconds to insert 25,000 rows. There were
> 1,509,287,736 buffer gets associated with the instance. There is
> something definitely wrong. This is a 2-node RAC system. FYI the
> insert is called more than once. It runs pretty well the first time,
> then gets dramatically worse. The above is the second-worst case from
> about a month ago. There doesn't seem to be anything wrong with the
> SQL.
>
>
> For the latest run, I decided to do periodic probes of
> v$session_wait. Each time I did, I saw the library cache pin wait,
> all with "waited short time". V$session did not show any blocking
> session, and neither did the query tailored to find blockers for this
> wait. I did a system state dump.
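>
> The probe was roughly of this shape (a sketch; the event list is simply what I was looking for):
>
>    select inst_id, sid, serial#, event, state, wait_time_micro,
>           blocking_instance, blocking_session
>    from   gv$session
>    where  event in ('library cache pin', 'library cache lock');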
>
>
> -------------------------------------------------------------------------------
> Oracle session identified by:
> {
>             instance: 1 (fsprd.fsprd1)
>                os id: 28554
>           process id: 176, oracle_at_erp-fprd-oracle01
>           session id: 406
>     session serial #: 251
> }
> is not in a wait:
> {
>            last wait: 120 min 20 sec ago
>             blocking: 0 sessions
>          current sql: INSERT INTO PS_PSA_ACCTDSTGL4 (PROCESS_INSTANCE,
>                       BUSINESS_UNIT_PC, PROJECT_ID, ACTIVITY_ID, RESOURCE_ID,
>                       LINE_SEQ_NBR, DEBIT_CREDIT, DST_USE, INTER_ORG_LEVEL,
>                       CONVERSION_RATE, ORG_TO_BOOK, COMBO_STATUS, PROJECT_FLAG,
>                       IU_ANCHOR_FLG, CONTRACT_NUM, CONTRACT_LINE_NUM,
>                       CONTRACT_PPD_SEQ, ACCT_PLAN_ID, EVENT_NUM, ACCOUNT, AL
>          short stack:
> ksedsts()+465<-ksdxfstk()+32<-ksdxcb()+1927<-sspuser()+112<-__sighandler()<-qerixtFetch()+547<-subex1()+259<-subsr3()+183<-evaopn3()+2533<-expepr()+576<-evaiand()+51<-expeal()+23<-qerixtFetch()+800<-qerjotRowProc()+359<-qerhjInnerProbeHashTable()+491<-kdstf11001010000km()+617<-kdsttgr()+103608<-qertbFetch()+2455<-rwsfcd()+103<-qerhjFetch()+621<-qerjotFetch()+2025<-qerjotFetch()+2025<-qerghFetch()+315<-rwsfcd()+103<-qerltcFetch()+1223<-insexe()+691<-opiexe()+5632<-kpoal8()+2380<-opiodr()+917<-ttcpip()+2183<-opitsk(
>         wait history:
>           1.       event: 'library cache pin'
>              time waited: 0.000158 sec
>                  wait id: 7433154      p1: 'handle address'=0x9c48940d0
>                                        p2: 'pin address'=0x5eec1bd08
>                                        p3: '100*mode+namespace'=0x41cdd00010002
>           * time between wait #1 and #2: 0.000046 sec
>           2.       event: 'library cache lock'
>              time waited: 0.000228 sec
>                  wait id: 7433153      p1: 'handle address'=0x9c48940d0
>                                        p2: 'lock address'=0x5eeef6eb0
>                                        p3: '100*mode+namespace'=0x41cdd00010002
>           * time between wait #2 and #3: 0.000516 sec
>           3.       event: 'library cache pin'
>              time waited: 0.000250 sec
>                  wait id: 7433152      p1: 'handle address'=0x9a2fdaf10
>                                        p2: 'pin address'=0x99a42e960
>                                        p3: '100*mode+namespace'=0x2160300010002
> }
>
> ==========================================================================================
>
>
> I thought the lock/pin was against an index, but the 100 indicates it
> is against the table. However, despite their showing up when I probed
> v$session_wait, they are not shown to be a significant wait. I was
> surprised to see them, though, because I associated such locks/pins
> with DDL, not DML.
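>
> To double-check what the handle address points at, a sketch like the
> following against x$kglob (run as SYS) should do it; the zero-padding
> of the raw address below is an assumption for a 64-bit platform:
>
>    select kglnaown, kglnaobj
>    from   x$kglob
>    where  kglhdadr = hextoraw('00000009C48940D0');  -- 0x9c48940d0 padded to 8 bytes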
>
>
> The last point is that this is a 2-node RAC system; the database was
> cloned to a single-instance database and the program ran much more
> efficiently. The longest time for an insert statement was less than
> an hour.
>
>
> Here are the RAC stats from the run done about a month ago. This
> covers a 10-hour period.
>
>
> The top RAC-associated wait was ranked 10th:
>
>
> Event                                      Waits       Total Wait   Wait     % DB   Wait
>                                                        Time (sec)   Avg(ms)  time   Class
> -----------------------------------------  ----------  -----------  -------  -----  ----------
> DB CPU                                                  38K                   96.6
> SQL*Net message from dblink                 3,038       497.6        164      1.3    Network
> db file sequential read                     1,233,886   474.4        0        1.2    User I/O
> control file sequential read                149,191     96.6         1        .2     System I/O
> direct path read                            71,377      89.5         1        .2     User I/O
> SQL*Net more data to dblink                 45,096      51.1         1        .1     Network
> log switch/archive                          21          43.8         2086     .1     Other
> recovery area: computing obsolete files     10          22.3         2234     .1     Other
> log file sync                               33,439      19.2         1        .0     Commit
> gc current block 2-way                      42,984      16           0        .0     Cluster
>
>
> The db link wait is not associated with this process.
>
>
>
>
>
> Ian A. MacGregor
> SLAC National Accelerator Laboratory
> Computing Division
> To offer the best IT service at the lab and be the IT provider of choice.
> ------------------------------------------------------------------------
> From: oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org>
> on behalf of Andy Sayer <andysayer_at_gmail.com>
> Sent: Friday, November 30, 2018 2:56:28 PM
> To: dmarc-noreply_at_freelists.org
> Cc: ORACLE-L
> Subject: Re: Tuning Self-referencing Inserts
> I think we need to take a few steps back.
>
> It’s hard to explain why such a statement would cause noticeable
> issues if the table is really not that large. A simple insert into
> <target> statement using a select from <target table> won’t take a
> silly amount of time: it’s just the time to read the table blocks
> using a full tablescan, then the time to update the indexes, which
> might be some overhead but no different to any other insert statement
> of the same volume.
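>
> In its simplest form that is just something like the following (table and column names are made up):
>
>    insert into target_table (col1, col2)
>    select col1, col2
>    from   target_table
>    where  some_flag = 'Y';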
>
> Perhaps this insert is within some loop? Perhaps the self-reference
> is written so that a silly execution plan is being used. It would be
> great if you could share the execution plan you are seeing and the
> SQL being executed. Is there PL/SQL involved?
>
> One other thing that springs to mind that could have an impact is DML
> error logging (it sounds like a situation where unique keys could give
> you errors?)
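>
> If error logging is in play, it would look something like this (names
> are illustrative), and every rejected row costs an extra insert into
> the error table:
>
>    -- one-off setup: creates ERR$_TARGET_TABLE by default
>    exec dbms_errlog.create_error_log('TARGET_TABLE')
>
>    insert into target_table (col1, col2)
>    select col1, col2 from source_table
>    log errors into err$_target_table ('batch 1') reject limit unlimited;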
>
> As for library cache pins, this is unusual for an insert/select
> statement. Things like running a PL/SQL procedure that one session is
> trying to recompile while another session is executing it will bring
> about this behaviour. Perhaps there is a function being called by
> your statement. Again, what exactly are you seeing? This is a
> blocking situation, so you should be able to investigate what the
> blocking session is doing by checking v$session.
>
> Hope this helps,
> Andy
>
> On Fri, 30 Nov 2018 at 18:32, MacGregor, Ian A.
> <dmarc-noreply_at_freelists.org <mailto:dmarc-noreply_at_freelists.org>> wrote:
>
> The problem is with a PeopleSoft statement which is based on a
> select statement that references the table being inserted into. The
> problem stems from having to read and maintain the indexes of the
> table, which is not that large. The buffer gets are extremely high.
>
>
> Is there a generic answer to this problem? Would it help to rebuild
> with a higher percent free, in an attempt to have fewer rows per
> block so as to lessen contention?
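>
> Something along these lines, perhaps (object names are illustrative;
> an index only honours PCTFREE at build time, so it would have to be a
> rebuild):
>
>    alter table my_target_table move pctfree 30;
>    -- the move invalidates existing indexes, so they need rebuilding anyway
>    alter index my_target_table_ix rebuild pctfree 30;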
>
>
> Also, for library cache pins, I don't understand how to read the
> p3raw value. I think the problem primarily lies with the maintenance
> of the unique index, but I am not 100% confident.
>
>
> Ian A. MacGregor
> SLAC National Accelerator Laboratory
> Computing Division
> To offer the best IT service at the lab and be the IT provider of
> choice.
>

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Dec 02 2018 - 19:16:38 CET
