SQL Performance Problem between 2 Databases WITH FIX included for this case

From: Taylor, Chris David <ChrisDavid.Taylor_at_ingrambarge.com>
Date: Mon, 16 Jan 2012 16:04:14 -0600
Message-ID: <C5533BD628A9524496D63801704AE56D6ADB9795D1_at_SPOBMEXC14.adprod.directory>



I ran into an issue today where I have 2 databases running the same query.

Both databases are **nearly identical in data** and **are identical in parameters** (except for string specific parameters such as paths and database names etc).

One database is only 1 month newer than the other database.

After banging my head all day on verifying the plans were the same, trying different hints etc, I couldn't not figure out WHY it was slow in D2. I tried different stats...everything. I even tried pulling crazy hints out and trying things.

Then, I got smart. (And I feel like an idiot for taking so long to get around to this)

D1 is not used much right now. The projects in it are on the back burner. D2, however, is very much in use by many developers and running lots and lots of queries.

Can you guess the problem?

It was in my 10046 trace files the whole time - and it's really simple to see it. Though it took me another minute or so to understand WHY.

Below are the stats - skip to the end for the resolution.

(I apologize for the ugly formatting - is there *any* way to do this better?)

STATSs of query in question:

Query in D1 database:



call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.09       0.11          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      4.70      10.09      33331      28943          0           0

------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 4.79 10.21 33331 28943 0 0

Query in D2 database:



call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1      0.00       0.00          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1     15.37      60.07     468652      29060        114           0

------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 3 15.37 60.07 468652 29060 114 0

Query Row Counts in D1 Database:



Rows Row Source Operation
-------  ---------------------------------------------------
      0  FILTER  (cr=28943 pr=33331 pw=7112 time=10098688 us)
   1339 NESTED LOOPS OUTER (cr=28943 pr=33331 pw=7112 time=7530868 us)    1339 NESTED LOOPS OUTER (cr=26242 pr=33331 pw=7112 time=7501334 us)
   1339     NESTED LOOPS OUTER (cr=23558 pr=32764 pw=7112 time=6256312 us)
   1339      NESTED LOOPS OUTER (cr=20872 pr=32764 pw=7112 time=6171926 us)
   1339       HASH JOIN  (cr=18190 pr=32004 pw=7112 time=4135902 us)
 518850        INDEX FAST FULL SCAN PR_REVENUE_ACTIVITY_IDX03 (cr=9095 pr=0 pw=0 time=1037890 us)(object id 2503724)
 518850        INDEX FAST FULL SCAN PR_REVENUE_ACTIVITY_IDX03 (cr=9095 pr=0 pw=0 time=1037817 us)(object id 2503724)
   1339       INDEX RANGE SCAN PR_INVOICE_LINE_IDX03 (cr=2682 pr=760 pw=0 time=658383 us)(object id 2910332)
   1339      INDEX RANGE SCAN PR_INVOICE_HEADER_IDX02 (cr=2686 pr=0 pw=0 time=35403 us)(object id 2834790)
   1339     INDEX RANGE SCAN PR_INVOICE_LINE_IDX03 (cr=2684 pr=567 pw=0 time=506695 us)(object id 2910332)
   1339 INDEX RANGE SCAN PR_INVOICE_HEADER_IDX02 (cr=2701 pr=0 pw=0 time=30211 us)(object id 2834790)

Query Row Counts in D2 Database:



Rows Row Source Operation
-------  ---------------------------------------------------
      0  FILTER  (cr=29060 pr=468652 pw=7210 time=60072675 us)
   1344 NESTED LOOPS OUTER (cr=29060 pr=468652 pw=7210 time=67595724 us)    1344 NESTED LOOPS OUTER (cr=26349 pr=468652 pw=7210 time=67559437 us)
   1344     NESTED LOOPS OUTER (cr=23655 pr=468652 pw=7210 time=67529870 us)
   1344      NESTED LOOPS OUTER (cr=20959 pr=468652 pw=7210 time=67494910 us)
   1344       HASH JOIN  (cr=18267 pr=468652 pw=7210 time=67439729 us)
 529977        INDEX FAST FULL SCAN PR_REVENUE_ACTIVITY_IDX03 (cr=9095 pr=0 pw=0 time=1060226 us)(object id 2503724)
 529977        INDEX FAST FULL SCAN PR_REVENUE_ACTIVITY_IDX03 (cr=9095 pr=0 pw=0 time=530105 us)(object id 2503724)
   1344       INDEX RANGE SCAN PR_INVOICE_LINE_IDX03 (cr=2692 pr=0 pw=0 time=36433 us)(object id 3057250)
   1344      INDEX RANGE SCAN PR_INVOICE_HEADER_IDX02 (cr=2696 pr=0 pw=0 time=23656 us)(object id 2834790)
   1344     INDEX RANGE SCAN PR_INVOICE_LINE_IDX03 (cr=2694 pr=0 pw=0 time=19415 us)(object id 3057250)
   1344 INDEX RANGE SCAN PR_INVOICE_HEADER_IDX02 (cr=2711 pr=0 pw=0 time=23441 us)(object id 2834790)

=HERE= is the key component of the problem - remember, D2 is heavily used by many developers at the moment:

D1:
Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       1        0.00          0.00
  direct path write temp                       1016        0.09          0.44
  latch free                                      1        0.00          0.00
  direct path read temp                        4572        0.03          3.79 <------------------ LOOK
  db file sequential read                       238        0.01          0.55
  db file scattered read                        229        0.01          0.50
  SQL*Net message from client                     1        4.74          4.74
*******************************************************************************

D2:
Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       1        0.00          0.00
  direct path write temp                       1030        0.08          0.18
  db file sequential read                         2        0.00          0.00
  local write wait                              171        0.01          0.25
  direct path read temp                       66950        0.19         39.85  <-----------------------LOOK
  SQL*Net message from client                     1       32.32         32.32
*******************************************************************************

I'm such a dumb*** sometimes.

I was out of (or low in) PGA memory in D2 due to the heavy usage by the development staff.

Bump up PGA_AGGREGATE_TARGET and performance goes back to sub 3 secs for the query. Now D1 and D2 perform exact.

This is WHY a 10046 trace should be FIRST step in examining a performance issue so you can *know* what you are waiting on before even thinking about 10053 traces. (In my honest opinion anyway).

Chris Taylor
Sr. Oracle DBA
Ingram Barge Company
Nashville, TN 37205

"Quality is never an accident; it is always the result of intelligent effort." -- John Ruskin (English Writer 1819-1900)

CONFIDENTIALITY NOTICE: This e-mail and any attachments are confidential and may also be privileged. If you are not the named recipient, please notify the sender immediately and delete the contents of this message without disclosing the contents to anyone, using them for any purpose, or storing or copying the information on any medium.

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jan 16 2012 - 16:04:14 CST

Original text of this message