Unexplained Plan Change

From: LB <moabrivers_at_gmail.com>
Date: Mon, 21 Jun 2010 10:53:36 -0600
Message-ID: <AANLkTinAk7hZ-kGvL5LmZVOmBdzcjlzMPFDbci42p7-F_at_mail.gmail.com>



I've got a query that ran this morning that for some reason decided to jump ship off its normal plan and use a different plan. This is a very straightforward single table query. When the query runs normally it uses the correct index and returns results very quickly. This table and its 2 indexes (one PK and one non-unique) have their respective statistics gathered nightly at 1AM. At 8AM this morning, Oracle decided that using the PK index was the best choice and so it gobbled up 22 seconds doing single block I/O. Now I understand that gathering statistics can cause a SQL plan to change but that plan change should remain effect all things being equal.  If it jump shipped because of a statistical gathering change then it should stay the course on the new plan. It doesn't. Explaining the query from several different clients all result in the correct explain plan that uses the non-unique index and not the primary key index. I am unable to reproduce any situation where the PK should be used for this query so I am a bit confused as to why Oracle decided it was the best choice at that moment in time. I also understand that bind peeking was an issue in 10.2.0.3 resolved in 10.2.0.4. Even explaining the plan with just the bind variables results in the correct plan again being chosen. The only recorded wait time was for db file sequential reads (8 seconds on the index datafile, 14 seconds on the table datafile).

The optimizer settings haven't changed and its impossible for the driving application to change its optimizer settings at the session level. The query uses bind variables (we are 10.2.0.4 64-bit on Linux) and I've reconstructed the query's variables using v$sql_bind_capture. As I'm not the code architect behind the generation of this query, I cannot change the query itself.

Here's an example of the query:

The PK index is on COL_A_I_PK and COL_F_PK The non-unique index is on COL_A_I_PK, COL_B_I, COL_C_I, COL_D_I, COL_E_I

You will notice the bulk of predicate filter is repetitive with OR statements on columns B, C, D, and E.

SELECT COUNT(*)
FROM
   (
   SELECT COL_A_I_PK,

      COL_B_I,
      COL_C_I,
      COL_D_I,
      COL_E_I,
      COL_F_PK,
    other_non_indexed_columns....

   FROM T
   WHERE COL_A_I_PK = :1
  AND other_non_indexed_column = :"SYS_B_0"
  AND ((COL_B_I = :2
  AND COL_C_I = :3
  AND COL_D_I = :4
  AND COL_E_I=:5)

   OR (COL_B_I = :6
  AND COL_C_I = :7
  AND COL_D_I = :8
  AND COL_E_I=:9)

   OR (COL_B_I = :10
  AND COL_C_I = :11
  AND COL_D_I = :12
  AND COL_E_I=:13)

   OR (COL_B_I = :14
  AND COL_C_I = :15
  AND COL_D_I = :16
  AND COL_E_I=:17)

   OR (COL_B_I = :18
  AND COL_C_I = :19
  AND COL_D_I = :20
  AND COL_E_I=:21)

   OR (COL_B_I = :22
  AND COL_C_I = :23
  AND COL_D_I = :24
  AND COL_E_I=:25)

   OR (COL_B_I = :26
  AND COL_C_I = :27
  AND COL_D_I = :28
  AND COL_E_I=:29)

   OR (COL_B_I = :30
  AND COL_C_I = :31
  AND COL_D_I = :32
  AND COL_E_I=:33)

   OR (COL_B_I = :34
  AND COL_C_I = :35
  AND COL_D_I = :36
  AND COL_E_I=:37)

   OR (COL_B_I = :38
  AND COL_C_I = :39
  AND COL_D_I = :40
  AND COL_E_I=:41)

   OR (COL_B_I = :42
  AND COL_C_I = :43
  AND COL_D_I = :44
  AND COL_E_I=:45) )

   )

Any thoughts on other places I might look to find out why it changed.  Running a 10053 event trace seems a little too late at this point because the correct plan is again being chosen. This is the first time this query ran today.

Thanks.

--
http://www.freelists.org/webpage/oracle-l
Received on Mon Jun 21 2010 - 11:53:36 CDT

Original text of this message