RE: In what circumstances might optimizer not choose the lowest cost?

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Tue, 20 Jan 2015 14:56:51 +0000
Message-ID: <CE70217733273F49A8A162EE074F64D928269104_at_EXMBX01.thus.corp>


Not possible to give a definite answer based on the description, but it looks as if you might be in the area of this blog note:

http://jonathanlewis.wordpress.com/2010/09/07/cbo-surprise-3/

Possibly in combination of some aspectes of this one:

https://jonathanlewis.wordpress.com/2011/12/30/fbi-bug-2/

The thing to check, in particular, with the variation with the constraint/predicate is what the optimizer has done with transitive closure - what do the final predicates look like, and do they give you any clues about how the calculations may have changed.

Regards
Jonathan Lewis
http://jonathanlewis.wordpress.com
_at_jloracle



From: oracle-l-bounce_at_freelists.org [oracle-l-bounce_at_freelists.org] on behalf of Kim Berg Hansen [kibeha_at_gmail.com] Sent: 20 January 2015 13:20
To: oracle-l_at_freelists.org
Subject: In what circumstances might optimizer not choose the lowest cost?

Hi, fellows.

I had a weird little case today, which I'll try to describe simply:

Two tables - Tab1 with 4 million rows, Tab2 with 3/4 million rows. Tab1 has a function based index on an expression: "case dataset when 'DAT' then upperalphanum(eksterntvarenr) end" - upperalphanum is a function returning uppercase of input stripped of any whitespace and non-alphanumeric characters. The FBI contains about two hundred thousand of the 4 million rows of Tab1, for the rest the expression is NULL.

Query is a simple join between the two tables joining on a two-column key. There is a predicate on Tab1 on the FBI expression:

   "case dataset when 'DAT' then upperalphanum(eksterntvarenr) end BETWEEN upperalphanum(:bind1) and upperalphanum(:bind2)" And a filter predicate on two columns of Tab2. The access I want (and normally get) is index range scan of the FBI index on Tab1 and nested loop/index access of Tab2. (The whole purpose of the FBI is to have a small fast index for this and other similar queries.)

I have three versions of the query for testing:

Q1: Hinted to use FBI index access on Tab1.
Q2: Hinted to use an alternative normal index on Tab1 containing the columns of the FBI expression, where the expression then will be evaluated for all rows.
Q3: Unhinted (my normal query.)

Apart from hints, the three queries are identical.

Normally they get plans costed like this: Q1 hinted to FBI gets total Cost=26276.
Q2 hinted to normal index gets total Cost=40473. So normal index has a higher cost than FBI. Q3 unhinted picks the lower cost access plan and uses FBI with total Cost=26676.

Then I added a check constraint "check( dataset='DAT' )" on Tab2 on one of the two key columns used for the join. This changed the access plans for the queries - suddenly appeared (optimizer generated) a filter predicate dataset='DAT' on Tab1, as the optimizer know nows via the check constraint on Tab2 and the join between Tab1 and Tab2, that accessing any Tab1 rows with dataset NOT equal to 'DAT' would be folly, because they would be certain to be "thrown away" when we join to Tab1 on dataset column. ("Salted banana", as NoCoug Journal recently called it ;-)

When that filter predicate was added, my three test queries got new costs, of course: Q1 hinted to FBI gets total Cost=24374.
Q2 hinted to normal index gets total Cost=35493. So even with the new filter predicate reducing estimated cardinality (and cost) slightly, normal index is still higher cost than FBI.

BUT... Q3 unhinted picks the HIGHER cost access plan and uses normal index with total Cost=35493 ??

I can understand that my check constraint has a sideeffect of adding a filter predicate. I have also tested dropping the constraint again and instead added the same filter predicate manually to the queries - it gives the same result (so it is not specifically because there's a check constraint.)

What I canNOT understand is, that with the extra filter predicate in place, the optimizer picks the HIGHER costed of the two access plans?

So my question really is:

Are there known circumstances where the optimizer does NOT choose the lowest cost, even though same query with a hint CAN produce a plan with a lower cost?

Or is this "buggy" behaviour? (My version is 11.2.0.3.0 EE.)

Thanks in advance for any hints I can research ;-)

Regards

Kim Berg Hansen

http://dspsd.blogspot.com
kibeha_at_gmail.com<mailto:kibeha_at_gmail.com>
_at_kibeha

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Jan 20 2015 - 15:56:51 CET

Original text of this message