Re: Odd join selectivity

From: Greg Rahn <>
Date: Thu, 6 Mar 2008 21:07:42 -0800
Message-ID: <>

On 3/6/08, Charles Schultz <> wrote:
> It is possible I over-simplified my issue. The one query I posted
> originally is a small part of a much larger query, but the cardinality
> starts to diverge with this one query. They key word in your statement is
> "probably"; when run by itself, the query performs quite quickly (roughly
> one second), even though the cardinality is off "a little". However,
> experience has taught me that queries usually run fastest when the
> cardinality is spot on.

So how much faster do you want to make a 1 second query? Seems like diminishing returns to me. Aren't there other statements that could potentially yield more significant performance gains?

The optimizer's cardinality estimate *does not* have to be spot-on. It needs to be close enough to give you the desired plan. In this case, is there a better performing plan? Have you tried running the query through DBMS_SQLTUNE? Have you experimented with hints?

Another strange artifact of the CBO, I have noticed, is that a cardinality
> that differs merely by 1 can be the difference between a table scan or a
> index access, a nested loop join or a hash. There are a ton of variables
> involved, as I am sure everyone here knows. *grin*

I don't know if I would call this strange. Algorithms may change behavior at some point, right? Having 11 items does technically disqualify you for the 10 items or less check out lane, correct? The difference is only 1 item.

And you are right, this is not normally something that a support analyst
> could (or even should?) field. I am grateful that he has taken up the task
> with COE. And I think your whole point, Greg, is "Why does it matter?"
> Please correct me if I am wrong. It matters to me because I am curious, and
> I want to know. That is the nuts and bolts of it. Perhaps the query would
> run just as fast if the cardinality were equal, or off by one (or two or
> three...). At this point of the game, I just want to know how the difference
> came about in an effort to understand a little better how the CBO works.
> Which helps in the long run to figure out if a query can run faster or not.

While I commend you for you desire to learn and understand, just because you want to know, doesn't necessarily mean you should get an answer. Specifically when it comes to algorithms. I've personally worked on numerous customer issues that were Optimizer related. Some of them were resolved by enhancing the Optimizer calculations. The deliverable to the customer does not take the form of a explanation of a calculation. This is considered intellectual property. It's along the same lines as Google's Page Rank Algorithm. You may want to know why a site is higher in the ranks than another, but there is no obligation for Google to explain why. This is Google's competitive edge. Same with Oracle.

Your comment about "reverse engineering" is quite interesting as well. I do
> not mean to belabor this thread, but it always seems like a game with
> Oracle; they tell you a lot of stuff and leave a whole of stuff unsaid (for
> one reason or another). It is often a challenging, fun game, sometimes
> tedious, and sometimes downright frustrating when you hit a bug or
> "application design" issue. I look forward to more oracle wikis and blogs
> that help me reverse engineer Oracle. *grin*

There really is no game. It's just a matter of what is public, documented information and what is not. If you hit a bug or design issue, Oracle Support will explain to the specific level that is warranted, but this does not require the communication of algorithms or calculations.

I would be cautious with what you read on the Internet (not that I am discouraging research). If it's not official, there is no guarantee it is accurate. There is a old Russian proverb: Доверя́й, но проверя́й (Doveryay, no proveryay), Trust but verify.

While Jonathan Lewis' book is an excellent resource, it was written between September 2003 and June 2005. I'm sure he would comment that he has found numerous changes in the Optimizer's behavior (hopefully for the better) between each of the 9.2 patches, 10.1 and its patches, 10.2 and its patches, and 11.1. The Optimizer algorithms are constantly evolving and may times getting more sophisticated, which is hopefully leading to better plans and faster query execution.

A final word on performance. Performance is goal oriented. There may be a better, faster execution plan out there, but it may not matter because performance is good enough. Chances are there are faster CPUs out there then the ones you are using. Does it matter?


Greg Rahn
Received on Thu Mar 06 2008 - 23:07:42 CST

Original text of this message