Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Mailing Lists -> Oracle-L -> RE: CBO with Foreign Key

RE: CBO with Foreign Key

From: Wolfgang Breitling <breitliw_at_centrexcc.com>
Date: Fri, 17 Oct 2003 12:19:25 -0800
Message-ID: <F001.005D381A.20031017121925@fatcity.com>


Ah, but that cardinality underestimation has nothing to do with the join or the foreign key relationship, but solely with the other fallacy of the cbo - the predicate independence assumption. In your example, the predicates a and b are completely dependent; once you choose one, the other is determined, not open to choice anymore. The optimizer has no clue on how to estimate NDV(a,b) from the existing statistics which give it only NDV(a) and NDV(b). As I said before, in Oracle 9i you can use dynamic sampling at a level >= 5 to ask the cbo to refine the estimate through sampling at parse time.

At 12:43 PM 10/17/2003, you wrote:
>Wolfgang,
>
>Thanks for the response. The problem I am seeing is slightly different.
>(I'll try to post some more detailed data, when I have the time). It's time
>to take a deep breath and be a bit clearer in my description.
>
>The issue arises when the PK of the parent is made up of more than 1 field.
>For example:
>
> CREATE TABLE Parent (a varchar2(1), b number primary key (a,b))
> INSERT INTO Parent VALUES ('A',1)
> INSERT INTO Parent VALUES ('B',2)
> INSERT INTO Parent VALUES ('C',3)
>
>Now create a table Child with a FK references Parent (a,b). Assume Child has
>10 rows, each Parent PK showing up at least once.
>
>OK, now NDVp(a)=NDVc(a)=3; NDVp(b)=NDVc(b)=3
>
>join cardinality = CARDp * CARDc * 1/max(NDVp(a),NDVc(a)) *
>1/max(NDVp(b),NDVc(b)) =
>= 3 * 10 /(3*3) ~ 3
>
>The actual cardinality will be 10. This is because we actually should be
>using NDV(a,b) not NDV(a)*NDV(b).
>
>Hope this is clearer. Again, I'll try to post some actual data as soon as I
>can.
>
>Henry
>
>
>-----Original Message-----
>Wolfgang Breitling
>Sent: Thursday, October 16, 2003 10:04 PM
>To: Multiple recipients of list ORACLE-L
>
>
>That's the problem with answering without thinking the answer completely
>through. My example below was only looking at the relationship from the
>child table side. If you look at it from the parent table side and add
>predicates - probably the more frequently used scenario - the problem
>becomes clear:
>
>Let's use the example below and narrow the resultset down to a single
>parent via its PK. As far as the optimizer is concerned that means a
>selectivity of 0.01 (=1%) and the estimated join cardinality is
> 0.01 * 100 * 1000 * 1/max(100,10) = 10
>
>However, since the one selected parent has either 0 or 100 children, the
>estimate is off. In the majority of cases, the query will be after one of
>the parents with children and the cardinality of the join will be 100, the
>optimizer having understimated by a factor of 10, just as Henry said. This
>underestimation is not because the optimizer does not recognize the
>parent-child relationship but rather a consequence of the violation of the
>CBO's "Join Uniformity Assumption" which states "that a row from one table
>is equally likely to join with any row from the second table"
>
>In Oracle 9 you can set dynamic sampling to a value >= 5 to get the cbo to
>probe the join cardinality at parse time. There is no such remedy in 8.1.7.
>In the example scenario I would consider altering the parent table
>statistics to counterbalance the violation of the join uniformity
>assumption. Of course, in a complex web of relationships this becomes a
>difficult, if not impossible balancing act. Or use hints to guide the cbo.
>
>At 04:19 PM 10/16/2003, you wrote:
> >Do you have concrete numbers. It''s been a while that I did my tests for
> >the paper and I would have to dig out my old testcases, but I was left
> >with the impression that the join cardinality "formula" was derived from
> >parent-child relationship joins and that the cardinality estimates are OK
> >for foreign key relationship joins.
> >for example let's say you have a parent table with 100 (distinct) parents
> >=> NDVp = 100; a child table with 1000 children but only 10 distinct
> >parent keys => NDVc = 10. Each child must have a parent, therefore you
> >have 90 childless parents and 10 parents with, on average, 100 children
> >each (you can choose different numbers, it comes out to the same). If you
> >join parent and child on the foreign key you should get all 1000 child
> >rows together with their parent data. The cardinality estimate by the CBO
> >would be:
> >
> > join cardinality = CARDp * CARDc * 1/max(NDVp, NDVc) = 100 * 1000 *
> > 1/max(100,10) = 100 * 1000 / 100 = 1000 (= rows(child) as you correctly
> > observed)
> >
> >The problems I found - and they are shown with the examples in the paper -
> >come when you do not have a parent-child relationship, either you have
> >rows in both tables which do not have a match in the joined table, or if
> >you have a many-to-many relationship.
> >
> >But maybe I am missing something ( wouldn't be the first time).
> >
> >At 02:19 PM 10/16/2003, you wrote:
> >>Is there any way to get the CBO (8.1.7) to recognize parent/child
> >>relationships? This seems to be an extreme example of the Join
>Independence
> >>Assumption Fallacy (or I guess maybe the Predicate Independence
>Assumption)
> >>discussed by Wolfgang Breitling in "Fallacies of the Cost Based
>Optimizer".
> >>If I am joining a parent and child table (parent PK [a,b]), Oracle assumes
>a
> >>resulting cardinality of :
> >>rows(parent)*rows(child)/{max(NDV[parent.a],NDV[child.a])*max(NDV[parent.b
>],
> >>NDV[child.b])}.
> >>
> >>This is assuming independence of field a and b. However, since they make
>up
> >>the PK of the parent, the denomenator is actually rows(parent) so the
>actual
> >>cardinality is rows(child). This causes the CBO to dramatically
> >>underestimate the join cardinality. If this is part of a multi-table join
> >>the entire execution plan can be thrown way off.
> >>
> >>I would assume that since the FK constraint is part of the data
>dictionary,
> >>Oracle would find a way to use this information. I checked by running
> >>autotraces with and without the FK, but the plan remained unchanged.
> >>
> >>Any suggestions?
> >>
> >>Thanks.
> >>
> >>Henry
> >>--
> >>Author: Henry Poras
> >> INET: hporas_at_etal.uri.edu
> >
> >Wolfgang Breitling
> >Oracle7, 8, 8i, 9i OCP DBA
> >Centrex Consulting Corporation
> >http://www.centrexcc.com
> >
> >--
> >Please see the official ORACLE-L FAQ: http://www.orafaq.net
> >--
> >Author: Wolfgang Breitling
> > INET: breitliw_at_centrexcc.com
> >
> >Fat City Network Services -- 858-538-5051 http://www.fatcity.com
> >San Diego, California -- Mailing list and web hosting services
> >---------------------------------------------------------------------
> >To REMOVE yourself from this mailing list, send an E-Mail message
> >to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
> >the message BODY, include a line containing: UNSUB ORACLE-L
> >(or the name of mailing list you want to be removed from). You may
> >also send the HELP command for other information (like subscribing).
>
>Wolfgang Breitling
>Centrex Consulting Corporation
>http://www.centrexcc.com
>
>--
>Please see the official ORACLE-L FAQ: http://www.orafaq.net
>--
>Author: Wolfgang Breitling
> INET: breitliw_at_centrexcc.com
>
>Fat City Network Services -- 858-538-5051 http://www.fatcity.com
>San Diego, California -- Mailing list and web hosting services
>---------------------------------------------------------------------
>To REMOVE yourself from this mailing list, send an E-Mail message
>to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
>the message BODY, include a line containing: UNSUB ORACLE-L
>(or the name of mailing list you want to be removed from). You may
>also send the HELP command for other information (like subscribing).
>
>--
>Please see the official ORACLE-L FAQ: http://www.orafaq.net
>--
>Author: Henry Poras
> INET: hporas_at_etal.uri.edu
>
>Fat City Network Services -- 858-538-5051 http://www.fatcity.com
>San Diego, California -- Mailing list and web hosting services
>---------------------------------------------------------------------
>To REMOVE yourself from this mailing list, send an E-Mail message
>to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
>the message BODY, include a line containing: UNSUB ORACLE-L
>(or the name of mailing list you want to be removed from). You may
>also send the HELP command for other information (like subscribing).

Wolfgang Breitling
Oracle7, 8, 8i, 9i OCP DBA
Centrex Consulting Corporation
http://www.centrexcc.com

-- 
Please see the official ORACLE-L FAQ: http://www.orafaq.net
-- 
Author: Wolfgang Breitling
  INET: breitliw_at_centrexcc.com

Fat City Network Services    -- 858-538-5051 http://www.fatcity.com
San Diego, California        -- Mailing list and web hosting services
---------------------------------------------------------------------
To REMOVE yourself from this mailing list, send an E-Mail message
to: ListGuru_at_fatcity.com (note EXACT spelling of 'ListGuru') and in
the message BODY, include a line containing: UNSUB ORACLE-L
(or the name of mailing list you want to be removed from).  You may
also send the HELP command for other information (like subscribing).
Received on Fri Oct 17 2003 - 15:19:25 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US