Re: CBO rows estimate way out in 2 table join - what to do?
"joel garry" <joel-garry_at_home.com> wrote in message
news:1193959828.744083.306450_at_i38g2000prf.googlegroups.com...
wrote:
> "stephen O'D" <stephen.odonn..._at_gmail.com> wrote in message
>
> > SQL> explain plan for
> > SELECT
> > --count(*)
> > cpd.customer_ref
> > FROM custproductdetails cpd, ACCOUNT a
> > WHERE cpd.account_num = a.account_num
> > AND cpd.customer_ref = a.customer_ref
> > AND a.currency_code = 'GBP'
> > > AND a.invoicing_co_id = 1;
>
>
> begin
> dbms_stats.gather_table_stats({owner},'custproductdetails',
> cascade=>true, method_opt=>'for all columns size 1');
> dbms_stats.gather_table_stats({owner},'account', cascade=>true,
> method_opt=>'for all columns size 1');
>
> dbms_stats.gather_table_stats({owner},'account',
> method_opt=>'for columns currency_code, invoicing_co_id size 254');
>
> end;
>
This seems a bit strange to me, given the cascade=>true: 10K rows, skewed, out of 70M? Wouldn't it be better to use a larger percent on the table, no cascade, and separately completely analyze the index (and any others)? Yes, it would take hours, but wouldn't it likely be worth it to get a better execution time? Am I missing an elephant in my blind flailing?
jg
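A rough sketch of the alternative being described here (a larger sample on the table itself with cascade turned off, then the indexes computed separately) might look like the following; the 25 percent sample and the index name placeholder are illustrative assumptions, not values from the thread:

begin
        -- larger sample on the table itself, no index gathering in this pass
        dbms_stats.gather_table_stats(
                ownname          => '{owner}',
                tabname          => 'account',
                estimate_percent => 25,        -- illustrative sample size
                cascade          => false,
                method_opt       => 'for all columns size 1'
        );

        -- then compute each index completely in its own (long-running) pass
        dbms_stats.gather_index_stats(
                ownname          => '{owner}',
                indname          => '{account_index}',   -- hypothetical index name
                estimate_percent => 100
        );
end;
/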
Order of quote:
Steve's SQL
My suggested stats collection
Joel's question
Joel,
The first step of the execution plan predicts 855 rows instead of nearly 10,000, and we are told that the data is skewed on the two filter columns.
So step one in finding a solution (since I'm not on site to run a few data sampling queries) is to generate stats that are typically appropriate for this sort of situation, viz:
    compute for small table and its indexes
    small sample for large table and its indexes
    histograms on known skewed columns
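Spelled out as dbms_stats calls, that recipe might look roughly like the following; which table is the small one, and the 1 percent sample size, are assumptions for illustration:

begin
        -- compute for the small table and its indexes
        dbms_stats.gather_table_stats('{owner}', '{small_table}',
                estimate_percent => 100, cascade => true,
                method_opt => 'for all columns size 1');

        -- small sample for the large table and its indexes
        dbms_stats.gather_table_stats('{owner}', '{large_table}',
                estimate_percent => 1, cascade => true,    -- 1% is illustrative
                method_opt => 'for all columns size 1');

        -- histograms on the known skewed filter columns
        dbms_stats.gather_table_stats('{owner}', '{large_table}',
                method_opt => 'for columns currency_code, invoicing_co_id size 254');
end;
/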
What I want to see is if the histograms will make the first table in the plan report something closer to 10,000 rows.
If the join columns between the two then have a uniform distribution, the join cardinality should be much better if the driving cardinality is nearly correct.
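A quick way to check that, once the new stats are in place, is to repeat the explain plan and read the Rows estimate for the first step; a sketch using the query from the thread:

explain plan for
select cpd.customer_ref
from   custproductdetails cpd, account a
where  cpd.account_num   = a.account_num
and    cpd.customer_ref  = a.customer_ref
and    a.currency_code   = 'GBP'
and    a.invoicing_co_id = 1;

select * from table(dbms_xplan.display);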
STEP 2: examine the data distribution of the join columns and consider the need to generate histograms at both ends of the join - but you don't generate histograms unless you really need to. As part of STEP 2, I would also suggest reading Alberto Dell'Era's terrific document about "joins over histograms"
http://www.adellera.it/investigations/join_over_histograms/index.html
I've also referenced it from my blog:
http://jonathanlewis.wordpress.com/2007/04/25/histograms-and-joins/
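As a starting point for the first part of STEP 2, a query along these lines gives a feel for one join column's distribution (sketched here for account.account_num; the same check applies at the custproductdetails end and for the customer_ref columns):

-- a max far above the average rows per value suggests skew worth a histogram
select  count(*)        as num_distinct_values,
        sum(cnt)        as num_rows,
        min(cnt)        as min_rows_per_value,
        max(cnt)        as max_rows_per_value,
        round(avg(cnt)) as avg_rows_per_value
from    (
        select  account_num, count(*) cnt
        from    account
        group by account_num
        );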
--
Regards

Jonathan Lewis
http://jonathanlewis.wordpress.com

Author: Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Received on Fri Nov 02 2007 - 02:54:27 CDT