Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Join cardinality

Re: Join cardinality

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Fri, 7 Apr 2006 10:37:53 +0100
Message-ID: <bbadnZSYMOfur6vZRVnytQ@bt.com>

You have a number of problems to deal with, some of your own making, some relating to the way the optimizer works.

You comment in your second post:

> start_date, end_date and load_date
> are actually numeric equivalent of
> dates derived using some logic.

This will be a disaster for the optimizer, if the logic conceals the meaning of the data. The only logic that doesn't cause a problem is the logic that represents all your dates as the number of days since a fixed date.

One particular optimizer issue though, is the effect of transitive closure. Use dbms_xplan() to see the filter predicates on your data, and you will (almost certainly) see that the predicate

> where cal.country_id in ( '1', '2', '3' )

is cloned to apply to the customer table (i.e. it appears twice) and the resulting filter is then used to reduce the expected num_distinct on the join column (i.e. double-counting the effect) - hence your drastic reduction in cardinality.

One workaround to this is to put the join predicate in place as:

    and cust.country_id = cal.country_id + 0

Bear in mind that when the data sizes grow, that the optimizer may start to use a feature known as subquery pruning to determine exactly the statistics of the driving (calendar) table before optimising the query - so a test of two months may not be a fair test of behaviour of a full sized data set.

Finally - I would reconsider your partiitoning strategy. You have a LIST of country codes, but you are depending on range partitioning to separate the countries. A strategy of

    Range on the latest/history value,
    list subpartition
may be much more appropriate.

Potentially better still - you might consider using a partitioned view: Set up two separate tables with a UNION ALL view with 'instead of triggers'. The two tables could still be list partitioned by country.

Depending on circumstancs, you may also do a lot better with two local bitmapped indexes on the start and end dates, rather than one global concatenated index.

-- 
Regards

Jonathan Lewis
http://www.oracle.com/technology/community/oracle_ace/ace1.html#lewis

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html


"Pratap" <pratap_fin_at_rediffmail.com> wrote in message 
news:1144329007.026765.63560_at_e56g2000cwe.googlegroups.com...

> Oracle 9.2.0.5
>
> Table 1 - Calendar with following columns
> country_id
> load_date
> date_desc
>
> Table 2 - Customer dimension (SCD type 2)
>
> customer_id
> from_date
> to_date
> country_id
> latest_flag
>
> Customer table is partitioned on country_id and sub partitioned on
> latest_flag that is 0 and 1 (Latest and history)
>
> I fire queries like this -
>
> select *
> from customer cust,
> calendar cal
> where cal.country_id in ( '1', '2', '3' ) -- Literals
> and cal.date_desc = 'Last_Quarter'
> and cal.load_date between cust.from_date and cust.to_date
> --and cal.country_id = cust.country_id
>
> Now the problem is -
>
> 1. The cardinality calculated by Oracle is incorrect. It shows only 60
> odd rows coming from the customer table when in fact there could be
> 25,000. The access to customer table is done via a composite and
> compressed B-tree on from_date and to_date.
>
> 2. When I uncomment the country_id join, the cardinality reduces
> drastically which should not happen.
>
> There are around 30 distinct values for countries and the data
> distribution is not even.
>
> I gather statistics for granularity => 'ALL'. Histograms are gathered
> on country_id and all indexed columns.
>
> How can I ensure that cardinality is properly calculated by Oracle. As
> such the above query won't create a problem. But when I add 2-3 more
> tables then the effect of incorrect cardinality can be seen in
> sub-optimal execution plans.
>
> Regards,
> Pratap
>
Received on Fri Apr 07 2006 - 04:37:53 CDT

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US