Re: Join cardinality

From: Jonathan Lewis <jonathan_at_jlcomp.demon.co.uk>
Date: Fri, 7 Apr 2006 15:10:47 +0100
Message-ID: <QOadnTVM7_L076vZnZ2dnUVZ8qOdnZ2d@bt.com>

"Pratap" <pratap_fin_at_rediffmail.com> wrote in message news:1144416358.498763.270570_at_i40g2000cwc.googlegroups.com...
>
>> The only logic that doesn't cause a problem
>> is the logic that represents all your dates as the
>> number of days since a fixed date.
>
> Yes the logic is like that. Only difference is that the start point is
> different for each country.
>

So you are NOT using the logic I described, I said:

ALL your dates, and A fixed date.

If your 'different start points for each country' are "very close" to each other your implementation may not cause many problems. "very close" in this context means that the number of days between any pair of start dates should be no more than about 10% of the smallest range of dates for a single country.

>
>> One workaround to this is to put the join predicate
>> in place as:
>> and cust.country_id = cal.country_id + 0
>
> I will try that
>
>> Potentially better still - you might consider using a
>> partitioned view: Set up two separate tables with a
>> UNION ALL view with 'instead of triggers'. The two
>> tables could still be list partitioned by country.
>>
>
> Isn't it the same as sub-partitioned tables (partition by country and
> subpartition by latest flag or the reverse way) with local indexes?

If it were the same, I would not have suggested that it might be better. Note particularly the reference to LIST partitioning - you have a LIST of countries, it may help the optimizer if you use structures that make this very clear. Your previous statement was that you were ranging on country.

> Actually there are reports that query the latest as well as the history
> data. That is the reason I have stuck to global indexes on one table
> only. Another reason for not using the partition views is that I have
> many dimensions (SCD type 2) exactly like customer table that are
> joined together. There can be as many as 4-5 such tables in a single
> query. I observed merge joins between the partitioned views in the
> (limited) test cases I executed and so shelved the idea.
>

You mean you are partitioning your dimensions on a 'latest/history' column ? If so, then joining multiple PVs together could be a little flakey.

Merge joins, though, suggest relatively small datasets. So you need to consider whether your testing is an adequate model of your final production system, or whether your final production system is so small that too fine a granularity of partitioning is going to cause a problem.

> The reason why I have created sub-partitions is that most of the
> reports run on latest data for one or many countries. So wanted that to
> be clustered in a partition.
>

Bear in mind that it's always possible to clone the data. Keep one table that is all the data, and another that it just the latest data. The maintenance overheads won't necessarily be much worse than the costs of moving data from one partition to the other.

-- 
Regards

Jonathan Lewis
http://www.oracle.com/technology/community/oracle_ace/ace1.html#lewis

The Co-operative Oracle Users' FAQ
http://www.jlcomp.demon.co.uk/faq/ind_faq.html

Cost Based Oracle: Fundamentals
http://www.jlcomp.demon.co.uk/cbo_book/ind_book.html

Received on Fri Apr 07 2006 - 09:10:47 CDT