Re: Optimizer estimated cardinality very low

From: Jonathan Lewis <jlewisoracle_at_gmail.com>
Date: Wed, 28 Oct 2020 12:40:28 +0000
Message-ID: <CAGtsp8mHLiEGWDPaQCd2JQNfC_JENv88UrpVdUayng9DJyZfTg_at_mail.gmail.com>



I've modelled this problem on a pair of non-partitioned tables. (Since the join is exactly across an exact matching pair of partitions I think this is probably valid).
I've got 100,000 rows in claim_header and 110,000 rows in claim_line. The actual number of distinct trnsct_cntl_nbr in both tables is exactly 100,000, with 4 dates
The basic stats collection for tables reports 4 dates and 99,848 distinct values for trnsctl_ctl_nbr.
Any set of tests below starts with no indexes, no column groups and no histograms

  1. Baseline test - this gets a "suitable" cardinality estimate (100K) a') create a histogram on either (or both) date columns - still gets the "proper" cardinality
  2. Adding indexes on (trnsctl_cntl_nbr, clm_hdr_pd_dt) at both ends I get cardinality 1. b') Adding hdr_sts_cd to either of the indexes gets "proper" cardinality b'') With 2-column indexes adding a histogram on date to either table gets cardinality 1
  3. Add a column group to just one table, ndv = 98,832, in both cases proper cardinality c') Add a column group to BOTH tables at once get cardinality 1 c'') Add histogram on date to ONE table (when both have column groups) get cardinality 1 c''') Add histograms on date to BOTH tables (when both have column groups) get proper cardinality c'''') As for c''', but increase the NDV (in stages) on the claim_header column groups - get cardinality 1 at 99,001 c''''') As for c''', but increase the NDV (in stages) on the claim_line column groups - get cardinality 1 at 108,901

From c''' and c''''' it appears that there's a break point for the sanity check at 99% of the rows in the table.
Trying to untangle this to get something useful is going to be a pain - and it's still not a complete set of tests.

Question - For the two tables, for the two columns in the join/existence, what indexes, column groups and histograms do you have.

Regards
Jonathan Lewis

On Tue, 27 Oct 2020 at 13:05, Noveljic Nenad <nenad.noveljic_at_vontobel.com> wrote:

> Hi Daniel,
>
>
>
> *“NUM_DISTINCT for the column_group in July are lower than
> ORDR_RQST_NUMBER instead of being equal or higher.”*
>
>
>
> This is exactly the reason that sanity check kicks in.
>
>
>
>
>
> *“ This happens across other partitions but the estimation doesn't seem
> impacted by this alone.”*
>
>
>
> There is some tolerance range. Set num_distinct for SYS_STU4S$4GK1UDMC2M8$MI#W6QS_
> of the June partition to a sufficiently low value for sanity check to kick
> in. Then you should get sel = 1.0 and a “good” cardinality estimate for
> your case.
>
>
>
>
>
> “*but the estimation doesn't seem impacted by this alone.”*
>
>
>
> Histogram on ORDR_RQST_DATE is apparently another prerequisite for the
> sanity check.
>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Oct 28 2020 - 13:40:28 CET

Original text of this message