Re: Performance issue on query doing smart scan
Date: Mon, 1 Feb 2021 16:39:07 +0530
Message-ID: <CAKna9VZ7xG5HRc_qwVce_j63eac=M=B+JG1uodoCfFpcWqjOUw_at_mail.gmail.com>
On Fri, Jan 29, 2021 at 4:21 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:
> *Actually the partition column I_DT is on TAB2 which is part of another
> with clause query but full scanning that table is not the problem here as
> we see from the sql monitor. It's the scanning of TAB1 which is partitioned
> on PART_DT is causing the issue. And we do have an explicit filter on this
> column for partition pruning, but anyway our requirement is to scan across
> all the ~180 partitions only.*
>
> Given that you have a table TAB1, which you've included in a CTE that
> you've aliased TAB1, which then gets an alias of "a" in the main query;
> with an equally bad appalli1ng n1aming convention involving TAB2 it's not
> entirely surprising that I managed to cross over the two names when working
> through your problem. Do you really have such bad coding standards?
>
> The point remains, however you said that you scanned all the partitions,
> and that the partitioning column was part_dt, and the SQL you sent us
> originally has the predicate "AND a.part_dt BETWEEN :1 AND :2".
> BETWEEN is a very bad option for a predicate on a partition key.A
> partition contains data that is strictly less than the upper boundary, and
> greater than or equal to the upper bound of the previous partitions, so the
> ideally range-based predicate should be of the form:
> part_dt >= {date1} and part_dt < {date2}
> If you do this than Oracle may be able to discard the predicate (i.e. not
> waste CPU testing it) if it can deduce that every row in every relevant
> partition is going to satisfy the predicate. This is particularly
> interesting for EXADATA with compressed tables, since it MAY mean (and I
> can't check it) that Oracle won't have to decompress the column to check
> the value, and this MIGHT have a noticeable benefit for some of the
> compression options in terms of CPU.
>
>
> Index compression and table compression have nothing to do with each
> other. Also the "much longer running" of the query when forced through
> (AANUM, TXID) isn't about the large number of distinct values of TXID, it's
> about the relatively small number of distinct values of AANUM or, to be
> more accurate, the large number of rows for each value of AANUM that are
> now being acquired by single block access.
>
> Your thoughts on compression do make sense as a general strategy for
> reducing storage space and increasing the probability of caching index leaf
> blocks. Since you have 173 million distinct value for AANUM in a table with
> 56 Billion rows then on average each value of AANUM has about 330 related
> rows and if this table were not partitioned you would automatically say
> that this is a good index for compression. BUT this is a local index on a
> table with 180 partitions, so each partition MIGHT only have one or two
> index entries for each value of AANUM in which case you probably wouldn't
> want to compress it; on the other hand perhaps the AANUM has values which
> correlate with time, so maybe any one value of AANUM appears in only about
> 10 partitions, in which case each partition would average 33 rows per AANUM
> and the index would be worth compressing.
>
> Regarding index columns:
>
>
>
> * WHERE a.aanum = stage.anum AND a.did = stage.did
> AND a.d_ind = 'Y' AND a.o_id LIKE CASE WHEN
> stage.al <http://stage.al> = 'XXXX' THEN stage.o_id ELSE a.o_id END
> AND a.part_dt BETWEEN :1 AND :2*
>
>
>
>
>
>
> * WHERE TAB1.anum(+) = stage.anum AND TAB1.did(+) =
> stage.did AND TAB2.anum(+) = stage.anum AND TAB2.did(+) =
> stage.did AND TAB1.pnm(+) = stage.pnm AND TAB2.pnm(+) =
> stage.pnm*
>
> I made two mistakes with my advice on (anum, did).
>
> First, I forgot that the final where clause was against CTEs, so I missed
> the need to have d_ind and o_id in the index if you wanted to avoid
> visiting the table unnecessarily; secondly I missed the TAB1.pnm predicate
> that was in the middle of your TAB2 predicates in the final where clause -
> you'd need to have this in the index as well to avoid visiting the table.
> Again there's a detail to check in your coding standards - the three TAB1
> predicates should have been together to avoid the risk of that mistake
> happening. (Just to add a positive note, I approve of the "next_table =
> current_table" arrangement of the predicates, some people would have
> written "stage.anum = TAB1.anum(+)")
>
> Two more thought on indexing - a predicate like d_ind='Y' suggests the
> column is a yes/no column.
> a) If almost all the data has d_ind='Y' then there may be no significant
> benefit in having that column in the index. If only a tiny fraction of
> the data has d_ind='Y' then if you can modify the query, you might be able
> to create a very small function-based index that would make this query much
> more efficient while adding very little to the workload.
> b) The list of relevant columns for the index has gone up to 5 for the
> "perfect" index - in no particular order: aanum, did, d_ind, o_id, pnm.
> If you know your data you may be able to decide that a combination of 4 or
> 3 of these columns is "good enough" to reduce the workload dramatically at
> query time while not adding too great a maintenance overhead.
>
> Finally a generic warning:
> *a.o_id LIKE CASE WHEN stage.al <http://stage.al> = 'XXXX' THEN
> stage.o_id ELSE a.o_id END *
>
> This is a variant of a far commoner construct which is erroneously written
> as the model for "give me everything unless the user requests specific
> rows", viz:
> *where colX = nvl(:bind_variable, colX)*
> (See: https://jonathanlewis.wordpress.com/2007/01/09/conditional-sql/ )
>
> The "LIKE" is sufficiently different that it's possible the predicate is
> doing exactly what it's supposed to do, but it's worth mentioning that if
> o_id is NULL then the predicate evaluates to false (even when stage.al =
> 'XXXX') and this may not be the intention.
>
>
> Regards
> Jonathan Lewis
>
>
>
>
-- http://www.freelists.org/webpage/oracle-lReceived on Mon Feb 01 2021 - 12:09:07 CET
- text/plain attachment: sql_monitor_Comparison.txt
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet attachment: Column_Stats.xlsx