Re: Performance issue on query doing smart scan
Date: Tue, 2 Feb 2021 20:32:15 +0530
Message-ID: <CAKna9Vb7YHuQocbcp_0AZLQdMwzPdOXNQGs1Vg0tE6xnPsqE4A_at_mail.gmail.com>
Also Jonathan, apart from having the new index created(which will really need a lot of tests/agreement because this transaction table is 7TB in size and is exposed to heavy batch inserts happening each day 24/7, ~250-300million rows each day) to help cater this query, as you earlier pointing to the double compression issue , should we get this checked with Oracle and we may get some fixes so that even with the current smart scan path the cell node resources can be minimized which will then not impact the flash IO much?
*I'd have to check one of the Exadata books (Kerry Osborne et. al.) for your version of Oracle on Exadata, but I think the following stats highlight the symptom:*
*cell CUs sent
compressed
866,940.00*
*cell physical IO bytes sent directly to DB node to balance CPU
15,739,502,952.00*
*I'm not sure how Oracle is counting (or possibly double counting) because I would have expected query high CUs to be close to 32KB each so sending 866,940 of them to the database server for decompression should have meant something like 26GB sent to DB node; however there is some consistency visible in other numbers viz:*
- EHCC Turbo Scan CUs Decompressed 1,212,123.00 =*
- cell CUs sent compressed 866,940.00 + cell CUs sent uncompressed 345,183.00*
*Regards*
*Lok*
On Mon, Feb 1, 2021 at 4:39 PM Lok P <loknath.73_at_gmail.com> wrote:
>
> Thank You Jonathan.
>
> I have modified the table names before posting the query. The actual
> table/alias name are different. But yes I should have named those better as
> it's a bit confusing.
>
> In regards to the below point for "Between VS >=, <" Operator, I am not
> able to fully understand though as input partition key values are bind
> values in both the cases , so Oracle has to get the exact value in runtime
> from the input binds. I tried executing a sample query on same table by
> doing "AND a.part_dt BETWEEN :1 AND :2" and "AND a.part_dt >= :1
> AND a.part_dt<:2" and i am not seeing much difference in elapsed time/CPU
> though. Attached is the sql monitor from both the executions.
>
> *"If you do this then Oracle may be able to discard the predicate (i.e.
> not waste CPU testing it) if it can deduce that every row in every relevant
> partition is going to satisfy the predicate. This is particularly
> interesting for EXADATA with compressed tables, since it MAY mean (and I
> can't check it) that Oracle won't have to decompress the column to check
> the value, and this MIGHT have a noticeable benefit for some of the
> compression options in terms of CPU*."
>
> In regards to the compression, I am seeing the global stats showing the
> distinct values for AANUM as ~173million which may not be fully accurate
> considering we are not using incremental stats collection so NDV
> calculation might go south, but i tried manually seeing the distinct values
> for a few partitions for column AANUM and those are repeated in all of the
> partitions. So basically we have AVG ~250million rows in each partition and
> in those ~135million distinct aanum values , which comes around ~2 index
> entries for each AANUM and in that prospect the compression of the local
> index (AANUM,TXID) won't be helpful. And also the same set of AANUM are
> repeated in each of the partitions. Attached is the columns stats for 4
> sample partitions.
>
> And as I see from sample four partitions, out of the columns used as
> filters in the query the column O_ID seems quite unique , but the way it's
> used in the query(as a case statement in the right hand side) is not
> filtering much rows. And as you rightly said, the column d_ind is having
> two distinct values 'Y' and 'N' but it's having ~40% and 60% rows
> respectively distributed. So the index on this column is not going to help
> much. I am planning to first try an index on AANUM,DID and see the
> performance. Else possibly last option is to have a big index comprising
> all the columns ( aanum, did, d_ind, o_id, pnm. ) to avoid table read
> fully, but again considering this table is a big transactional table we may
> have to think twice before taking decision on creating new indexes.
>
> Regards
>
> Lok
>
> On Fri, Jan 29, 2021 at 4:21 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
> wrote:
>
>> *Actually the partition column I_DT is on TAB2 which is part of another
>> with clause query but full scanning that table is not the problem here as
>> we see from the sql monitor. It's the scanning of TAB1 which is partitioned
>> on PART_DT is causing the issue. And we do have an explicit filter on this
>> column for partition pruning, but anyway our requirement is to scan across
>> all the ~180 partitions only.*
>>
>> Given that you have a table TAB1, which you've included in a CTE that
>> you've aliased TAB1, which then gets an alias of "a" in the main query;
>> with an equally bad appalli1ng n1aming convention involving TAB2 it's not
>> entirely surprising that I managed to cross over the two names when working
>> through your problem. Do you really have such bad coding standards?
>>
>> The point remains, however you said that you scanned all the partitions,
>> and that the partitioning column was part_dt, and the SQL you sent us
>> originally has the predicate "AND a.part_dt BETWEEN :1 AND :2".
>> BETWEEN is a very bad option for a predicate on a partition key.A
>> partition contains data that is strictly less than the upper boundary, and
>> greater than or equal to the upper bound of the previous partitions, so the
>> ideally range-based predicate should be of the form:
>> part_dt >= {date1} and part_dt < {date2}
>> If you do this than Oracle may be able to discard the predicate (i.e. not
>> waste CPU testing it) if it can deduce that every row in every relevant
>> partition is going to satisfy the predicate. This is particularly
>> interesting for EXADATA with compressed tables, since it MAY mean (and I
>> can't check it) that Oracle won't have to decompress the column to check
>> the value, and this MIGHT have a noticeable benefit for some of the
>> compression options in terms of CPU.
>>
>>
>> Index compression and table compression have nothing to do with each
>> other. Also the "much longer running" of the query when forced through
>> (AANUM, TXID) isn't about the large number of distinct values of TXID, it's
>> about the relatively small number of distinct values of AANUM or, to be
>> more accurate, the large number of rows for each value of AANUM that are
>> now being acquired by single block access.
>>
>> Your thoughts on compression do make sense as a general strategy for
>> reducing storage space and increasing the probability of caching index leaf
>> blocks. Since you have 173 million distinct value for AANUM in a table with
>> 56 Billion rows then on average each value of AANUM has about 330 related
>> rows and if this table were not partitioned you would automatically say
>> that this is a good index for compression. BUT this is a local index on a
>> table with 180 partitions, so each partition MIGHT only have one or two
>> index entries for each value of AANUM in which case you probably wouldn't
>> want to compress it; on the other hand perhaps the AANUM has values which
>> correlate with time, so maybe any one value of AANUM appears in only about
>> 10 partitions, in which case each partition would average 33 rows per AANUM
>> and the index would be worth compressing.
>>
>> Regarding index columns:
>>
>>
>>
>> * WHERE a.aanum = stage.anum AND a.did = stage.did
>> AND a.d_ind = 'Y' AND a.o_id LIKE CASE WHEN
>> stage.al <http://stage.al> = 'XXXX' THEN stage.o_id ELSE a.o_id END
>> AND a.part_dt BETWEEN :1 AND :2*
>>
>>
>>
>>
>>
>>
>> * WHERE TAB1.anum(+) = stage.anum AND TAB1.did(+) =
>> stage.did AND TAB2.anum(+) = stage.anum AND TAB2.did(+) =
>> stage.did AND TAB1.pnm(+) = stage.pnm AND TAB2.pnm(+) =
>> stage.pnm*
>>
>> I made two mistakes with my advice on (anum, did).
>>
>> First, I forgot that the final where clause was against CTEs, so I missed
>> the need to have d_ind and o_id in the index if you wanted to avoid
>> visiting the table unnecessarily; secondly I missed the TAB1.pnm predicate
>> that was in the middle of your TAB2 predicates in the final where clause -
>> you'd need to have this in the index as well to avoid visiting the table.
>> Again there's a detail to check in your coding standards - the three TAB1
>> predicates should have been together to avoid the risk of that mistake
>> happening. (Just to add a positive note, I approve of the "next_table =
>> current_table" arrangement of the predicates, some people would have
>> written "stage.anum = TAB1.anum(+)")
>>
>> Two more thought on indexing - a predicate like d_ind='Y' suggests the
>> column is a yes/no column.
>> a) If almost all the data has d_ind='Y' then there may be no significant
>> benefit in having that column in the index. If only a tiny fraction of
>> the data has d_ind='Y' then if you can modify the query, you might be able
>> to create a very small function-based index that would make this query much
>> more efficient while adding very little to the workload.
>> b) The list of relevant columns for the index has gone up to 5 for the
>> "perfect" index - in no particular order: aanum, did, d_ind, o_id, pnm.
>> If you know your data you may be able to decide that a combination of 4 or
>> 3 of these columns is "good enough" to reduce the workload dramatically at
>> query time while not adding too great a maintenance overhead.
>>
>> Finally a generic warning:
>> *a.o_id LIKE CASE WHEN stage.al <http://stage.al> = 'XXXX' THEN
>> stage.o_id ELSE a.o_id END *
>>
>> This is a variant of a far commoner construct which is erroneously
>> written as the model for "give me everything unless the user requests
>> specific rows", viz:
>> *where colX = nvl(:bind_variable, colX)*
>> (See: https://jonathanlewis.wordpress.com/2007/01/09/conditional-sql/ )
>>
>> The "LIKE" is sufficiently different that it's possible the predicate is
>> doing exactly what it's supposed to do, but it's worth mentioning that if
>> o_id is NULL then the predicate evaluates to false (even when stage.al =
>> 'XXXX') and this may not be the intention.
>>
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>>
-- http://www.freelists.org/webpage/oracle-lReceived on Tue Feb 02 2021 - 16:02:15 CET
