Re: Improving query performance further

From: yudhi s <learnerdatabase99_at_gmail.com>
Date: Mon, 17 Oct 2022 00:44:06 +0530
Message-ID: <CAEzWdqfhphGfxceOrA707Jw-BKtqJVsOPFn2q5Cj7Qoggsa0Yw_at_mail.gmail.com>



Yes, it's truncate + load, with no updates or deletes happening on the table. We tried it with a sample table and a similar data pattern in a lower environment. The bitmap indexes with pctfree 0 performed worse than the composite B-tree index on (part_col, min_val, max_val). We tried both with the index_join hint and without it.

INDEX_TYPE  LEAF_BLOCKS  DISTINCT_KEYS  CLUSTERING_FACTOR  NUM_ROWS
NORMAL             1116         150001               4929    150001
BITMAP              754         126350             149999    149999
BITMAP              754         126352             150001    150001

Below is the SQL Monitor comparison of the two plans.

https://gist.github.com/databasetech0073/b576a301943ecf6f04bc2ab3249e316a
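
For reference, the two index variants we compared look roughly like this (the table name and partitioning are illustrative placeholders; the column names and pctfree 0 setting are from the thread):

    -- composite local B-tree index (the faster variant in our test)
    create index big_tab_bt_ix on big_tab (part_col, min_val, max_val) local;

    -- the two local bitmap indexes with pctfree 0, as suggested
    create bitmap index big_tab_min_bm on big_tab (min_val) local pctfree 0;
    create bitmap index big_tab_max_bm on big_tab (max_val) local pctfree 0;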

On Sun, Oct 16, 2022 at 6:08 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:

>
> Another idea that occurs to me following on from your comment about the
> daily truncate and reload.
>
> Do you update the min_val or max_val columns of the table once you've
> loaded the data, and do you insert or delete rows? (Maybe you've answered
> the last question already, but I haven't gone back to re-read all the posts.)
>
> If the key components of the data are effectively read-only as soon as
> you've loaded it then you could create two (local) bitmap indexes, one on
> (min_val), one on (max_val). Since you average about 2 rows per value, the
> size of the indexes would be about the same whether they're B-tree
> (compressed) or bitmap, but the bitmap might be a little smaller.
>
> With those indexes in place the reduction in block visits for range scans
> might be more than enough to offset the CPU time needed to handle the
> bitmap merges and OR's.
>
> (The indexes should also be built with pctfree zero if they don't change
> during the day.)
>
> Regards
> Jonathan Lewis
>
>
>
>
>
>
> On Sun, 16 Oct 2022 at 10:41, yudhi s <learnerdatabase99_at_gmail.com> wrote:
>
>> Thank you so much for the suggestions.
>> Not sure how it will impact histograms, but yes, as Lok mentioned, all the
>> data stored in the min_val and max_val columns appears to be numbers only,
>> and the bind values I saw did not hold any characters either. However, the
>> input bind coming from the application to the query is of data type
>> varchar2, and the columns are also defined as varchar2 in the table. Also,
>> checking the captured input bind values and the data pattern, almost 60% of
>> the input binds are closer to the min value and ~40% are closer to the max
>> value, but most of the binds still result in >50K rows from either of
>> those filters.
>>
>> And if I understand correctly, individual histograms on these two columns
>> (e.g. 'for all columns size repeat, for columns size 254/2048 min_val,
>> max_val') should suffice for the optimizer, rather than a column group
>> histogram. With regard to caching, as the majority of the IO/time here is
>> spent while accessing the index partition, putting the table in cache may
>> not help much, but caching the index should. Please correct me if my
>> understanding is wrong. I was also looking for a quick fix in the form of a
>> hint to cache the objects, but it appears cache hints are available for
>> tables and not for indexes or index partitions; the index would have to be
>> placed in the KEEP cache, which is not possible through any hints.
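>>
>> A minimal sketch of the histogram collection described above, with 254
>> buckets on each column (schema and table names are placeholders):
>>
>>   begin
>>     dbms_stats.gather_table_stats(
>>       ownname    => 'APP_OWNER',
>>       tabname    => 'BIG_TAB',
>>       method_opt => 'for all columns size repeat ' ||
>>                     'for columns size 254 min_val max_val'
>>     );
>>   end;
>>   /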
>>
>> On Sat, Oct 15, 2022 at 2:04 PM Jonathan Lewis <jlewisoracle_at_gmail.com>
>> wrote:
>>
>>>
>>> I did point out that the strategy would be fairly fragile, but there are
>>> a couple of points in its favour if the optimizer decides to use adaptive
>>> cursor sharing.
>>>
>>> a) This is 19c so the histogram could have 2,048 buckets (though the
>>> default of 254 might be sufficient anyway)
>>>
>>> b) When the low_value and high_value both look like numbers and the
>>> input is numeric, the optimizer behaves as if it's dealing with numbers,
>>> and that will help with the arithmetic. I've got a note about that on my
>>> blog, I think, but can't find it at present.
>>>
>>> c) Histograms model skew - and in this case the skew is in the size of
>>> range for a given number of rows; histograms don't have to be about
>>> capturing "rows per value" (see:
>>> https://jonathanlewis.wordpress.com/2016/09/26/pk-histogram/ ). In this
>>> case I think the OP has indicated that the data is more dense round the
>>> extreme values and sparse in the middle.
>>>
>>>
>>> Regards
>>> Jonathan Lewis
>>>
>>>
>>> On Fri, 14 Oct 2022 at 22:33, Lok P <loknath.73_at_gmail.com> wrote:
>>>
>>>> Also, Jonathan, as the OP mentioned, there are ~120K distinct values in
>>>> both the min and max columns, and those are varchar2 columns holding
>>>> numbers. So histograms on each of those, with a maximum of ~254 buckets,
>>>> may not be able to give 100% correct estimates for these range
>>>> predicates for all the binds, and thus may end up impacting the query
>>>> response.
>>>>
>>>>
>>>>

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Oct 16 2022 - 21:14:06 CEST