Re: Fixing Performance issue with less selective columns

From: Lok P <loknath.73_at_gmail.com>
Date: Thu, 26 Aug 2021 00:56:06 +0530
Message-ID: <CAKna9VafK+Oha=BiHBqdYZSZzR=+bFf6gKpNYcnp+MdqZqooCQ_at_mail.gmail.com>



Thank You Andy. Actually as I stated earlier, this query is executed in a few seconds only but it's the higher number of executions(thousands+) for different binds which add up to the overall long run time of the process. and also most of the runs result in zero rows only. I tried putting it/query in a loop for different binds and executed with parallel (i 4) and without parallel and it seems it's taking longer with parallel hints. It may be because of maintaining and aggregating those additional parallel slaves etc.

*"Is :b4:= 'A' representative of your typical query here?" *apology for the confusion, actually i was trying to camouflage the actual bind values and that made it confusing. So it's actually coming as 'D' i.e. having ~104million matching rows in that table. So indexing that won't help here too.

The column M_TXT, its VARCHAR2(100 byte) column and holding string values with many having spaces in it at start/last.So trim function seems necessary.This filter alone is making the result set to ZERO for this query and its having highest NUM_DISTINCT values too, so it seems good candidate here for index. And this filter With regards to substr(:B8,.50), I will confirm and try to get it removed as that doesn't seem to serve any purpose but was somehow is there in this legacy code. We should be able to simply remove that SUBSTR function without any issue. But irrespective of that SUBSTR function on the right hand side of the predicate, are you pointing towards creating a function based index on trim(M_TXT) at the left hand side to help this query? Actually , I saw in many other places , this column is utilized in the predicates section like *substr(trim(m_txt),2,20)*=decode(col1, 'YYY', substr('XXXXX',1,20)). So is it possible to just have one index to cater both scenarios or we can tweak the query someway so as to utilize the same index for all the queries on this string column?

On Thu, Aug 26, 2021 at 12:12 AM Andy Sayer <andysayer_at_gmail.com> wrote:

> Hi Lok,
>
> " If there exists any other way to make this query faster without
> creating any new index that would really be helpful."
> You can use parallelism and have the scan completed by more processes at
> once.
>
> Is :b4:= 'A' representative of your typical query here? The results
> you've shared suggest this will return 0 rows, however there aren't a lot
> of distinct values for DC_CODE and you don't have any statistics. If the
> argument is frequently used then an index and a frequency histogram on this
> column would be lucrative. It's a bit surprising you have so many
> histograms on this table but this column's statistics didn't appear.
>
> The filter on M_TXT also would provide decent selectivity, I would
> question whether the trim is really needed. The substr 0.5 is a huge red
> flag.
>
> Thanks,
> Andrew
>
>
>
>
>
> On Wed, 25 Aug 2021 at 19:14, Lothar Flatz <l.flatz_at_bluewin.ch> wrote:
>
>> This scenario cries for bitmap indexes.
>> Bitmap Indexes can deal with "not equal" as well as "is null".
>> The columns seems to be low cardinality too.
>> The only open question is how often these columns get updated. (
>> https://asktom.oracle.com/pls/apex/asktom.search?tag=bitmap-indexes-and-locking
>> ).
>>
>> Regards
>>
>> Lothar
>>
>> Am 25.08.2021 um 19:19 schrieb Mark W. Farnham:
>>
>> unfortunately you keep nearly all the rows of both MA_FLG and D_UNMTCH,
>> so this query is the opposite of those indexes being useful.
>>
>>
>>
>> IF you were looking for ‘Y’ instead of not ‘Y’ on either one it would be
>> extremely good. I didn’t see initially that these two columns are extremely
>> inclusive.
>>
>>
>>
>> I think Sayan was checking that in his query request. MA_FLG could reject
>> at most about 6 million rows, so that’s pretty worthless.
>>
>>
>>
>> *From:* Lok P [mailto:loknath.73_at_gmail.com <loknath.73_at_gmail.com>]
>> *Sent:* Wednesday, August 25, 2021 1:09 PM
>> *To:* Mark W. Farnham
>> *Cc:* Sayan Malakshinov; Oracle L
>> *Subject:* Re: Fixing Performance issue with less selective columns
>>
>>
>>
>> Thank You Mark.
>>
>> I may be wrong but in this situation I was unable to think of any other
>> way we could make this query faster , so I was thinking of creating a new
>> index. If there exists any other way to make this query faster without
>> creating any new index that would really be helpful.
>>
>>
>>
>> I am not able to get your point fully, If you can help me understand it a
>> bit more here please. Below is the data pattern for MA_FLG and D_UNMTCH.
>>
>>
>>
>> Thus , in this query condition " NVL (I.MA_FLG, 'N') <> 'Y' results in
>> ~105million and NVL (I.D_UNMTCH, 'N') <> 'Y' results in ~111million. So
>> how should I create index or modify code to make it the best access/filter
>> criteria so as to make the query faster?
>>
>>
>>
>>
>>
>> MA_FLG
>>
>> Count(*)
>>
>> N
>>
>> 105228656
>>
>> Y
>>
>> 6000938
>>
>>
>>
>> 643566
>>
>>
>>
>> D_UNMTCH
>>
>> Count(*)
>>
>> Y
>>
>> 13715
>>
>>
>>
>> 111859445
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Aug 25, 2021 at 8:00 PM Mark W. Farnham <mwf_at_rsiz.com> wrote:
>>
>> The other thing, for flag values like AND NVL (I.MA_FLG, 'N') <>
>> 'Y' AND NVL (I.D_UNMTCH, 'N') <> 'Y'
>>
>>
>>
>> if you’re thinking about adding an index, and even if you need a virtual
>> column to do this because you have too much code depending on values ‘N’
>> and ‘Y’, define the final status (the one where nearly all of them land) as
>> NULL, being the ones you are NOT interested in most of the time. In both
>> these cases it looks like ‘Y’ would then be NULL, so
>>
>>
>>
>> i.ma_flg_v is defined decoding Y to NULL and anything else to N and your
>> code becomes and i.ma_flg_v = ‘N’ and you deal with variability in
>> non-nulls that are not ‘Y’ on the original,
>>
>> or
>>
>> i.ma_flg_v decodes Y to NULL, NULL to ‘N’ and anything else unchanged and
>> your code becomes i.ma_flg_v is NOT NULL,
>>
>> or
>>
>> you make a functional index on i.ma_flg that does the equivalent.
>>
>>
>>
>> I can’t remember off the top of my head whether either way gives you a
>> real advantage over the other in stats collections and the CBO doing
>> something smart and that probably changed over the releases. That might be
>> in one of my papers.
>>
>>
>>
>> When you then index that column the nulls disappear, leaving you with a
>> very tiny index to prune your result set immediately to very small and you
>> can usually filter the rest fast without an index.
>>
>>
>>
>> Remember, ORACLE cannot assign a value to NULL in anything they do. But
>> YOU can.
>>
>>
>>
>> When this is appropriate, it is one of the neatest and easiest “magic
>> tricks” in the Oracle kit.
>>
>>
>>
>> Good luck,
>>
>>
>>
>> mwf
>>
>>
>>
>> *From:* oracle-l-bounce_at_freelists.org [mailto:
>> oracle-l-bounce_at_freelists.org] *On Behalf Of *Sayan Malakshinov
>> *Sent:* Wednesday, August 25, 2021 9:40 AM
>> *To:* Lok P
>> *Cc:* Oracle L
>> *Subject:* Re: Fixing Performance issue with less selective columns
>>
>>
>>
>> Hi Lok,
>>
>>
>>
>> > SUBSTR(:B8,0*.50*)
>>
>> Looks like this query should be analyzed and tested better.
>>
>> You haven't provided histograms and bind values statistics, so not
>> enough info to analyze it properly.
>>
>> For now it looks like "I.WOF_DATE IS NULL" is one of the most selective
>> predicates - it gives only 83154 nulls.
>>
>> In addition to histogram statistics(dba_tab_histograms) and most often
>> binds values, I would like also to see what does return this query:
>>
>> select
>> NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE,count(*)
>> FROM PP_IN_TAB I
>> group by NVL(I.MA_FLG, 'N'),NVL(I.D_UNMTCH, 'N'),I.DC_CODE;
>>
>>
>>
>>
>>
>>
>>
>> On Wed, Aug 25, 2021 at 4:14 PM Lok P <loknath.73_at_gmail.com> wrote:
>>
>> Hello , This database has version 11.2.0.4 of Oracle. We have the below
>> query which is executed thousands of times. It's used in a plsql function
>> which in turn gets called from a procedure. And this procedure gets called
>> from java thousands of times. And I see from dba_hist_sqlstat , for most
>> of the runs this below query results in zero rows. We see from the active
>> session history for the overall process this query is consuming most
>> time/resources and making the process run longer. So wanted to understand
>> if we can make this individual query execution faster which would
>> eventually make the process faster?
>>
>> The base table- PP_IN_TAB is holding ~111million rows and is ~43GB in
>> size. Column PP_ID is the primary key here. The filter predicates used in
>> this query are below. Many of them were not very selective in nature. So I
>> am not able to conclude if any composite index is going to help us here.
>> Can you please guide me , what is the correct approach to tune this process
>> in such a scenario?
>>
>> Below is the column data pattern used as filter predicate in this query.
>> Most of these are less selective in nature.
>>
>>
>>
>> *TABLE_NAME*
>>
>> *COLUMN_NAME*
>>
>> *NUM_DISTINCT*
>>
>> *NUM_NULLS*
>>
>> PP_IN_TAB
>>
>> EF_ID
>>
>> 39515
>>
>> 6151686
>>
>> PP_IN_TAB
>>
>> PE
>>
>> 103074806
>>
>> 647050
>>
>> PP_IN_TAB
>>
>> PT_Code
>>
>> 24
>>
>> 0
>>
>> PP_IN_TAB
>>
>> PT_MCODE
>>
>> 20
>>
>> 0
>>
>> PP_IN_TAB
>>
>> D_CUR_CODE
>>
>> 13
>>
>> 592784
>>
>> PP_IN_TAB
>>
>> ED_AMT
>>
>> 320892
>>
>> 6
>>
>> PP_IN_TAB
>>
>> WOF_DATE
>>
>> 2572
>>
>> 83154
>>
>> PP_IN_TAB
>>
>> PR_CTGRY
>>
>> 2
>>
>> 86
>>
>> PP_IN_TAB
>>
>> PDE_RSN_CAT
>>
>> 6
>>
>> 0
>>
>> PP_IN_TAB
>>
>> MA_FLG
>>
>> 2
>>
>> 648172
>>
>> PP_IN_TAB
>>
>> M_TXT
>>
>> 29460248
>>
>> 9118572
>>
>> PP_IN_TAB
>>
>> D_UNMTCH
>>
>> 1
>>
>> 111766716
>>
>>
>>
>>
>>
>> SELECT NVL (I.PP_ID, 0)
>> FROM PP_IN_TAB I
>> WHERE TRIM(I.M_TXT) = TRIM (SUBSTR ( :B8, 0.50)) AND I.PT_Code
>> = :B7
>> AND NVL ( :B6, I.PT_MCODE) = NVL ( :B6, :B5) AND I.DC_CODE =
>> :B4
>> AND I.D_CUR_CODE = :B3 AND I.ED_AMT = :B2
>> AND I.PR_CTGRY = :B1 AND I.PE IS NOT NULL
>> AND I.EF_ID IS NULL AND I.WOF_DATE IS NULL
>> AND NVL (I.MA_FLG, 'N') <> 'Y' AND NVL (I.D_UNMTCH, 'N') <>
>> 'Y'
>> AND ROWNUM = 1;
>>
>>
>> Global Information
>> ------------------------------
>> Status : DONE (ALL ROWS)
>> Instance ID : 1
>> SQL Execution ID : 16777216
>> Execution Started : 08/25/2021 03:53:25
>> First Refresh Time : 08/25/2021 03:53:25
>> Last Refresh Time : 08/25/2021 03:53:28
>> Duration : 3s
>> Module/Action : SQL*Plus/-
>> Program : sqlplus.exe
>> Fetch Calls : 1
>>
>> Global Stats
>>
>> =========================================================================================
>> | Elapsed | Cpu | IO | Application | Fetch | Buffer | Read |
>> Read | Cell |
>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Calls | Gets | Reqs |
>> Bytes | Offload |
>>
>> =========================================================================================
>> | 3.30 | 1.15 | 2.15 | 0.00 | 1 | 6M | 44379 |
>> 43GB | 99.99% |
>>
>> =========================================================================================
>>
>> SQL Plan Monitoring Details (Plan Hash Value=1096440065)
>>
>> ==========================================================================================================================================================================================
>> | Id | Operation | Name | Rows | Cost |
>> Time | Start | Execs | Rows | Read | Read | Cell | Mem |
>> Activity | Activity Detail |
>> | | | | (Estim) | |
>> Active(s) | Active | | (Actual) | Reqs | Bytes | Offload | (Max) |
>> (%) | (# samples) |
>>
>> ==========================================================================================================================================================================================
>> | 0 | SELECT STATEMENT | | | |
>> | | 1 | | | | | |
>> | |
>> | 1 | COUNT STOPKEY | | | |
>> | | 1 | | | | | |
>> | |
>> | 2 | TABLE ACCESS STORAGE FULL | PP_IN_TAB | 1 | 128K |
>> 3 | +2 | 1 | 0 | 44379 | 43GB | 99.99% | 6M |
>> 100.00 | cell smart table scan (3) |
>> ==========================================================================================================================================================================================
>>
>>
>> Predicate Information (identified by operation id):
>> ---------------------------------------------------
>>
>> 1 - filter(ROWNUM=1)
>> 2 - storage("I"."WOF_DATE" IS NULL AND "I"."EF_ID" IS NULL AND
>> "I"."PT_Code"=:B7 AND "I"."D_CUR_CODE"=:B3 AND "I"."PR_CTGRY"=:B1 AND
>> "I"."DC_CODE"=:B4 AND "I"."ED_AMT"=TO_NUMBER(:B2) AND
>> NVL(:B6,"I"."PT_MCODE")=NVL(:B6,:B5) AND
>> TRIM("I"."M_TXT")=TRIM(SUBSTR(:B8,0.50))
>> AND "I"."PE" IS NOT NULL AND NVL("I"."MA_FLG",'N')<>'Y' AND
>> NVL("I"."D_UNMTCH",'N')<>'Y')
>> filter("I"."WOF_DATE" IS NULL AND "I"."EF_ID" IS NULL AND
>> "I"."PT_Code"=:B7 AND "I"."D_CUR_CODE"=:B3 AND "I"."PR_CTGRY"=:B1 AND
>> "I"."DC_CODE"=:B4 AND "I"."ED_AMT"=TO_NUMBER(:B2) AND
>> NVL(:B6,"I"."PT_MCODE")=NVL(:B6,:B5) AND
>> TRIM("I"."M_TXT")=TRIM(SUBSTR(:B8,0.50))
>> AND "I"."PE" IS NOT NULL AND NVL("I"."MA_FLG",'N')<>'Y'
>> AND NVL("I"."D_UNMTCH",'N')<>'Y')
>>
>>
>>
>>
>> --
>>
>> Best regards,
>> Sayan Malakshinov
>>
>> Oracle performance tuning engineer
>>
>> Oracle ACE Associate
>> http://orasql.org
>>
>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Wed Aug 25 2021 - 21:26:06 CEST

Original text of this message