Re: Same query with no plan change or volume but runs for hours vs minutes

From: Maxim <mdemenko_at_gmail.com>
Date: Tue, 28 Nov 2023 16:36:47 +0100
Message-ID: <CAFP4yMzakkvRkxW8TL+e2wme9uk=VcWH7Yh8JrjHB2HTVyXjJg_at_mail.gmail.com>



Probably one of the first things i would do in such situation - doublecheck, whether the sql from the function -

select count(*) into dicount  from     pmtd td
         where td.did = pnum AND  td.cid  in (999999);

always runs with the same execution plan.

Regards

Maxim

On Tue, Nov 28, 2023 at 1:55 PM Lok P <loknath.73_at_gmail.com> wrote:

> Thank you for the response.
>
> My Apology. I forgot to mention its version 19C (19.19.0.0.0) of oracle on
> exadata.
>
> There exists only one composite (TXN_ID, WOD_DT) primary key index on this
> table-PBDRA_DTL. This table is daily range partitioned on a date column.
> And also the sequence -PBDRAD_SEQ does fed into the leading column(TXN_ID)
> of the PK index.
>
> The AVG_ROW_LEN i see in dba_tab_partition for last few of the partitions
> are showing ~198 and similar for others too.
>
> PARTITION_NAME NUM_ROWS BLOCKS AVG_ROW_LEN COMPRESSION
> PBDRA_DTL_11272023_P 33734616 1487844 197 DISABLED
> PBDRA_DTL_11262023_P 33469912 980070 198 DISABLED
> PBDRA_DTL_11252023_P 33527112 994892 198 DISABLED
> PBDRA_DTL_11242023_P 33948522 991827 198 DISABLED
> PBDRA_DTL_11232023_P 32667974 993492 198 DISABLED
> PBDRA_DTL_11222023_P 34653004 1529017 198 DISABLED
> PBDRA_DTL_11212023_P 33233838 968435 198 DISABLED
> PBDRA_DTL_11202023_P 32870086 1472631 198 DISABLED
> PBDRA_DTL_11192023_P 32485084 947193 198 DISABLED
>
> Select list doesnt have any plsql function.
>
> Today I have captured the sql monitor and related stats from v$sesstat for
> a ~5minutes delta period for a quick/good run in which the complete query
> finished in ~15minutes. I have posted the sql monitor and stats in same
> github link, The session stats i have posted as comments to the same github
> linc.
>
> https://gist.github.com/oraclelearner/722a77ef5ebce23396d82e2835958303
>
> When you mentioned "*some catastrophic activity on the base table might
> cause the ASSM related stats*", are you pointing to delete or anything?
> Because we normally dont performa DELETE but just use partition DROP for
> data purge on these tables. Below is the stats specific to ASSM i do
> observe , and those i have not noticed in past frequently. And also as we
> encounter this slow run intermittently, so not sure if we are encountering
> any bugs related to ASSM here?
>
> ***** BAD RUN******
>
> ASSM cbk:blocks examined 184273393
> ASSM cbk:blocks marked full 4610
> ASSM gsp:bump HWM 2
> ASSM gsp:get free block 46150
> ASSM gsp:good hint 36942
> ASSM gsp:L1 bitmaps examined 42601388
> ASSM gsp:L2 bitmap full 2
> ASSM gsp:L2 bitmaps examined 151006
> ASSM gsp:Optimized index block rejects 438092544
> ASSM gsp:Optimized reject DB 36692008
> ASSM gsp:Optimized reject l1 1150
> ASSM gsp:reject db 622315351
> ASSM gsp:reject L1 462672
> ASSM gsp:Search all 2
> ASSM gsp:Search hint 35
> ASSM gsp:Search steal 2
> blocks decrypted 41626
> branch node splits 8
> buffer is not pinned count 1424244
> buffer is pinned count 296
> calls to get snapshot scn: kcmgss 1429118
>
> ***** GOOD RUN*****
>
> ASSM cbk:blocks examined 12188141
> ASSM cbk:blocks marked full 214458
> ASSM gsp:bump HWM 37
> ASSM gsp:get free block 405327
> ASSM gsp:good hint 405308
> ASSM gsp:L1 bitmaps examined 21220340
> ASSM gsp:L2 bitmap full 64
> ASSM gsp:L2 bitmaps examined 9006
> ASSM gsp:L3 bitmaps examined 54
> ASSM gsp:Optimized reject DB 185267698
> ASSM gsp:Optimized reject l1 124657
> ASSM gsp:Optimized reject l2 27
> ASSM gsp:reject db 11568356
> ASSM gsp:reject L1 13730
> ASSM gsp:reject L2 27
> ASSM gsp:Search all 37
> ASSM gsp:Search hint 402
> ASSM gsp:Search steal 37
> blocks decrypted 125480
> branch node splits 77
> buffer is not pinned count 12752925
> buffer is pinned count 2643
> calls to get snapshot scn: kcmgss 12796284
>
> *******
>
> The tablespaces holding the table and index partitions are all TDE
> encrypted.
>
> The logic inside FUN1 looks something as below. It does execute sql inside
> it but we have not changed anything in this logic recently. The table pmtd
> having ~1.8million rows and having size ~497MB. Its using index for
> accessing that table inside this function. Dont see a change in input data
> pattern for this function in both the days.
>
> FUNCTION Fun1 (cnum IN VARCHAR2, pnum IN VARCHAR2)
> RETURN VARCHAR2
> IS
> ret VARCHAR2(20) := 'XX';
> dicount NUMBER := null;
> BEGIN
> IF (cnum = '1111')
> THEN
> ret := 'XX';
> ELSE IF (cnum is null)
> THEN
> select count(*) into dicount from pmtd td
> where td.did = pnum AND td.cid in (999999);
> IF(dicount > 0) THEN ret := 'Y';
> ELSE
> ret := 'N';
> END IF;
> END IF;
> END IF;
> RETURN ret;
> END;
>
>
> On Tue, 28 Nov, 2023, 4:58 pm Dominic Brooks, <dombrooks_at_hotmail.com>
> wrote:
>
>> It's all about the buffer gets.
>> I would think this is probably a side-effect of some other processing.
>>
>> 316M compared to 18G - only a few of which require actual physical reads.
>>
>> But the answer should be in the session statistics for the two respective
>> sessions.
>> 5 minutes sample stats is not a lot from 6.5 hours.
>> I would be looking at AWR, ASH and application logging to see if there's
>> any correlation to any other processing involving the same objects
>> beforehand some times.
>> Otherwise it's the full session stats for the session which is going to
>> tell you the WHY. Although statistics are only part of the answer. If you
>> could snap before/after execution of this SQL (from the code for example),
>> that should help you find out the high level category of what is being done.
>> An extended sql trace for this sql id would also help answer.
>> ------------------------------
>> *From:* oracle-l-bounce_at_freelists.org <oracle-l-bounce_at_freelists.org> on
>> behalf of Jonathan Lewis <jlewisoracle_at_gmail.com>
>> *Sent:* 28 November 2023 10:11
>> *To:* Oracle L <oracle-l_at_freelists.org>
>> *Subject:* Re: Same query with no plan change or volume but runs for
>> hours vs minutes
>>
>> Which version of Oracle?
>> What's the average row length of the rows inserted into the table?
>> How many indexes on the table - is it just one on the sequence-generated
>> column?
>> Does the "fun1()" function in the where clause execute any SQL?
>> Does the select list include any pl/sql functions that execute any SQL?
>>
>> There are two possible places suggested by what you've told us that might
>> be where the excess CPU is used.
>>
>> 1) In some PL/SQL function code that either executes very late in the
>> statement, or execute fairly early but gets reported in the wrong place
>>
>> 2) the amount of time Oracle spends check the ASSM bitmaps for free space
>> has exploded - possible because of some catastrophic activity in the table
>> that happened before this insert started.
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>>
>>
>> On Mon, 27 Nov 2023 at 20:15, Lok P <loknath.73_at_gmail.com> wrote:
>>
>> Hi All,
>>
>> The insert query normally runs for <15minutes somedays runs for ~6-7hrs
>> with exactly the same plan and approx same volume. And all the resources
>> seem to be spent on step "LOAD TABLE CONVENTIONAL" and its "ON CPU".
>> Wondering where it's spending time during the long execution?
>>
>> While the long run was happening , we tried to fetch the stats from
>> v$sesstat for a ~5minutes duration and collected the non zero stats. Not
>> able to find out anything odd. Could you please guide me here, what must be
>> the cause of this sudden long execution on some days for this same query
>> with almost the same volume?
>>
>> I have posted the sql monitors from both the runs below and also in the
>> below github linc. And also published the nonzero output from the v$sesstat
>> for the long execution for a ~5 minutes delta duration.We do see some stats
>> like "KTFB alloc**" , is that normal or some issue is causing these
>> stats to pop up?
>>
>> https://gist.github.com/oraclelearner/722a77ef5ebce23396d82e2835958303
>>
>> ******** Sql monitor from long run ********
>> Global Information
>> ------------------------------
>> Status : EXECUTING
>> Instance ID : 1
>> SQL Execution ID : 16777216
>> Execution Started : 11/27/2023 03:30:16
>> First Refresh Time : 11/27/2023 03:30:19
>> Last Refresh Time : 11/27/2023 10:06:13
>> Duration : 23759s
>> PLSQL Entry Ids (Object/Subprogram) : 161298,9
>> PLSQL Current Ids (Object/Subprogram) : 161298,15
>>
>> Global Stats
>>
>> ===========================================================================================================================================================================
>> | Elapsed | Cpu | IO | Application | Concurrency | Cluster |
>> PL/SQL | Other | Buffer | Read | Read | Uncompressed | Offload |
>> Offload | Cell |
>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Waits(s) |
>> Time(s) | Waits(s) | Gets | Reqs | Bytes | Bytes | Elig Bytes |
>> Returned Bytes | Offload |
>>
>> ===========================================================================================================================================================================
>> | 23758 | 23570 | 4.88 | 1.28 | 4.78 | 58 |
>> 41 | 119 | 18G | 1M | 719GB | 359GB | 715GB |
>> 4GB | 99.42% |
>>
>> ===========================================================================================================================================================================
>>
>> SQL Plan Monitoring Details (Plan Hash Value=557779869)
>>
>> ==================================================================================================================================================================================================================================
>> | Id | Operation | Name
>> | Rows | Cost | Time | Start | Execs | Rows | Read | Read |
>> Mem | Activity | Activity Detail |
>> Progress |
>> | | |
>> | (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes |
>> | (%) | (# samples) |
>> |
>>
>> ==================================================================================================================================================================================================================================
>> | -> 0 | INSERT STATEMENT |
>> | | | 23761 | +3 | 1 | 0 | | |
>> . | 0.16 | Cpu (35) |
>> |
>> | | |
>> | | | | | | | | |
>> | | cell smart table scan (2) |
>> |
>> | -> 1 | LOAD TABLE CONVENTIONAL | PBDRA_DTL
>> | | | 23762 | +2 | 1 | 0 | 499K | 4GB |
>> . | 98.42 | gc current block 2-way (4) |
>> |
>> | | |
>> | | | | | | | | |
>> | | gc current grant 2-way (23) |
>> |
>> | | |
>> | | | | | | | | |
>> | | gc current grant busy (18) |
>> |
>> | | |
>> | | | | | | | | |
>> | | latch: cache buffers chains (5) |
>> |
>> | | |
>> | | | | | | | | |
>> | | log file switch completion (1) |
>> |
>> | | |
>> | | | | | | | | |
>> | | undo segment extension (2) |
>> |
>> | | |
>> | | | | | | | | |
>> | | Cpu (22695) |
>> |
>> | | |
>> | | | | | | | | |
>> | | cell single block physical read: RDMA (8) |
>> |
>> | | |
>> | | | | | | | | |
>> | | cell single block physical read: xrmem cache (34) |
>> |
>> | | |
>> | | | | | | | | |
>> | | enq: FB - contention (1) |
>> |
>> | | |
>> | | | | | | | | |
>> | | ges message buffer allocation (2) |
>> |
>> | | |
>> | | | | | | | | |
>> | | cell single block physical read: flash cache (3) |
>> |
>> | -> 2 | SEQUENCE | PBDRA_SEQ
>> | | | 23761 | +3 | 1 | 18M | | |
>> . | 0.50 | Cpu (109) |
>> |
>> | | |
>> | | | | | | | | |
>> | | reliable message (6) |
>> |
>> | -> 3 | HASH JOIN |
>> | 68259 | 1M | 23761 | +3 | 1 | 18M | | |
>> 5MB | 0.02 | Cpu (5) |
>> |
>> | 4 | TABLE ACCESS STORAGE FULL | PRFS
>> | 78 | 2 | 1 | +3 | 1 | 63 | 1 | 40960 |
>> . | | |
>> |
>> | -> 5 | PARTITION RANGE ITERATOR |
>> | 152K | 1M | 23761 | +3 | 1 | 18M | | |
>> . | | |
>> |
>> | -> 6 | TABLE ACCESS STORAGE FULL | PBABR_DTL
>> | 152K | 1M | 23761 | +3 | 215 | 18M | 740K | 715GB |
>> 7MB | 0.33 | Cpu (73) |
>> 100% |
>> | | |
>> | | | | | | | | |
>> | | gcs drm freeze in enter server mode (1) |
>> |
>> | | |
>> | | | | | | | | |
>> | | reliable message (1) |
>> |
>> | | |
>> | | | | | | | | |
>> | | cell smart table scan (1) |
>> |
>>
>> ==================================================================================================================================================================================================================================
>>
>>
>> ********* Quick Run ****************
>>
>> Global Information
>> ------------------------------
>> Status : DONE
>> Instance ID : 1
>> SQL Execution ID : 16777216
>> Execution Started : 11/24/2023 04:39:04
>> First Refresh Time : 11/24/2023 04:39:07
>> Last Refresh Time : 11/24/2023 04:55:37
>> Duration : 993s
>> PLSQL Entry Ids (Object/Subprogram) : 161298,9
>> PLSQL Current Ids (Object/Subprogram) : 161298,15
>>
>> Global Stats
>>
>> ===========================================================================================================================================================================
>> | Elapsed | Cpu | IO | Application | Concurrency | Cluster |
>> PL/SQL | Other | Buffer | Read | Read | Uncompressed | Offload |
>> Offload | Cell |
>> | Time(s) | Time(s) | Waits(s) | Waits(s) | Waits(s) | Waits(s) |
>> Time(s) | Waits(s) | Gets | Reqs | Bytes | Bytes | Elig Bytes |
>> Returned Bytes | Offload |
>>
>> ===========================================================================================================================================================================
>> | 993 | 942 | 5.27 | 0.05 | 0.01 | 0.02 |
>> 68 | 46 | 316M | 2M | 867GB | 443GB | 861GB |
>> 7GB | 99.21% |
>>
>> ===========================================================================================================================================================================
>>
>> SQL Plan Monitoring Details (Plan Hash Value=557779869)
>>
>> =====================================================================================================================================================================================
>> | Id | Operation | Name |
>> Rows | Cost | Time | Start | Execs | Rows | Read | Read |
>> Mem | Activity | Activity Detail |
>> | | | |
>> (Estim) | | Active(s) | Active | | (Actual) | Reqs | Bytes |
>> (Max) | (%) | (# samples) |
>>
>> =====================================================================================================================================================================================
>> | 0 | INSERT STATEMENT | |
>> | | 991 | +3 | 1 | 0 | | |
>> . | | |
>> | 1 | LOAD TABLE CONVENTIONAL | PBDRA_DTL |
>> | | 991 | +3 | 1 | 0 | 874K | 7GB |
>> . | | |
>> | 2 | SEQUENCE | PBDRA_SEQ |
>> | | 992 | +2 | 1 | 30M | | |
>> . | | |
>> | 3 | HASH JOIN | |
>> 66851 | 1M | 991 | +3 | 1 | 30M | | |
>> 5MB | | |
>> | 4 | TABLE ACCESS STORAGE FULL | PRFS |
>> 78 | 2 | 1 | +3 | 1 | 63 | 2 | 49152 |
>> 1MB | | |
>> | 5 | PARTITION RANGE ITERATOR | |
>> 149K | 1M | 991 | +3 | 1 | 32M | | |
>> . | | |
>> | 6 | TABLE ACCESS STORAGE FULL | PBABR_DTL |
>> 149K | 1M | 991 | +3 | 286 | 32M | 892K | 861GB |
>> 7MB | | |
>>
>> =====================================================================================================================================================================================
>>
>>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Nov 28 2023 - 16:36:47 CET

Original text of this message