RE: Huge DBF sequential reads by KTSJ while updating securefile LOBS.

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Fri, 1 Apr 2022 16:16:27 -0400
Message-ID: <061e01d84605$5ec1ce50$1c456af0$_at_rsiz.com>



"... but this does not show as a wait at the time the IO is issued."

Because it doesn't know how long the wait is yet. If the session has things it can do while the IO request is blocked it perhaps might do them, but eventually, when the write completes, it shows the wait. I have always wondered why, when the block frees, they don't record the beginning and end times of the blocked period, from which you can compute the elapsed wait (or all three). In theory, with parallelism, you can proceed with other things that don't depend on the write you are blocked on, but I'm not sure what Oracle does, since I have not seen any of their source code since circa 1986 - and that was just debugging their include files to get the sqlca the correct length for the OCI. I think everything was serial blocking back then. It was table locking for insert, update, and delete, so parallelism wouldn't have done much.

If asynchronous IO IS being used, the session just continues to the point where it needs something that depends on what it pretended was a completed write.
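(A quick sanity check of whether asynchronous IO is actually in play - a sketch using the standard parameters and v$iostat_file, which reports ASYNC_ON / ASYNC_OFF per file type:)

show parameter disk_asynch_io
show parameter filesystemio_options

select filetype_name, asynch_io, count(*) files
from   v$iostat_file
group  by filetype_name, asynch_io;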

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Goti
Sent: Friday, April 01, 2022 11:11 AM
To: Jonathan Lewis
Cc: ORACLE-L
Subject: Re: Huge DBF sequential reads by KTSJ while updating securefile LOBS.  

Furthermore, I am trying to understand the snippet below from the MOS note, especially the part highlighted in bold and italics. Can someone help me understand what this means? 🤔

WAITEVENT: "direct path write" Reference Note (Doc ID 50416.1)  

"If asynchronous IO is not being used then the IO write request blocks until completed but this does not show as a wait at the time the IO is issued. The session returns later to pick up the completed IO data but can then show a wait on "direct path write" even though this wait will return immediately."      

Thanks,  

Goti    

On Fri, Apr 1, 2022 at 8:34 PM Goti <aryan.goti_at_gmail.com> wrote:

Hi,  

I have updated the p1, p2, p3 details from ASH in the gist below as well.

https://gist.github.com/aryangoti/e5d9360d1c1f0aa6224d4c1d27a5cfe8  
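A query along these lines can map those p1/p2 values back to segments (a sketch only - it assumes the reads are "db file sequential read", where p1 = file# and p2 = block#, and that the slaves show up with a (W0nn) program name; dba_extents can be slow on big databases):

select e.owner, e.segment_name, e.segment_type, count(*) samples
from   v$active_session_history ash,
       dba_extents              e
where  ash.event   = 'db file sequential read'
and    ash.program like '%(W0%'          -- space management slaves
and    e.file_id   = ash.p1
and    ash.p2 between e.block_id and e.block_id + e.blocks - 1
group  by e.owner, e.segment_name, e.segment_type
order  by samples desc;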

Thanks  

Goti    

On Fri, Apr 1, 2022 at 6:11 PM Goti <aryan.goti_at_gmail.com> wrote:

Hi Jonathan,  

I have attached the p1, p2, p3 values for the single-block reads the KTSJ module was doing, and all of them point to the LOB segment. The main issue here is GoldenGate replication latency, which we observe when multiple concurrent sessions are updating the securefile LOBs. So, to isolate the issue and to see whether the space management slaves are contributing to this behaviour, I turned off that feature using the parameter below.

ALTER SYSTEM SET "_ENABLE_SPACE_PREALLOCATION" = 0;

After this I don't see single-block reads from the KTSJ module, but we still had replication latency. We use synchronous IO for the NAS file system where the data resides. It appears the "direct path write" itself is taking little time; instead, the module seems to be waiting for CPU (wait event null as per ASH).

SQL_ID        EVENT              MODULE                         COUNT(*)
------------- ------------------ ------------------------------ --------
                                 OGG-R3_896B-OPEN_DATA_SOURCE        160  <<<<<<
dw4d2s82txm97 direct path write  OGG-R3_896B-OPEN_DATA_SOURCE         46
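The rows with a null event are samples where ASH caught the session on CPU rather than in a wait; session_state makes that explicit. A sketch, assuming the module string shown above:

select session_state,
       nvl(event, 'ON CPU') event_or_cpu,
       count(*)             samples
from   v$active_session_history
where  module like 'OGG-R3_896B%'
group  by session_state, nvl(event, 'ON CPU')
order  by samples desc;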
Snapper details:

https://gist.github.com/aryangoti/588cc507d222930a6f906683c8115f3b  

The replication lag just keeps increasing, and it only comes down once we stop the application.

Thanks,  

Goti    

On Fri, Apr 1, 2022 at 4:19 PM Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:  

I was reading the pre-alloc time as time spent by the space management slaves allocating space from the tablespace to the LOB segment, not time spent extending the files in the tablespace. But that might be a simultaneous issue. I think the information you get from the ASH data might help identify that - the blocks might be blocks at the start of a file, or blocks near the start of the LOB segment - or possibly blocks near the start of newer extents in the LOB segment.
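(One way to see where those blocks sit relative to the extents - a rough sketch, with hypothetical owner and segment names, reporting each block's offset within its extent; offsets near zero, in the lowest- or highest-numbered extents, would point at segment-header or space-management blocks:)

select e.extent_id,
       ash.p2 - e.block_id  block_offset_in_extent,
       count(*)             samples
from   v$active_session_history ash,
       dba_extents              e
where  ash.event      = 'db file sequential read'
and    e.owner        = 'APP_OWNER'       -- hypothetical owner
and    e.segment_name = 'LOB_SEGMENT'     -- hypothetical segment name
and    e.file_id      = ash.p1
and    ash.p2 between e.block_id and e.block_id + e.blocks - 1
group  by e.extent_id, ash.p2 - e.block_id
order  by e.extent_id, block_offset_in_extent;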

Regards

Jonathan Lewis    

On Fri, 1 Apr 2022 at 02:29, Goti <aryan.goti_at_gmail.com> wrote:

Thanks Jonathan for the details.  

I will check ASH for further details. Additionally, I did try to add more space to the respective LOB tablespace, and we do have enough free space now. But one thing I am not getting is why SMCO still needs to watch out for extending the space.
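(For the free-space check, something like this - LOB_TS standing in for the real tablespace name:)

select tablespace_name,
       round(sum(bytes)/1024/1024) free_mb,
       round(max(bytes)/1024/1024) largest_chunk_mb
from   dba_free_space
where  tablespace_name = 'LOB_TS'   -- placeholder name
group  by tablespace_name;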

On Thu, 31 Mar 2022 at 10:31 PM, Jonathan Lewis <jlewisoracle_at_gmail.com> wrote:  

It's not your version, of course, but I was looking at a similar issue on 19c a little while ago:

https://jonathanlewis.wordpress.com/2022/01/10/control-file-waits/  

Some work of this type has to be done by some process - it's just a question of how visible it becomes to end-user run-time.

Regards

Jonathan Lewis    

On Thu, 31 Mar 2022 at 12:35, Goti <aryan.goti_at_gmail.com> wrote:

Hi All,  

Environment: 11.2.0.4 on RHEL 6.5

We are observing slowness in OGG replication on the target side. The replication itself is updating a table which has a securefile LOB, and a lot of updates are performed during this time (close to 25K in an hour). The foreground session waits on "direct path write", as the LOB was created with the NOCACHE attribute. However, ASH shows that this is internally triggering some space management slaves to do many single-block reads. There are also no trace files associated with the W00* processes.
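Since the W00* slaves leave no trace files here, ASH is the main window into them; a sketch of the sort of query that shows what they are doing:

select program, event, count(*) samples
from   v$active_session_history
where  program like '%(W0%'
group  by program, event
order  by samples desc;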

Please find the TKPROF report for the OGG replicat OS process.

https://gist.github.com/aryangoti/f49660a4bbb23c58a9f403ac9270eb7a  

Snapper for W0003 process.

https://gist.github.com/aryangoti/f2cf46ebcec3920d79f4fb719c01f309  

ASH details (5 minutes) when the replication was slow.

https://gist.github.com/aryangoti/6674691d7770b6eb667718589633aec5  

Please let me know how to troubleshoot this further.

Thanks,  

Goti

-- 
Thanks,
Goti

--
http://www.freelists.org/webpage/oracle-l
Received on Fri Apr 01 2022 - 22:16:27 CEST
