Re: Looking for ideas on blocked sessions updating partitioned table with CLOB (solved)

From: Chris Stephens <cstephens16_at_gmail.com>
Date: Tue, 18 Sep 2018 09:35:21 -0500
Message-ID: <CAEFL0sy=qZn5K2Qp=c2Z07pCMZjs70HxFZj3Egfcn8qPfO-XUw_at_mail.gmail.com>



great follow-up chris!

On Tue, Sep 18, 2018 at 9:24 AM Chris Taylor < christopherdtaylor1994_at_gmail.com> wrote:

> So, I need to own up to something.
>
> My whole approach to this problem was wrong. I had been told that the
> problem with this table was the CLOB and the number of sessions updating
> the table at the same time. So, my whole approach was based on that
> information being TRUE.
>
> I was blinded to the 'real' problem and wasn't actively looking for an
> alternative answer until I posted this thread. Due to the discussion here,
> I took a few steps back and actually asked myself "why would this be row
> locks/blocks?" and WHY don't I see internal locking on the clob if the clob
> is the problem?
>
> At that point I was in 'question everything' mode and started really
> looking at the Event that the blocking session was sitting in (instead of
> just assuming it was a problem with design).
>
> The blocking sessions were ALWAYS sitting in EVENT: SQL*Net MORE DATA FROM
> CLIENT and WAIT_CLASS: NETWORK and the wait times were reaching upward of 2
> minutes before the wait_time_micro restarted counting.
>
> When I saw that (and really looked at it), I was like that makes no
> sense. Why would a session be getting latency from the PHP server? (Why
> is it taking so long to send an object from the PHP server to the database?)
>
> Running strace on the sessions on both the db server and the php server
> showed the sessions just sitting in either read() or write() to the network
> socket and never completing.
>
> Large file transfers INTO the main db server would stall. (Transfers OUT
> of the db server were fine) - this affected SQLNet, SCP, ncat etc.
> Outgoing was fine, Inbound was terrible.
>
> Ultimately we discovered these 2 settings had been DISABLED on this box at
> some point (as its been around a while):
>
> /proc/sys/net/ipv4/tcp_sack
>
> /proc/sys/net/ipv4/tcp_timestamps
>
> Both of those were "0" whereas all our other db servers those were "1" and
> weren't experiencing any problem.
>
> After setting both of those to "1" , our sessions now complete and no
> longer block.
>
> I know some of you may be thinking "How could he miss all the network
> waits?" - And that's a good question. All I can say is sometimes you get
> "blinded" by assumptions and aren't really seeing the problem for what it
> is.
>
> I've known this a long, long time. So, remember kiddos you may have to
> question 'what you know versus what you think you know' ;)
>
> Also, the very fact of sharing this problem with the you on the list gave
> me the ability to step back and look at it with a new mindset.
>
> Anyway, I wanted to share that as a testament (and a huge thank you) to
> the community we have here.
>
> Thanks,
> Chris
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Aug 31, 2018 at 11:48 AM Jonathan Lewis <
> jonathan_at_jlcomp.demon.co.uk> wrote:
>
>>
>> If the event is "enq TX: row lock wait" the p1/p2/p3 values will be about
>> the TX lock in v$lock. p2 = id1, p3 = id2, and p1 will encode "TX" and the
>> lock mode.
>> Select p1raw (if its available) or to_char(p1,'XXXXXXXXXXXXXXXX') and
>> you'll get 00000000054580006 - or possibly a 4 on the end. If it's a mode 4
>> lock then that's a clue that it's not really about a table row and more
>> likely to be about an index/IOT or some internal anomaly.
>>
>> If you've got the ASH data then current_obj# may give you the object_id
>> of the object being accessed, but it's not entirely reliable.
>>
>> Regards
>> Jonathan Lewis
>>
>> ________________________________________
>> From: Stefan Knecht <knecht.stefan_at_gmail.com>
>> Sent: 31 August 2018 16:04:20
>> To: Jonathan Lewis
>> Cc: christopherdtaylor1994_at_gmail.com; ORACLE-L
>> Subject: Re: Looking for ideas on blocked sessions updating partitioned
>> table with CLOB
>>
>> You can also look at the p1/p2/p3 values of the sessions that are blocked
>> by the TX contention - and see which object the contention is on. If it's
>> the lobindex, that may lead to further clues about some of the
>> possibilities raised by Jonathan
>>
>> Stefan
>>
>>
>>
>> On Fri, Aug 31, 2018 at 8:55 PM, Jonathan Lewis <
>> jonathan_at_jlcomp.demon.co.uk<mailto:jonathan_at_jlcomp.demon.co.uk>> wrote:
>>
>> There is a major problem with basicfile LOBs (with similar, but not so
>> drastic symptoms appearing for securefile LOBs), so as a first strategic
>> step you should probably be planning to find a painless way to move to
>> securefile LOBs.
>>
>> I've written an entire system about a problem with basicfile LOBs when
>> you do inserts and deletes - but the problem is the same if you do lots of
>> updates because for a LOB an update is a delete followed by an insert (with
>> the old LOB left in the LOB segment rather than being copied to the UNDO
>> segment). The way LOBs handle deletes is that the LOB index is a two-part
>> index of which the first part indexes in time order the chunks of LOBs that
>> have been deleted and the second part indexes by lobid the current LOB
>> chunks.
>>
>> A problem appears when you try to insert a LOB and there isn't any free
>> space in the segment, but there is plenty of freeable space. Your session
>> will free up all the freeable space (deleting LOB index entries) as it
>> goes. This can take a very long time and while it's happening any other
>> session that want to insert a LOB will wait for your session to finish
>> freeing up the space.
>>
>> Unfortunately I've only see HW enqueues appearing as a result of this
>> action, I don't think I've seen TX enqueues (and I wouldn't really expect
>> to see the "enq TX - row lock contention" enqueue). THe series start at
>> this URL: http://jonathanlewis.wordpress.com/2017/01/26/basicfile-lobs/
>>
>> Different possibility - is the LOB declared with multiple freepools ?
>> This is the default and it means the LOBindex has only the two parts I
>> describe above. If you have N freepools then the index consistents of N
>> pairs of parts. This doesn't avoid the problem I've described above (your
>> session will try to free ALL the free space from ALL the pools if it needs
>> space), but having a single freepool may explain your observations. If I
>> insert a very large LOB PERHAPS you can't insert one until I finish insert
>> LOBIndex entries into my bit of index, in which case maybe you'd show a row
>> wait of some sort - though, again I'd expect a different wait to appear
>> (maybe buffer busy).
>>
>> Another thought - your LOBs enable in-row storage: how many of them are
>> short enough to fit in the row (a few, lots, most) and do they get updated
>> many times in situ, and are many of them likely to start short and then
>> grow a few times before getting too long ? How long are the rest of the
>> columns in the row ? I'm thinking about the possibility that your smaller
>> LOBs spend some time growing in-row and causing row migration before they
>> get long enough to move into the LOB segment - and row migraion does funny
>> things with ITLs. Maybe something odd happens as a row that's migrated
>> moves its LOB to the LOB segment and the row migrates back to it's original
>> block.
>>
>> ---
>>
>> In the absence of answers - a plan for the future might be to range
>> partition by sess_id and hash subpartiiton by sess_id: I'm assuming that
>> sess_id is a value that's going to increase with time when I say that. If
>> that is the case then the benefit of doing this composite partitioning is
>> that on a regular basis you get a new partition and all the action moves
>> into it and any garbage that's happened in the earlier partition can be
>> cleared up (or dropped).
>>
>> If you stick with basicfiles then look at freepools just in case it's
>> relevant - securefile lobs automatically have a better strategy for
>> concurrency but there is a parameter you can set of increase concurrency.
>> (Can't remember which it is at present).
>>
>> Since I've mentioned the series - one feature of the mechanism is that
>> the LOB segment can become much larger than it needs to be, so if you can
>> sum() the sizes of the lobs that exceed the in-linesize and it's much
>> smaller than segment size you'll know that your update mechanism is
>> introducing some sort of problem.
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> Regards
>> Jonathan Lewis
>>
>>
>>
>>
>> ________________________________________
>> From: oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org>
>> <oracle-l-bounce_at_freelists.org<mailto:oracle-l-bounce_at_freelists.org>> on
>> behalf of Chris Taylor <christopherdtaylor1994_at_gmail.com<mailto:
>> christopherdtaylor1994_at_gmail.com>>
>> Sent: 31 August 2018 14:23:46
>> To: knecht.stefan_at_gmail.com<mailto:knecht.stefan_at_gmail.com>
>> Cc: ORACLE-L
>> Subject: Re: Looking for ideas on blocked sessions updating partitioned
>> table with CLOB
>>
>> Ah, very good questions.
>>
>> For the blocking issue:
>> 1. Blocking Type = enq: TX - row lock contention, yet the SESS_IDs (PK)
>> is different.
>> 2. Basicfile LOBs (I assume these were carried over from an upgrade from
>> a prev version)
>>
>> For the PHP object transfer being slow issue:
>> 1. TNS settings:
>> db server:
>> sqlnet.ora
>> ---------------------------------
>> send_buf_size=2097152
>> recv_buf_size=2097152
>> default_sdu_size=32768
>> default_tdu_size=32768
>>
>> tnsnames.ora
>> -----------------------------
>> Nothing on the tnsnames.ora
>>
>> web/app server:
>> tnsnames.ora
>> -------------------------------
>> Nothing specified here either for SDU etc
>>
>> 2. OS = Red hat Linux 6.8 64-bit , kernel 2.6.32-642.el6.x86_64
>>
>> 3. Block Size = 8192
>>
>> Thanks,
>> Chris
>>
>>
>> On Thu, Aug 30, 2018 at 10:02 PM Stefan Knecht <knecht.stefan_at_gmail.com
>> <mailto:knecht.stefan_at_gmail.com><mailto:knecht.stefan_at_gmail.com<mailto:
>> knecht.stefan_at_gmail.com>>> wrote:
>> When you say "blocking" - what event are the blocked sessions waiting on?
>>
>> Also, what's your TNS config - particularly SDU sizes between the
>> mid-tier and the database (TNS connection string and the receiving
>> listener)?
>>
>> Are you using securefile or basicfile LOBs?
>>
>> What OS is the database on and what block size are you using in the
>> tablespace where the lobs are stored?
>>
>> On Fri, Aug 31, 2018 at 6:21 AM, Chris Taylor <
>> christopherdtaylor1994_at_gmail.com<mailto:christopherdtaylor1994_at_gmail.com
>> ><mailto:christopherdtaylor1994_at_gmail.com<mailto:
>> christopherdtaylor1994_at_gmail.com>>> wrote:
>> Env: 12.1.0.2
>>
>> We have a table that stores session data (base64 encoded) from web
>> sessions.
>> The table has 32 HASH partitions and contains a CLOB.
>>
>> The table is hash partitioned by SESSION_ID which is a 64-char string.
>>
>> We have this code that executed in PHP into the Oracle database:
>> UPDATE sessions_table
>> SET session_data = :session_data // (CLOB)
>> WHERE session_id = :session_id;
>>
>> For 95% of the sessions this runs very,very fast.
>>
>> For 5% of the sessions that LOB being passed from PHP is 15MB and for
>> some reason PHP is really slow about transmitting those session OBJECTS
>> over to the database. (Verified through strace etc).
>>
>> While those sessions are waiting to complete the update to the CLOB, they
>> end up BLOCKING other sessions that aren't trying to update the same
>> session id but are in the same partition.
>>
>> I believe they're blocked on the CLOB on the same partition that the long
>> running session has open as I've confirmed that the SESS_ID being updated
>> is different in most cases.
>>
>> So, here's my question, what performance strategy should I be
>> investigating for CLOBS. Should I just add more partitions and spread out
>> the likelihood that a session will end up in the same partition? Seems
>> logical but I'm not sure how CLOBS play into this.
>>
>> The CLOBS are stored like this:
>> Segment Name = SYS_LOB0051640773C00002$$
>> Index Name = SYS_LOB0051640773C00002$$
>> Chunk = 8192
>> PCTVERSION = 10
>> CACHE = YES
>> LOGGING = None
>> Encrypt = None
>> Compression = None
>> IN_ROW = Yes
>> Partition = Yes
>> Retention = Yes
>>
>> Looking for any ideas on how to prevent a session from blocking other
>> non-related sessions that are updating the CLOB.
>>
>> I'm trying to mitigate the blocking at the db layer while we stand up a
>> product such as Redis to handle the session caching at the server level
>> (and remove it from the db).
>>
>> Chris
>>
>>
>>
>>
>>
>>
>>
>> --
>> //
>> zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
>> Visit us at zztat.net<http://zztat.net><http://zztat.net/> |
>> _at_zztat_oracle | fb.me/zztat<http://fb.me/zztat><http://fb.me/zztat> |
>> zztat.net/blog/<http://zztat.net/blog/><http://zztat.net/blog/>
>>
>>
>>
>> --
>> //
>> zztat - The Next-Gen Oracle Performance Monitoring and Reaction Framework!
>> Visit us at zztat.net<http://zztat.net/> | _at_zztat_oracle | fb.me/zztat<
>> http://fb.me/zztat> | zztat.net/blog/<http://zztat.net/blog/>
>>
>

--
http://www.freelists.org/webpage/oracle-l
Received on Tue Sep 18 2018 - 16:35:21 CEST

Original text of this message