RE: UNDO Space Error

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Sun, 21 Mar 2021 11:27:34 -0400
Message-ID: <091701d71e66$b8043560$280ca020$_at_rsiz.com>



To see if your conjecture is correct (that the undo is from the indexes), change the destination of the insert to an exactly matching table with no indexes.  

IF that works (and while monitoring the job you should see that the UNDO required is nearly all space management), then a very large monolith size indeed should work for the direct insert into an index-free table.  

THEN insert from this new table into the final destination using a conveniently sized batch known not to be too big.  

IF there is a reasonable way to walk through sets of rows of a known working monolith size with respect to your select feeding the insert, just do that. (And there IS always a reasonable way unless the insert is mandatorily part of a “must commit together or fail together” transaction. IF that is the case you need to find a logical way to limit the size of the “batch” that complies with the business rules.)  

Search for Bryn Llewellyn (author), "Doing SQL from PL/SQL", and you'll probably find useful examples together with efficiency evaluations of the various methods.  
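The two-step approach above might be sketched roughly as follows. This is only an illustration, not the poster's actual code: STAGE_TAB is a hypothetical index-free copy of the destination, the SELECT is elided as in the original, and the 100,000-row batch size is an arbitrary example; committing per batch bounds undo at the cost of "all or nothing" atomicity.

```sql
-- Step 1: direct-path load into an index-free staging copy (undo is
-- mostly space management, so a large monolith load should succeed).
INSERT /*+ APPEND */ INTO stage_tab SELECT ... ;
COMMIT;

-- Step 2: move rows into the indexed destination in bounded batches.
DECLARE
  CURSOR c IS SELECT * FROM stage_tab;
  TYPE t_rows IS TABLE OF stage_tab%ROWTYPE;
  l_rows t_rows;
BEGIN
  OPEN c;
  LOOP
    FETCH c BULK COLLECT INTO l_rows LIMIT 100000;  -- batch size known to fit
    EXIT WHEN l_rows.COUNT = 0;
    FORALL i IN 1 .. l_rows.COUNT
      INSERT INTO tab VALUES l_rows(i);
    COMMIT;  -- one batch per transaction keeps undo bounded
  END LOOP;
  CLOSE c;
END;
/
```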

mwf      

From: oracle-l-bounce_at_freelists.org [mailto:oracle-l-bounce_at_freelists.org] On Behalf Of Pap
Sent: Sunday, March 21, 2021 1:16 AM
To: gogala.mladen_at_gmail.com
Cc: Oracle L
Subject: Re: UNDO Space Error    

Thank you.  

We do use batching for large DML to do it in bulk fashion. However, I got hold of one of the INSERT queries which failed, and it looks something like the one below. I see this table TAB (actual name masked) has 4 indexes, and I think all the UNDO generated must be because of those, since a direct-path load is happening, so table-block-related UNDO will be near zero. Below is the sample insert query which failed with ORA-30036.  

I do see that the execution plan has changed from the past, but I can't see how a change in plan could result in higher UNDO space consumption for the INSERT query; it should depend only on the number of rows loaded into the table. Please correct me if my understanding is wrong. Also, even though the new plan estimates a higher number of rows, which would point towards more rows being loaded into the table, I verified with the application team and they said the number of rows loaded during the failure days was around the same as in previous executions. So I am still wondering what the cause of this sudden failure could be.  

INSERT /*+ append parallel(16) nologging */ INTO USER1.TAB (c1, c2, c3, c4... c11) SELECT...;      

On Sun, Mar 21, 2021 at 6:25 AM Mladen Gogala <gogala.mladen_at_gmail.com> wrote:

As opposed to ORA-01555, which is a nightmare to debug and resolve, ORA-30036 doesn't present such problems because it's thrown by the same transaction that causes it. The cause is very simple: the current transaction ran out of space in the undo segment. At this point, I'd like to remind you of the common saying that disks are cheap, especially if someone else is buying them. So, what you need to do is to figure out which transaction is causing the problem. The second thing is figuring out what to do about it. I've frequently seen this with data purges. If you want to get rid of 1/3 of the rows in a billion-row table, based on a date column, it is conceivable that you might need an enormous amount of UNDO space. The solution is usually to write a PL/SQL procedure which will clean the rows in batches of 100K.
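A minimal sketch of such a batched purge, with hypothetical names (BIG_TAB, CREATED_DT) and a hypothetical 36-month retention cutoff; the commit per batch releases undo before the next delete, trading single-transaction atomicity for bounded undo consumption:

```sql
BEGIN
  LOOP
    DELETE FROM big_tab
    WHERE  created_dt < ADD_MONTHS(SYSDATE, -36)
    AND    ROWNUM <= 100000;           -- cap undo generated per transaction
    EXIT WHEN SQL%ROWCOUNT = 0;        -- nothing left to purge
    COMMIT;                            -- release undo before the next batch
  END LOOP;
  COMMIT;
END;
/
```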

So, what were you doing? Can you post the SQL which threw the error? Without the exact statement, the cardinality of the tables involved, and a rough estimate of how much data you are modifying, it's hard to tell you anything more.

On 3/20/21 2:22 PM, Pap wrote:

Hello Listers, it's version 11.2.0.4 of Oracle. We suddenly started encountering ORA-30036 (ORA-30036: unable to extend segment by 8 in undo tablespace 'UNDOTBS2') in one of the databases. We don't have any changes done to the code. We have already increased the UNDO tablespace size from ~100GB to ~190GB but are still facing the same error, and this time we want to find the root cause rather than keep on increasing the size of the tablespace.

During one of the failures I saw that DBA_UNDO_EXTENTS showed almost all extents with status UNEXPIRED, and DBA_FREE_SPACE showed zero free space for that tablespace. We have the data files in the undo tablespace set to autoextend ON, and we are using AUTO undo management with UNDO_RETENTION set to 900.

I understand there are two types of UNDO usage noted by Oracle. One is UNDO reads, where a SELECT query sometimes fails with ORA-01555 (but we are not encountering that here). The other is UNDO generation because of DML (INSERT/UPDATE/DELETE), and our failure is because of this; each time it's mostly INSERT queries failing while doing data loads. So is there any way I can track the exact session/SQL/user which is generating the most UNDO, both from historical AWR views and at run time?

How can I debug from history and find the cause of this sudden increase in UNDO space consumption? Is there any fix other than increasing UNDO space?
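For what it's worth, the standard dynamic performance and AWR views can answer both questions; something along these lines (assuming the default 8 KB block size for the byte conversion):

```sql
-- Run time: undo blocks held by each active transaction, with its session/SQL
SELECT s.sid, s.serial#, s.username, s.sql_id,
       t.used_ublk                    AS undo_blocks,
       t.used_ublk * 8192             AS undo_bytes   -- assumes 8 KB blocks
FROM   v$transaction t
       JOIN v$session s ON s.taddr = t.addr
ORDER  BY t.used_ublk DESC;

-- History: undo blocks written per 10-minute interval, including the count
-- of ORA-30036 "out of space" errors (NOSPACEERRCNT) in each interval
SELECT begin_time, end_time, undoblks, txncount, nospaceerrcnt
FROM   dba_hist_undostat
ORDER  BY begin_time DESC;
```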

Regards

pap    

--

Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com

--
http://www.freelists.org/webpage/oracle-l
Received on Sun Mar 21 2021 - 16:27:34 CET
