Re: Poor Performance of undo space management in 10.2 - bug info

From: Daniel Fink <daniel.fink_at_optimaldba.com>
Date: Fri, 15 Feb 2008 11:50:42 -0700
Message-ID: <47B5DF02.4040200@optimaldba.com>


These are actually problems that date back to 9i. The first problem we encountered was the incorrect status of the extent. The second was the excessive growth of the undo tablespace as new undo segments were created in excess of actual transaction load (eventually crashing the instance).

We found (and could reproduce) a situation where extents containing transactions committed less than 1 minute earlier (with a 60 minute undo_retention) were marked as expired. Other extents containing data committed more than 24 hours earlier were still marked as unexpired. We never received an adequate explanation of the behavior.
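
For anyone who wants to look at extent status on their own system, the following is a minimal sketch rather than the exact query we used. It assumes you have access to the DBA_UNDO_EXTENTS view and summarizes undo extents by status (ACTIVE, UNEXPIRED, EXPIRED) for each undo tablespace.

```sql
-- Sketch only: summarize undo extents by status for each undo tablespace.
-- Assumes SELECT access to DBA_UNDO_EXTENTS.
SELECT tablespace_name,
       status,
       COUNT(*)                        AS extent_count,
       ROUND(SUM(bytes) / 1024 / 1024) AS size_mb
FROM   dba_undo_extents
GROUP  BY tablespace_name, status
ORDER  BY tablespace_name, status;
```

Running it a few times during and after a test shows whether the UNEXPIRED total shrinks once undo_retention has passed.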

The second issue was hard to reproduce, as it only occurred on large OLTP systems with very high transaction load (thousands of transactions per second, though only about 2 thousand concurrent transactions) and was very sporadic. New segments were created when existing segments were actually usable (no active transactions in them). Eventually, thousands of undo segments were created, and once created an undo segment is *never* dropped (a real weakness in automatic undo). This was the result of how SMON managed the segments and tracked used/usable segments.
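
A rough way to watch for this kind of growth is to count undo segments by status over time. This is a sketch under assumptions: the tablespace name UNDOTBS1 is only a placeholder for your own undo tablespace, and you need access to DBA_ROLLBACK_SEGS.

```sql
-- Sketch only: count undo segments by status in one undo tablespace.
-- 'UNDOTBS1' is an assumed name; substitute your own undo tablespace.
SELECT status,
       COUNT(*) AS segment_count
FROM   dba_rollback_segs
WHERE  tablespace_name = 'UNDOTBS1'
GROUP  BY status
ORDER  BY status;
```

A segment count that keeps climbing while the transaction load stays flat would be consistent with the behavior described above.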

All in all, automatic undo (from 9.2 on) is pretty good and works well for 99% of the systems out there. As with anything, you do need to be careful when working with a high volume system. If you want to learn more about automatic undo (9i only; I never updated the research or the paper for 10g), you can download my paper on Automatic Undo Internals at http://www.optimaldba.com/papers/AutomaticUndoInternals.pdf

Regards,
Daniel Fink

-- 
Daniel Fink

Oracle Performance, Diagnosis and Training

OptimalDBA    http://www.optimaldba.com
Oracle Blog   http://optimaldba.blogspot.com


John Hallas wrote:

>
> This note is a heads-up on a bug which has caused me some problems over
> the last few days and yet is easily identifiable and resolvable (once
> you recognize the issue).
>
> We have spent the last 2 days trying to run a benchmark to prove end
> to end performance of a new code set. We have been plagued by undo
> tablespace problems which had the following symptoms:
>
> * Rapid growth of used undo tablespace
> * Serious deterioration in performance as the undo tablespace got
> very full
> * Loss of application connectivity as responses were not received
> in time (trading system with about 7 servers being used to hold
> components of the system)
>
>
>
> The complex set up of the test rig tended to mask the database aspect
> on each run and it was only this morning that we really focused on the
> undo tablespace.
>
> We had AUM enabled with a 60 second retention period, and tried
> tablespaces of various sizes on both RamSan and local disk.
>
> The problem was identified as undo extents remaining marked as
> unexpired well past the retention period, despite all connections
> being terminated and no active transactions running.
>
> Searching on Metalink showed Note 5387030.1, which refers to a bug with
> the TUNED_UNDORETENTION setting. This can be seen in v$undostat. Once we
> had run
>
>   alter system set "_smu_debug_mode" = 33554432;
>
> the v$undostat.tuned_undoretention statistic dropped from 345600 seconds
> (4 days) to 2188 seconds (about 36 minutes) and performance improved,
> with unexpired undo segments hardly rising despite heavy throughput.
>
> This bug is common to 10.2.0.1 through 10.2.0.3 and is fixed in 10.2.0.4 and V11.
>
> John
>
-- http://www.freelists.org/webpage/oracle-l
Received on Fri Feb 15 2008 - 12:50:42 CST