
Re: Who's the big transaction?

From: Joel Garry <joel-garry_at_home.com>
Date: 22 Nov 2002 16:27:20 -0800
Message-ID: <91884734.0211221627.718ace97@posting.google.com>


"Richard Foote" <richard.foote_at_bigpond.com> wrote in message news:<0GqD9.81802$g9.230477_at_newsfeeds.bigpond.com>...
> "Joel Garry" <joel-garry_at_home.com> wrote in message
> news:91884734.0211211543.533d0db4_at_posting.google.com...
> > Everyone:
> >
> > I've straightened out a bunch of serious rbs problems (and other
> > stuff, like insufficient memory and swap, some transactions that
> > needed to be broken up, etc.), now I'm getting into the more subtle
> > things.
> >
> > I decided to shrink the rbs's down to their initial 10M sizes (I had
> > changed a lot of other stuff, removed optimal, didn't want to mess too
> > much with this before things stabilized, intended to consider Mr.
> > Rogers excellent posts very seriously when I did adjust them), and see
> > what happens, figuring they'd all grow to some fairly constant size if
> > there is a repetitive large transaction. After a month+, this is how
> > things look:
> >
> > SQL> select usn, extents,hwmsize, shrinks, wraps, extends from
> > v$rollstat;
> >
> >        USN    EXTENTS    HWMSIZE    SHRINKS      WRAPS    EXTENDS
> > ---------- ---------- ---------- ---------- ---------- ----------
> >          0          9     745472          0          0          0
> >         26          5   10641408          0          9          0
> >          1          5   10641408          0         11          0
> >          2          8   17031168          0         15          0
> >          3          5   10641408          0          9          0
> >          4         13   27680768          0          8          0
> >          5          5   10641408          0         10          0
> >          6          6   12771328          0          7          0
> >          7          5   10641408          0         10          0
> >          8         45   95838208          0         57         40
> >          9          5   10641408          0          6          0
> >         10         16   34070528          0         16          0
> >         11         38   80928768          0         10          0
> >         12          8   17031168          0          6          0
> >         13          5   10641408          0          8          0
> >         14          5   10641408          0         11          0
> >
> > 16 rows selected.
> >
> > I was kind of expecting a smoother distribution across the rbs's. So
> > question number 1: Does Oracle consider the rbs size when allocating
> > to a transaction? I was under the distinct impression it did not. I
> > guess it is possible that I did a better job than I expected breaking
> > up the periodic huge updates... if the answer is "no," then Question
> > Number 2 becomes, how do you catch an infrequent huge transaction?
>
> Hi Joel,
>
> No, I don't believe Oracle considers the size of the rollback segment. The
> last time I tested this (which was with 8.1.6), Oracle used a least-busy
> algorithm. That is, it picks the RBS with the fewest concurrent
> transactions. If there is a tie for least busy, Oracle then picks the RBS
> with the next USN after the last RBS selected. This is easily tested by
> looking at the pattern of xidusn in v$transaction as transactions are
> added/removed.

Ahh, thanks. That "next USN" part was what led me to expect a smooth distribution, but what I'm seeing indicates very rare, very large transactions.
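
(For anyone who wants to watch that pattern, a rough sketch of the sort of
query I mean, just counting active transactions per undo segment number in
v$transaction; run it repeatedly while transactions come and go:)

select t.xidusn, count(*) active_txns
from   v$transaction t
group  by t.xidusn
order  by t.xidusn;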

>
> I'm not sure what goes on with 9i, but I would be surprised if the algorithm
> were much different and Oracle didn't spread its load in a similar manner.
>
> With regard to catching large transactions, there are a couple of things to
> consider. Firstly, what do you mean by a large transaction? There are two
> types that can cause your RBS to grow.
>
> Firstly, there's the large transaction in terms of the amount of work that
> it's doing and the amount of undo that it generates. Obviously, if this
> transaction in combination with other transactions concurrently assigned to
> the RBS is more than it can contain, then yes, the RBS will grow. Two options
> here. One is to size all your RBSs to be big enough to cope with your
> largest transaction. That way, it doesn't matter which one it picks, it can
> cope, and you won't incur the possible performance issues of having it
> dynamically grow on you. The second option is to pre-create one big mamma
> of an RBS and use the SET TRANSACTION statement before the large transaction
> to preassign it to the large RBS. There are well-documented pluses and
> minuses with both approaches.

I wasn't really sure which type until recently. I now believe it is the long-running transaction (see usn 2 in the example above), since it happened to crash me (and, I subsequently discovered, I seem to have crashed it too, although trying to reconcile what the user says, what I saw, what I see in the logs, and what I see in the v$ views seems impossible), and there are bigger segment usages at other times. It is possible there have been crashes at those times too, which is why I want to find out when those times are.

The set transaction approach has a big minus of "too much code with its transaction setting buried in a proprietary language."
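
(For the record, the statement itself is simple enough; it just has to be the
first statement of the transaction. RBS_BIG is a made-up segment name here:)

set transaction use rollback segment rbs_big;
-- ... the big update/delete goes here, then:
commit;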

>
> Then there's the large transaction in terms of the time it takes for the
> transaction to actually complete. It doesn't necessarily need to do much
> work, but if such a "blocking transaction" doesn't commit/rollback within
> the time it takes Oracle to whiz around the extents of the RBS, Oracle will
> be forced to grow the RBS. The solution here is obviously to commit/rollback
> as quickly as possible after the logical completion of the transaction.
> An ad-hoc update of a row followed by a 2-hour lunch, followed by a boring
> management meeting, followed by the commit is not recommended.
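
(A sketch for spotting those long-idle transactions: v$transaction.start_time
is a character string, so the date mask below is an assumption that may need
adjusting; the idea is just "show me transactions open for more than an hour":)

select s.sid, s.username, s.program, t.start_time
from   v$transaction t, v$session s
where  t.ses_addr = s.saddr
and    sysdate - to_date(t.start_time, 'MM/DD/RR HH24:MI:SS') > 1/24;
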
>
> >
> > usn 11 and usn 4 grew to their size within a couple of days of the
> > shrinkage. usn 8 did very little until a few days ago.
>
> It only takes one large transaction of either type described above to cause
> these RBSs to grow.
>
> >
> > There had been no ORA-1555's until this morning, when I was running a
> > huge report (usn 2 is the one that blew up...). I hadn't been
> > worrying too much about huge reports since I hope to be putting a
> > reporting instance on another box. But until that actually happens...
> >
> > I am writing a cron that would periodically query v$rollstat, and if
> > it sees a change, run the script from metalink Note:1039126.6 to see
> > who's on what rollback, the idea being a big transaction should be
> > around for at least a few minutes...
> >
> > Anyone have another solution? I'm guessing
> > http://www.ixora.com.au/scripts/rbs.htm#rollback_reuse_time and such
> > are not useful, since the undo is so inconsistent, although I haven't
> > tried it. I see this as the converse problem.
>
> I'm not entirely sure what the problem is that you are trying to resolve. Is
> it that you want all the RBSs to be the same size?

Well, I'm trying to catch the unusual situations that require a large RBS. I want to have more RBSs (that should lower my wraps and reduce contention, right?), so I don't want to size so many of them that big unless I have to. I'm pretty sure most of the problem comes from a particular program, so I can fix that by breaking up the transactions (I haven't done so because it is "fixed in the next release," which I may be upgrading to soon anyway). I just want to catch the things that are not commonly run, or are newly implemented, and have this problem. It could be at 4AM or 7PM (or 7/24 sometime...), so I need to automate to catch the other problem programs. Otherwise I just get vague complaints from users well after the fact.

Note that I'm also expecting some increase in users over the next year, which is why I'm ignoring the current lack of contention according to waits/gets/undos.
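
(What the cron would run is roughly this, dumping anything using a lot of undo
into a little log table. The table rbs_watch and the 1000-block threshold are
both made up; at an 8K block size, 1000 undo blocks is roughly 8M:)

-- one-time setup (hypothetical table):
-- create table rbs_watch (snap_time date, rbs varchar2(30), sid number,
--   username varchar2(30), program varchar2(64), used_ublk number,
--   start_time varchar2(20));

insert into rbs_watch
select sysdate, r.name, s.sid, s.username, s.program,
       t.used_ublk, t.start_time
from   v$rollname r, v$transaction t, v$session s
where  t.xidusn   = r.usn
and    t.ses_addr = s.saddr
and    t.used_ublk > 1000;
commit;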

>
> If so, there are two general solutions. One, like I said before, is to size
> them all to the same large size (in your example, all around 100M) so they
> can then cope with these large transactions. That way, it doesn't matter
> which RBS is chosen, any will cope. Beforehand, I would check your waits/gets
> ratio and determine the appropriate *number* of RBSs. Advantages, you can

Well, remember tftsmstat? :-)

ROLLBACK CONTENTION STATISTICS:
GETS  - # of gets on the rollback segment header
WAITS - # of waits for the rollback segment header

NAME                                WAITS       GETS
------------------------------ ---------- ----------
SYSTEM                                  0      1,870
RBS02                                   0     13,788
RBS03                                   0     16,215
RBS04                                   0     14,647
RBS05                                   0     12,028
RBS06                                   1     14,527
RBS07                                   0     12,455
RBS08                                   0     13,526
RBS09                                   0     29,588
RBS10                                   0     11,876
RBS11                                   0     15,034
RBS12                                   0     13,760
RBS13                                   0     12,194
RBS14                                   0     12,850
RBS15                                   0     13,240
RBS01                                   0     13,912

16 rows selected.

The average of waits/gets is 0%

If the ratio of waits to gets is more than 1% or 2%, consider creating more rollback segments
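
(The same ratio can be pulled straight from the v$ views without the script;
a sketch, nothing fancy:)

select n.name, s.gets, s.waits,
       round(100 * s.waits / greatest(s.gets, 1), 2) pct_waits
from   v$rollstat s, v$rollname n
where  s.usn = n.usn
order  by n.name;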

Another way to gauge rollback contention is:

CLASS                   COUNT
------------------ ----------
system undo header          0
system undo block           0
undo header                 4
undo block                  8


Total requests = 220480

Contention for system undo header = 0%

Contention for system undo block = 0%

Contention for undo header = 0%

Contention for undo block = 0%

If the percentage for an area is more than 1% or 2%, consider creating more rollback segments. Note: This value is usually very small
and has been rounded to 4 places.
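
(Those class counts come straight out of v$waitstat, if anyone wants to skip
the script; something like:)

select class, count
from   v$waitstat
where  class like '%undo%';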

I suppose now you want to know which name is which usn :-)

NAME                                  USN
------------------------------ ----------
SYSTEM                                  0
RBS02                                   1
RBS03                                   2
RBS04                                   3
RBS05                                   4
RBS06                                   5
RBS07                                   6
RBS08                                   7
RBS09                                   8
RBS10                                   9
RBS11                                  10
RBS12                                  11
RBS13                                  12
RBS14                                  13
RBS15                                  14
RBS01                                  26
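
(That mapping comes from v$rollname, by the way; something like

select name, usn from v$rollname order by usn;

reproduces it.)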

> stop worrying about your RBSs growing dynamically (to a point), you can also
> stop worrying about 1555s (to a point). Disadvantages, you will require a
> substantial amount of disk, you will possibly induce a fair amount of extra
> I/O while Oracle writes all these RBS blocks to disk.
>
> Or solution two which is to set an appropriate optimal setting (in your
> example around 10M) and make Oracle automatically shrink back any RBS that
> has a large transaction/volume run against it. That way, all your RBS stay
> at approximately the same size. Check the number of growths/shrinks to make
> sure that 10M is not too small, you don't want an excessive number here.
> Advantages, you save a substantial amount of storage, you potentially reduce
> the amount of I/O associated with RBS (check v$filestat to confirm).

Way less than I expected:

SQL> select * from v$filestat where file#=3;

     FILE#     PHYRDS    PHYWRTS   PHYBLKRD  PHYBLKWRT    READTIM   WRITETIM
---------- ---------- ---------- ---------- ---------- ---------- ----------
         3      13007      68126      13007      68126          0          0

  AVGIOTIM   LSTIOTIM   MINIOTIM   MAXIORTM   MAXIOWTM
---------- ---------- ---------- ---------- ----------
         0          0          0          0          0

This just shows again (well, I'm assuming it would if I ran this before and after the problem times) that it's a "bursty" problem. But thanks, it's something to keep in mind as changes are made.
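
(A refinement worth considering: join v$filestat to dba_data_files so the
check covers every rollback datafile instead of hard-coding file#=3; the
tablespace name 'RBS' below is an assumption:)

select f.file#, d.tablespace_name, f.phyrds, f.phywrts, f.readtim, f.writetim
from   v$filestat f, dba_data_files d
where  f.file# = d.file_id
and    d.tablespace_name = 'RBS';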

> Disadvantages, you have the possibility of RBSs dynamically
> allocating/deallocating extents on the fly (if too frequent, you really need
> to increase optimal), and you increase the odds of 1555s (if too frequent,
> again you really need to increase optimal). You can also use the set
> transaction command with a large RBS to reduce the incidence of your smaller
> RBSs having to cope with large transactions.
>

Turning off optimal seemed to make the most difference with 1555s. But that also makes me think I want more RBS's, even though the normal formulae say no.

A very large optimal would be worse than just occasionally checking the rollstat, eh?
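
(If optimal does go back on, this is the sort of thing to check per the
growths/shrinks advice above; just a sketch:)

select n.name, s.optsize, s.hwmsize, s.shrinks, s.extends, s.wraps, s.aveshrink
from   v$rollstat s, v$rollname n
where  s.usn = n.usn
order  by n.name;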

> >
> > At some point, I've read most all the stuff out there on the subject,
> > but most of it is buried deep within the folds of my brain, haven't
> > refreshed recently. Pointers gladly accepted.
>
> My thoughts, for what they're worth.
>
> Cheers
>
> Richard
>
> >
> > 8.1.7 on 4 processor K570, hp-ux 11.0, some kinda RAID which may be
> > changing shortly. Generally processor bound, will be converting to a
> > bigger and faster box. Packaged "Thin client" finance/manufacturing
> > system with some customization.
> >
> > open_cursors = 7000
> > db_block_buffers = 73728
> > shared_pool_size = 78643200
> > large_pool_size = 614400
> > java_pool_size = 0
> > log_checkpoint_interval = 10000
> > log_checkpoint_timeout = 1800
> > processes = 300
> > db_block_size = 8192
> >
> > No other strange init.ora's set. Yes, open_cursors must be that.
> >
> > TIA
> >
> > jg
> > --
> > @home is bogus.
> > http://www.garry.to

Thanks, Richard. No matter how much one knows, it always helps to keep the basics in sight.

jg

--
@home is bogus
http://www.garry.to
Received on Fri Nov 22 2002 - 18:27:20 CST
