Re: Performance issue while loading large amount of data

From: Howard J. Rogers <howardjr2000_at_yahoo.com.au>
Date: Wed, 15 Jan 2003 06:41:02 +1100
Message-ID: <EzZU9.24111$jM5.64133@newsfeeds.bigpond.com>

"Noons" <nsouto_at_optusnet.com.au.nospam> wrote in message news:Xns93048DC96CDmineminemine_at_210.49.20.254...
> "Howard J. Rogers" <howardjr2000_at_yahoo.com.au> wrote in
> news:EREU9.23382$jM5.62333_at_newsfeeds.bigpond.com and I quote:
>
> >>
> >> What would those 6 bytes be?
> >
> > The rowid of each row being inserted (the "before image" of each new
> > row).
>
>
> Sorry, too variant to be the rowid. I mean: does the 6 bytes/row
> average size stay constant? I do recall Richard suggested
> around 100 bytes/row. I've just repeated some of your tests
> on my notebook with 8.1.7.4 and 8K block size and got 12 bytes/row!
> It's got to be something else other than rowids or else there
> wouldn't be such a huge variation.

The variation is inherent because we're counting blocks. When I did my insert of a 4-digit number into EMP, that took a whole undo block, so it presumably looks like 8K per row! But that's just because I wasn't filling up the block fully. When I repeated the test with a copy of DBA_OBJECTS, which has plenty of rows, the figure becomes more accurate. But you are always going to be left with a mostly-empty block at the end, which will skew the results. You will also have to take into account the fact that undo segment blocks have headers and transaction slots and so forth, reducing the amount of useable space available for the placing of undo.

When you count whole blocks, this sort of variability will unfortunately be inherent.

Now some new tests (see below) indicate (as you've argued) that block header information is *also* recorded, so the more blocks, the more rollback gets generated. So no, it's not *just* the rowid. But (and here's my point) it *includes* each individual rowid too.

>
> Besides, what would the rowid be in undo for? What exactly
> would that achieve on a commit or rollback? All that a
> rollback of an INSERT has to do is poke all the headers of
> blocks inserted into, back to original. And little else.
> Other than dict management, which we are abstracting here.

What's the before image of any DML for? So that you can re-construct what the block used to look like. If you insert one row into a table, chances are it will slot into a block which already has committed rows. By storing the rowid in rollback, you allow Oracle to determine which of the many possible rows it is that is to be reversed out when you say 'rollback'.

>
>
> It sure ain't gonna clean out the data section of every
> block. Look at a rollback from a huge INSERT: it takes no
> time at all. Versus a rollback from a big UPDATE, which
> can take a VERY significant amount of time.
>
>
> It's not like an update either, where there might be a need to
> multi-version the data: inserted data does not exist
> for anyone else until transaction commits.
>

But if you start a long-running query at 10.00am, and I modify a block you are about to query with an insert at 10.03, committed at 10.04 and your server process finally arrives at that block at 10.06, you aren't allowed to see my commit. So yes, we do have to a read-consistent read of that block, even though it was only modified by an insert, and the rollback is most definitely used for that.

[snip]
>
> I can't recall exactly what is changed,
> (and probably it would be wrong now in 9i anyway!)
> but there is a header change in an empty block
> when rows are inserted. That is what needs to be
> in undo. For the remaining inserts in that same
> block by this transaction there won't be another
> entry created in undo. All we want is the original
> header data so that the block can be quickly "rolled back".
>

But that won't work. When you have to prepare, at 10.06, a read-consistent image of a block that's been inserted into, it's not sufficient to roll the entire block to the way it was at 10.00am. You've got to roll some rows back, and not others. It must be at the row level. (Not to say that the header changes have to be recorded as well... I'm the first to admit that it's not *just* the rowid that's recorded. Have a look in v$logmnr_contents to see what sort of administrative overhead there is when *redo* is recorded).

> Do you think we should dump the data blocks to confirm
> the data area is not re-set, just the headers?
> I can probably do that sometime tomorrow in my notebook
> if you want (too late today, bedtime).
>

Go for it.

>
> >
> > (12 8K blocks for fewer records, instead of 40 4K blocks).
> >
> > Oh, and the amount of undo generated is most definitely proportional to
> > the number of records inserted (back to the 9i database!):
>
>
> I'd say proportional to the *number* of blocks that
> have received new data. Rather than the total amount of
> new data inserted.

OK, so how's this as a test:

SQL> create table bigthree pctfree 0 tablespace users 2 as select * from dba_objects where 1=2; Table created.

SQL> create table bigfour pctfree 99 pctused 0 tablespace users 2 as select * from dba_objects where 1=2; Table created.

SQL> insert into bigthree select * from dba_objects; 29362 rows created.

SQL> insert into bigfour select * from dba_objects; 29362 rows created.

SQL> analyze table bigthree compute statistics; Table analyzed.
SQL> analyze table bigfour compute statistics; Table analyzed.

SQL> select table_name, blocks from dba_tables where table_name = 'BIGTHREE' or table_name='BIGFOUR';

TABLE_NAME                         BLOCKS
------------------------------ ----------
BIGTHREE                              742
BIGFOUR                             29373

[So BIGTHREE has about 2.5% the number of blocks that BIGFOUR has.

SQL> insert into bigthree select * from dba_objects; 29362 rows created.
SQL> select used_ublk from v$transaction;

USED_UBLK

SQL> commit;
Commit complete.

SQL> insert into bigfour select * from dba_objects; 29362 rows created.
SQL> select used_ublk from v$transaction;

USED_UBLK

453

Now 39/453 = 8.6%. So clearly the block header business has a role to play, but it's not proportional.

>
> Try what I said before:
> Insert one row.
> Check undo use.
> Insert another row.
> Check undo use, stays constant.

That is not surprising. A transaction doesn't need any extra undo blocks until it's filled the first one completely. A transaction adjusts the head pointer in the rollback segment header block so as to completely reserve a block. No other transaction can make use of that block. But that transaction can make FULL use of that block. The fact that undo use stays constant doesn't indicate that it's only the first modification to the block header that's recorded, merely that there's plenty of room spare to store the other rowids.

> Insert quite a few rows. In my case
> I had to insert 40 or so before the
> first block of data was full.
> used_ublk was *still* the same, 1 block!
>
>
> Now: in my case,
> after inserting all objects I got
> used_ublk = 5 (3200 rows from all_objects).
> For 80 blocks of inserted data.
>
> That to me indicates the determining factor
> in amount of undo use in an INSERT is not the
> number of rowids generated.

See above. It's not the only factor, of course. But the rowids are definitely stored, because it's not just a linear relationship between number of blocks affected by the DML and the amount of undo generated. So it can't be just down to block headers.

[snip]
> OK. Undo generated to store a rowid like you suggest
> is related directly to the INSERT-a-row operation, no?
>
> Undo generated because block headers have changed
> is a side-effect of the INSERT. One for each block
> changed, not one for each row inserted. Same for
> undo generated by dict management (recursive stuff).
>

Don't forget that all data dictionary stuff gets put into the SYSTEM rollback segment, so it's irrelevant for the purposes of this discussion.

>
> >
> >> Nevertheless, it's still way too small to be any worry.
> >>
> >
> > Er, inserting 10 million rows in a table with no indexes whatsoever is
> > going to generate about 58 MEGAbytes of undo.
>
>
> If that. That's nothing, when the partition is nearly
> 20 Gb in size. 60Mb in 20Gb is not even worth
> worrying about.

As I said, it is if you've sized your rollback segments small. The need to acquire extra extents is going to slow things down, whether in DMT or LMT. And that supposes there's room in your tablespace to allow the thing to grow in the first place.

Incidentally, here's what Tom Kyte has to say (page 184 of Expert 1 on 1):

"An insert will generate the least amount of undo since all Oracle needs to record for this is row ID to 'delete'".

Just thought I'd mention it.

Time to start block dumping, I think.

Regards
HJR
> Even if you have 10Mb undo segment
> extent size, that's 6 extents. Piddle stuff.
> Once again, if the size of undo for INSERT is proportional
> to number of blocks actually added to table, then
> this would be highly dependent on block size.
>
> Of course if you want to worry about that, go ahead!
> Me, I had to worry about these guys wanting to keep
> 90 days of this 20Gb stuff on-line. But that's another
> story...
>
>
> > Slap some indexes on
> > before the load, and you can expect that to increase accordingly.
>
>
> Obviously. I never said INSERT into indexed generates
> little undo. In fact it won't increase accordingly:
> it will increase a LOT more!
>
> That is the whole point. Slap an UPDATE
> on your 20000 odd records and watch the undo use
> go through the roof, relatively speaking. Slap an index
> in there and do an INSERT or DELETE and watch the
> undo usage blow up as well. We are now storing much
> bigger undo images.
>
>
> Do just an INSERT without any indexes and it's
> largely irrelevant compared to any other change operation
> in that same data.
>
>
> > Nope. See above. No fundamental difference at all: inserts still
> > generate rollback/undo, regardless of version and regardless of the
> > nature of the tablespace, and regardless of whether you are using 9i
> > undo segments or 8i rollback segments.
>
>
> Yes, but very little compared to any other change operation.
>
>
>
> --
> Cheers
> Nuno Souto
> nsouto_at_optusnet.com.au.nospam
Received on Tue Jan 14 2003 - 13:41:02 CST