Re: Performance issue while loading large amount of data

From: Noons <nsouto_at_optusnet.com.au.nospam>
Date: 15 Jan 2003 14:23:56 GMT
Message-ID: <Xns9305BF2870F1mineminemine@210.49.20.254>

"Richard Foote" <richard.foote_at_bigpond.com> wrote in news:RdcV9.24793$jM5.65715_at_newsfeeds.bigpond.com and I quote:

> determine what the hell Oracle does, then hey hang in there, this might
> useful to you.

Good stuff, Richard! Thanks a lot.

>
> START_UBAFIL UBAFIL START_UBABLK UBABLK USED_UBLK USED_UREC
> ------------ ---------- ------------ ---------- ---------- ----------
> 2 2 1330 1345 15 760

Ahha! Great stuff. Easy to locate the undo blocks now. So that's where it was hiding...

>
> ** the above is now actually the undo data itself !! And what do we find
> ? Oracle actually only stores the row directory data for each inserted
> row as the other data is already known.

Bingo! I knew something was amiss with just the rowids or just the headers, as indicated in my earlier reply. This makes perfect sense. Thanks for clearing this one.

> a couple of bytes for each inserted record. That's how Oracle manages to
> keep the generated undo so small *IF* (big IF) it uses an insert array
> when performing the insert. No rowid as such although admittedly all the
> "portions" of the rowid are recorded somewhere above but not the whole
> rowid for each and every row.

Indeed not once in all this is the rowid ACTUALLY ever saved in undo on an INSERT.
Unlike what the book and the doco imply. It just couldn't be, the numbers were too way out.

The problem was in finding out what *exactly* was stored. Which you have solved.

This is indeed VERY surprising and unexpected. Makes sense though and explains a lot of the weird proportions we were getting. I never thought it would be the row dir for array inserts!

Oh well, one more point to reinforce that mass loads *need* arrays!

I wonder, as an aside: given the subquery is causing a form of array insert, what array size would it use? Fixed or variable, depending on block size?

>
> I've listed 2 undo records here, one for the index and one for the
> table. As you can see, there's a fair amount of undo that needs to be
> recorded for an index. Note also that indexes have branch pages, leaf
> pages that split etc. and all this needs to be recorded.

This would get much worse if the table/index was not empty before, I'd expect? Because then it might be recording b-tree branches and splits etc of the already existing index blocks, depending on what values of keys we were inserting. Correct?

>
> I now carry on and insert 340 rows into this table.

Hope you really got that into a script! :D

>
> I've only listed 3 undo records but as you can see each one corresponds
> to only the one row as we are not using an array insert in this example.
> This means that the large undo record header section is duplicated for
> each and every entry resulting in a vastly increased undo rate per row.

At last the header is saved... I thought I had been imagining things: it was in an old doco I had from another life at Oracle Corp, but I can't find the bugger anywhere!

What I find surprising here is the header being saved so many times. In the same transaction I can't see why that would be needed? I mean, it's not like we'd "partially" rollback, no? Just saving the original block's header once would do the job to reset the block in case of rollback? Or is it in case we have savepoints?

>
> Hopefully this shows that:
>
> 1) Yes undo is generate for each individual row that is inserted
> (hopefully that settles that debate)

Yup, agreed 100%. I just reviewed what I wrote before and my initial assertion of no undo at all was wrong. Should have been "a very small amount of undo". Which I hope I made clear later on.

It's the varying amounts that worried me after the testing. Sometimes very small, sometimes more significant. Don't like that kind of uncertainty. Much clearer now.

>
> 2) If an insert array is used, then proportionally only a small amount
> of undo is generated as only the row directory slot info is actually
> recorded for each row

This is surprising indeed. I expected the header saved, only once, in this case. Not the row dir slot. Makes absolute sense for bulk loads and matches the practical results to a "t".

>
> 3) If a conventional insert is used, then approximately 68 bytes per
> inserted row is used plus some (which hopefully will explain why
> sometimes much 'more' undo appears to be generated proportionally for
> inserts).

Yup, very clear now. And more important: predictable. It also explains some irregularities I got when I was doing one-by-one inserts to get blocks to fill up exactly.

>
> 4) Indexes use (proportionally) a lot of undo for inserts.

As expected. Never doubted it. Easiest thing to verify in a test. It's now much clearer why.

>
> Sorry if I've bored you all to tears but I hope some of this was useful
> to you and any newbies out there.

Bored? Darn bloody useful! Just saved us all a heap of time and got some good unexpected useful info to boot as well!

Thanks a lot for this, Richard. If any1 posts your Wednesday photo somewhere, let me know and I'll send "the boys" around to straighten them up! :D

-- 
Cheers
Nuno Souto
nsouto_at_optusnet.com.au.nospam

Received on Wed Jan 15 2003 - 08:23:56 CST