RE: reduce table fragmentation by design

From: Mark W. Farnham <mwf_at_rsiz.com>
Date: Fri, 28 Jan 2011 12:03:36 -0500
Message-ID: <D2163CEA46FC48808A230BE3283BD76E_at_rsiz.com>



If you know the dynamic range of each numeric column you can add up the possible change maximum in length (basically delta 1 byte for each order of magnitude change in value is close enough, presuming you're not trying to save bytes by initializing zeros to null instead of zero). Since I don't know the variability of your data I cannot calculate an aggressive but safe percent free for you. Since all your columns are either the unchanging key values or numbers, it seems likely that a very aggressively small percent free would be useful for either ASSM or freelists.  

As for pctused with freelists management (for which they have created the acronym MSSM meaning Manual Segment Space Management, though there is nothing manual about it), I'm not sure you would benefit from putting blocks back on the freelist sooner.  

The default is 40 in all releases I'm aware of, and that means blocks become available to receive new rows (having previously been take off by inserts or updates eroding free space to less than the specified percent free) when updates or deletions bring the size utilization below 40 percent. 85 percent would put blocks much more aggressively back on the free list. Since in your original post the complaint was that index probes tended to bring back as many block reads as index probes, this will only make that worse. Since you purge by time, you might want the blocks to be completely empty before they go back on the freelist so the rows also age out and are purged together.  

That would mean a percent used setting so that the smallest possible row you have being deleted drags you below it. Given your earlier metrics your overall storage requirements for the table should float at about 52/45ths of the possible minimum, but rows will be decently clustered by time. Whether this produces a material benefit is something you would have to test from time to time.  

Whether it produces less scatter than ASSM is something you would also have to test case by case unless you had Oracle's exact algorithm in hand and knew enough about the skew in your daily average row insertion and command of the mathematics to make a prediction. I suppose you could check it retrospectively to see where rows are placed by time and figure out whether there was likely an advantage to moving to freelists.  

You've reported that maintenance windows are tough to find; should I interpret that as you use the application's delete protocol that may be row-by-row instead?  

I would think old rows tend to be quiescent, and that by pre-copying the bulk of the rows (perhaps days 45 old through 5 days old or whatever seems to a preponderance of updates), then very briefly having an outage to copy the last 4 days and scan for any changes in the 45 through 5 day window you could keep the outage quite brief. Then it wouldn't matter much whether you use ASSM or freelists.  

Regards,  

mwf


From: LS Cheng [mailto:exriscer_at_gmail.com] Sent: Friday, January 28, 2011 2:54 AM
To: Mark W. Farnham
Cc: Niall Litchfield; tim_at_evdbt.com; oracle-l_at_freelists.org Subject: Re: reduce table fragmentation by design  

Hi

We actully do the CTAS time to time, it reduces response time x8.

But it is become harder to ask for maintenance window in this application that is why I am thinking to do what I have mentioned.

The critical queries are querying by timestamp so may be ordering by the timestamp when insert should be enough. Your last point that non-direct inserts will still cause fragmentation is true, I do have suggested a once per month CTAS ordering by timestamp because shrink table does not solve 100% the honeycomb neither.

Lastly dont you think MSSM would be better suited in this case? If I set a high pctused such as 85 pctfree 10 the block reutilization should be better than ASSM. With ASSM the timestamp index clustering factor is always the same as the number of rows if I recall ASSM tends to reduce block contention by spreading DML in more blocks which tend to cause even more honeycomb problem.

In the long run partitioning or partitioning view is the way to go.

Thanks

--

LSC <snip>  

--

http://www.freelists.org/webpage/oracle-l Received on Fri Jan 28 2011 - 11:03:36 CST

Original text of this message