Re: How to "Defragment" Tablespaces ?

From: Frank Hubeny <fhubeny_at_ntsource.com>
Date: Sun, 15 Apr 2001 00:22:01 -0700
Message-ID: <3AD94C18.F6B1D35@ntsource.com>

There is an article by Juan Laoiza on http://technet.oracle.com/deploy/performance/ called Optimal Storage Configuration Made Easy that appears relevant to the discussion.

He promotes striping at 1 megabyte widths across all the available disks and putting all files on all of these disks. The most frequently used data would be placed on the outer quadrants of the disks to increase performance for both random and sequential accesses by reducing the time to position to the data. The goal is to use all of the capacity of large disks and still reach high performance if multiple cpus are available.

As I recall the notion of separating indexes from tables in order to improve IO performance does not even come up.

Does anyone have experience implementing Laoiza's recommendations?

Frank Hubeny

Ricky Sanchez wrote:

> Brian-
>
> This thread is reaching epic proportions. Regarding the extent size suggestion, that harks
> back to the original question, about fragmentation. Objects can sensibly be placed in the
> same tablespace if they use a common and uniform extent size and thereby avoid concerns
> with fragmentation - particularly "stranded" extents. Every extent given up by an object is
> usable by another object. Not an IO issue, rather a space management convenience.
>
> Regarding the current matter, yes it is probable that where you find activity on a table
> you will probably find activity on its corresponding indexes. My key assertion here is that
> it is the resultant IO on such objects that is of concern, not their logical relationship.
> So, while you might guess that accessing a table will follow access to the index, it is
> only a guess as to how much contention there will be. Instead of supposing anything, simply
> measure the IO.
>
> When an index is accessed as you suggest, block at a time, there are normally several
> random accesses to get to a particular leaf block. Then, the table block access is also
> random. So, a nested loops type of access involves index blocks followed by table blocks.
> Often the same blocks are accessed repeatedly in a large nested loops join.
>
> You mention multiple accesses at the same time. A table and its index are never actually
> accessed at the same time. If there is concurrent access to objects, it is from separate
> users, probably getting to different blocks anyway.
>
> You also discuss dependency between tables and indexes. That dependency is logical, not
> physical. And, as I suggest, never really concurrent.
>
> Since the IO is randomly spread over the device anyway, the real issue is saturation of the
> IO capability of the device in question. As I said, you might guess that IO on an index is
> more or less the same as the logically related table, but it would be only a guess. My
> suggestion is to measure actual IO on objects and balance them by the numbers.
>
> I appreciate your point of view, Brian. Perhaps one day we will have time to further
> explore this, maybe after a nice exhaustive benchmark. In the meantime, best regards.
>
> - ricky
>
> Brian Peasland wrote: [stuff snipped]
>
> >
> > And I should also point out that I agree with you on load balancing the
> > I/O.
> >
> >
> > What does having the same extent sizes have to do with the reason why
> > indexes and tables got in the same tablespace to begin with? Their
> > extent sizes have nothing to do with *why* they are in that tablespace.
> >
> >
> > So you have your points on load balancing, but there is still something
> > missing. For this next example, I'm going to assume that accessing the
> > data from the table is done with an index. No full table scans and no
> > Fast Full Scans on the index. I'm also going to assume that the blocks
> > on disk are not yet in the buffer cache. They have been aged out.
> > Otherwise, this whole exercise is rather moot.
> >
> > In order to satisfy the query, I have to read the index, read the table,
> > read the index, read the table, etc. By going back and forth, this
> > involves the dreaded seek time. It is compounded even more when I have
> > multiple users accessing the same objects (but not necessarily the same
> > blocks). If I separate the index from the table, then User1 can read the
> > index. While User1 is reading the table, then User2 can be reading the
> > index. When User2 goes to read the table, then User1 can be reading the
> > index again. If both the table and index are on the same disk, then they
> > will both contend with one another. If you separate the index from the
> > table, then their contention for the same resource (access to the data)
> > is lessened.
> >
> > In your case about load balancing, you never take this into account.
> > This is because you only look about how often an object is incurring
> > I/O, not how often two object incur I/O *at the same time*! In your
> > case, you are only separating on the basis of the total I/O operations
> >
> > Your method of balancing solely on I/O also is greatly predicated on the
> > fact that each object is accessed independantly of each other. While
> > tables can be independent of each other, there is a great dependence
> > between the index and it's associated table. Unless you are doing a full
> > table scan or fast full scans on the index, then reading the table will
> > also read the index. They are accessed *together*. Since they are
> > accessed together, it makes sense to separate them from each other.
> >
Received on Sun Apr 15 2001 - 02:22:01 CDT