ignore me

From: Richard McAllister <rmacallister_at_kc.rr.com>
Date: Sat, 14 Apr 2001 18:29:26 GMT
Message-ID: <aG0C6.16833$n83.2384863@typhoon.kc.rr.com>

"Brian Peasland" <peasland_at_usgs.gov> wrote in message news:3AD45D1E.42E6E851_at_usgs.gov...
> Ricky,
>
> > Yes, it is true that placing high IO objects on the same drive will
increase
> > contention. We certainly agree on that. I think Osvaldo knew that too,
which is why
> > he split up his vehicles over two highways instead of one. And yes, the
vehicles were
> > different. One group was yellow, the other blue. Rosa's idea. Sort of
like tables and
> > indexes. Some are called tables, the rest are called indexes. And yes,
they each have
> > variations in how they are accessed. Sometimes as individual, random
(sort of) block
> > accesses, other times as multiblock accesses. But, from the standpoint
of
> > performance, they both incur IO. And since IO is the actual bottleneck,
IO is what we
> > need to measure and address. Load balancing, not feathers and crystals.
>
> True, they both incur I/O, but as I already pointed out, not all I/O is
> equal.... Surely you'll agree that there are multiple components to I/O.
> Three major factors contribute to the amount of time it takes to read
> data from disk. They are "seek time", "rotational delay", and "read
> time". Of those three components, "seek time" takes by far the longest
> amount of time (more than rotational delay and read time combined.
>
> And I should also point out that I agree with you on load balancing the
> I/O.
>
> > Let's imagine that it is a single fat drive, just to simplify things.
Further, the
> > indexes and tables all have the same extent sizes, which is how they got
in that
> > tablespace to begin with.
>
> What does having the same extent sizes have to do with the reason why
> indexes and tables got in the same tablespace to begin with? Their
> extent sizes have nothing to do with *why* they are in that tablespace.
>
> > "Yes", I would reply, "but it simply does not matter which you move. The
correct
> > method is to base your decision on actual IO incurred, since IO is what
you are
> > balancing." And that is the crux of the matter.
>
> So you have your points on load balancing, but there is still something
> missing. For this next example, I'm going to assume that accessing the
> data from the table is done with an index. No full table scans and no
> Fast Full Scans on the index. I'm also going to assume that the blocks
> on disk are not yet in the buffer cache. They have been aged out.
> Otherwise, this whole exercise is rather moot.
>
> In order to satisfy the query, I have to read the index, read the table,
> read the index, read the table, etc. By going back and forth, this
> involves the dreaded seek time. It is compounded even more when I have
> multiple users accessing the same objects (but not necessarily the same
> blocks). If I separate the index from the table, then User1 can read the
> index. While User1 is reading the table, then User2 can be reading the
> index. When User2 goes to read the table, then User1 can be reading the
> index again. If both the table and index are on the same disk, then they
> will both contend with one another. If you separate the index from the
> table, then their contention for the same resource (access to the data)
> is lessened.
>
> In your case about load balancing, you never take this into account.
> This is because you only look about how often an object is incurring
> I/O, not how often two object incur I/O *at the same time*! In your
> case, you are only separating on the basis of the total I/O operations
> for the object. So let's say that I have table_a, table_b (neither of
> which have indexes) and table_c with index_c. Now, table_a incurs
> 100,000 read/write operations per day. And table_b incurs 100,000
> read/write operations per day (or whatever time unit you want. it
> doesn't matter). Table_c incurs only 50,000 read/write operations per
> day and index_c incurs ony 50,000 read/write operations per day. If I
> spread these objects over 3 disk volumes (according to your method),
> then I will have table_a on one spindle (a total of 100k r/w), table_b
> on another spindle (a total of 100k r/w) and table_c with index_c on the
> last spindle (a total of 100k r/w). All is balanced by I/O, is it not?
> Don't you see the bottleneck here? I will get better throughput by
> placing the index on a separate drive from the table even if that
> "overloads" one of the spindles a little. This case can easily occur
> because table_a and table_b are summary tables where I look up all the
> information on an occasional basis. But table_c (with index_c) is a
> lookup table which I access often. This case does occur quite
> frequently.
>
> Your method of balancing solely on I/O also is greatly predicated on the
> fact that each object is accessed independantly of each other. While
> tables can be independent of each other, there is a great dependence
> between the index and it's associated table. Unless you are doing a full
> table scan or fast full scans on the index, then reading the table will
> also read the index. They are accessed *together*. Since they are
> accessed together, it makes sense to separate them from each other.
>
> And I have to cast your mind back... You originally said "Never worry
> about indexes and tables having to be on separate drives." I went back
> on Deja to get this quote from your original post into this thread. This
> is what started the whole thing in the first place. "NEVER"? That's a
> pretty strong statement to make.
>
> It seems to me that you had a good idea some time ago about I/O being
> the driving factor between load balancing, not object type driving load
> balancing. But this in and of itself is not set up correctly to debunk a
> "myth". I'm not saying that indexes and tables cannot peacefully coexist
> on the same drive. In fact, I do it on almost every one of my systems. I
> have indexes and tables on the same spindle all the time! But what I
> don't do is put a table's index on the same spindle (unless I have no
> other choice). For instance, I put table_a on disk1. I put table_b on
> disk2. I put index_a (for table_a) on disk2 and index_b on disk1.
>
> So there is my case. Lump it in with the rest!
> Brian
>
>
>
>
> --
> ========================================
> Brian Peasland
> Raytheons Systems at
> USGS EROS Data Center
> These opinions are my own and do not
> necessarily reflect the opinions of my
> company!
> ========================================
Received on Sat Apr 14 2001 - 13:29:26 CDT