Re: sequential disk read speed

From: Brian Selzer <brian_at_selzer-software.com>
Date: Wed, 27 Aug 2008 22:47:50 -0400
Message-ID: <tlotk.19241$xZ.10433_at_nlpi070.nbdc.sbc.com>


"David BL" <davidbl_at_iinet.net.au> wrote in message news:b3a7632f-de18-46e8-8ce3-3c5aaf83d4b9_at_a3g2000prm.googlegroups.com...
> On Aug 24, 12:39 pm, "Brian Selzer" <br..._at_selzer-software.com> wrote:
>>
>> If you have a 100GB database and you put it on a single
>> 100GB disk drive, your best average seek time is the average seek time of
>> the disk drive, but if you put the database on four 100GB disk drives,
>> the best average seek time will only be a fraction of the seek time of
>> the single disk. Suppose that the full-stroke seek time on the 100GB disk is
>> 7ms and the track-to-track seek time is 1ms. Well, with four disks, instead
>> of an average 4ms seek time, the individual seek time of each disk is
>> reduced to roughly 2.5ms
>
> Is this because less of the disk is actually being used so on a given
> platter the head doesn't have such a large range of tracks to move
> over?
>

Yes. And because the outer tracks of a platter generally hold more data than the inner ones, it generally takes fewer tracks to store the same information at the outside of the platter than near the center; consequently, simply dividing the difference between the full-stroke seek and the track-to-track seek by four is perhaps an overly conservative method of estimation.

I want to stress that this is not just a harebrained theory of mine: I've had significant success using this mechanism to boost performance. In one application, by installing a disk that was seven times the size required and creating a partition on the outer edge of the disk, performance improved by over 6000%: batch processes that had been taking over 25 hours to complete were finishing in under 25 minutes.
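For anyone who wants to check the arithmetic, here is a rough sketch of the estimate using the figures from the quoted text (7ms full-stroke, 1ms track-to-track, four disks). The function names are made up for illustration, and the model is an assumption: seek time is treated as growing linearly with distance, and each disk is assumed to use only 1/n of its stroke. Under that model the 2.5ms figure is the cost of crossing the whole used band, which is one reason to call it conservative as an average. The last function assumes the disks seek fully independently.

```python
def single_disk_average_ms(full_stroke_ms, track_to_track_ms):
    """Midpoint of the shortest and longest seek: the 4ms figure."""
    return (full_stroke_ms + track_to_track_ms) / 2


def per_disk_estimate_ms(full_stroke_ms, track_to_track_ms, n_disks):
    """Track-to-track time plus 1/n of the remaining stroke time.

    Assumes a linear seek model and data confined to 1/n of each disk.
    """
    return track_to_track_ms + (full_stroke_ms - track_to_track_ms) / n_disks


def subsystem_estimate_ms(full_stroke_ms, track_to_track_ms, n_disks):
    """Effective seek time if all n disks seek independently (an assumption)."""
    return per_disk_estimate_ms(full_stroke_ms, track_to_track_ms, n_disks) / n_disks


if __name__ == "__main__":
    print(single_disk_average_ms(7, 1))    # one disk holding everything
    print(per_disk_estimate_ms(7, 1, 4))   # each of four disks
    print(subsystem_estimate_ms(7, 1, 4))  # four disks working in parallel
```

With these inputs the three values are 4.0, 2.5 and 0.625, matching the numbers in the thread. A real drive's seek curve is not linear, so treat this as bookkeeping for the estimate, not a prediction.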

>> , and since there are four disks, the average seek
>> time for the disk subsystem is reduced to a quarter of that or roughly
>> .625ms.
>
> In order for the effective seek time to be reduced to a quarter the
> seeking must be independent. To achieve that I think the striping
> would need to be very coarse (e.g. 512KB or 1MB).
>

Drives that support disconnection or some other command queueing mechanism are all that is needed for seeking to be independent.

I think using a coarse stripe is counterproductive: there would be a bigger chance that a read would require an additional seek partway through. Consider: if 3.5 stripes fit on a track in one zone of the disk, then on average roughly every fourth read would require an additional seek to get the remaining half stripe. If, on the other hand, 28 stripes fit on a track, then no additional seeks would be necessary. Even if it were 28.5 stripes instead of 28, one additional seek for every 29 reads is a whole lot better than one for every 4.

Received on Thu Aug 28 2008 - 04:47:50 CEST
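The stripe-versus-track argument can be checked with a small counting sketch. It rests on assumptions that are not in the post: stripes are laid end to end with no padding, every read fetches exactly one aligned stripe, and a read costs an extra seek only when a track boundary falls strictly inside its stripe. The function name and the default stripe count are hypothetical.

```python
from fractions import Fraction
import math


def split_read_fraction(stripes_per_track, n_stripes=3990):
    """Fraction of stripes that straddle a track boundary.

    Stripe i covers the interval [i, i + 1) in stripe units; track
    boundaries sit at multiples of stripes_per_track. Exact rational
    arithmetic avoids rounding trouble with values such as 3.5.
    """
    spt = Fraction(stripes_per_track)
    split = 0
    for i in range(n_stripes):
        # First track boundary that lies strictly beyond the stripe's start.
        boundary = (math.floor(Fraction(i) / spt) + 1) * spt
        if i < boundary < i + 1:
            split += 1
    return Fraction(split, n_stripes)


if __name__ == "__main__":
    for spt in ("3.5", "28", "28.5"):
        print(spt, split_read_fraction(spt))
```

Under this simplified model the split rate comes out to one read in 7 for 3.5 stripes per track, none for 28, and one in 57 for 28.5. Those are lower than the round figures of one in 4 and one in 29 given above, but the comparison points the same way: the more stripes fit on a track, the rarer the extra seek.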
