Re: sequential disk read speed

From: Tim X <timx_at_nospam.dev.null>
Date: Thu, 21 Aug 2008 19:29:26 +1000
Message-ID: <87ej4id8cp.fsf_at_lion.rapttech.com.au>


Darren <anon5874_at_yahoo.com> writes:

> I am learning about database systems, and I am reading a book called
> "Physical Database Design".
>
> It gets to a bit about a large sequential access (e.g. for a full
> table scan), and does the following:
>
> It says: "Since most disk systems use prefetch buffers to speed up
> table scans, we assume a 64 KB prefetch block."
>
> So to calculate the time for a full table scan, it multiplies the
> number of 64 KB blocks by the time it takes to seek and read (2.02 ms).
> In other words, it seeks to each 64 KB block.
>
> Why can a disk only read 64KB at a time? Is this a valid assumption?
> Is this a disk limitation or a file system limitation?
>

This is a valid 'assumption', but it does not mean that this is all any disk can do. To make their point, the authors have to pick some value, and 64 KB is as good as any other. In reality, it will depend on the hardware, the way data is written to the disk, the speed of the host, the bus, the type of data transfer technology, what the system is optimised for, and so on. Things actually get even more complex because most DBMSs do a certain amount of their own caching as well.

What the authors are attempting to do is provide a clear, abstract explanation that doesn't get overly complex. This is also why, when it comes to working out performance and tuning the system (at all levels), it is essential to use the available tools, and why most large enterprise-level databases provide tools to assist in calculating this type of thing.
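To make the book's arithmetic concrete, here is a rough sketch of the model as I read it: the table is read in 64 KB prefetch blocks, and each block costs a fixed 2.02 ms (seek/rotational delay plus transfer). The 1 GB table size below is a made-up example, not a figure from the book.

```python
PREFETCH_KB = 64      # prefetch block size assumed by the book
MS_PER_BLOCK = 2.02   # assumed time to position and read one block

def scan_time_ms(table_kb):
    """Estimated full-table-scan time in milliseconds under this model."""
    blocks = table_kb / PREFETCH_KB
    return blocks * MS_PER_BLOCK

# A hypothetical 1 GB table:
table_kb = 1024 * 1024
print(f"{scan_time_ms(table_kb) / 1000:.1f} s")  # about 33.1 s
```

Of course, this is exactly the kind of back-of-envelope figure that real-world caching, readahead and transfer rates will push around, which is the book's point in picking a single abstract number.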

I'm not familiar with the book in question, but I suspect they are about to explain how things like record sizes and available indexes can impact performance, and possibly to show why the often-held belief that indexes always make things faster can be misleading or just plain wrong. If it's a good book, it will emphasise the importance of gathering hard figures and stats in order to optimise performance, and how dangerous 'rules' regarding optimisation and performance can be. It is partially due to the complexities and variations involved that you don't get many databases that can successfully tune themselves for performance automatically.

Tim

-- 
tcross (at) rapttech dot com dot au
Received on Thu Aug 21 2008 - 11:29:26 CEST
