Re: how to build a database from scratch
Date: Fri, 08 Dec 2006 13:41:01 GMT
Message-ID: <NXdeh.444867$1T2.95427_at_pd7urf2no>
Sampo Syreeni wrote:
> On 2006-12-07, paul c wrote:
>
>> Physical pages were fed to logical access tasks that performed queries
>> and updates solely to memory. Logical tasks would wait for physical
>> pages if they were not in cache.
>
>
> Sure, but this doesn't explain how the problem with parallel updates to
> a B-tree is solved. For example, if you lock the whole tree and the
> logical task which acquired the lock blocks waiting for physical I/O,
> suddenly all of the other tasks are excluded from the tree (not to
> mention anything that is transitively dependent on them because of page
> sharing and 2PL ordering), no new I/O goes in flight, and the system
> takes a nasty hit in exposed disk latency. Suddenly it's not at all
> "fast". If you try to solve the problem by asynch prefetching the blocks
> and only locking the pages when they're physically present, you'll have
> to guarantee that two processes fetching the same page will eventually
> synchronize on the same copy, which means that you're just moving the
> problem and not solving it, while at the same time you're possibly tying
> a lot of stuff, like the disk cache layout, to your chosen locking
> scheme. The design also cannot handle transactions with working sets
> larger than the physical memory, and it's quite possible that even after
> that problem is fixed, the system will not regain its former stability
> under memory pressure. Et cetera.
> ...
This is getting further and further away from the original question (or at least from the single answer "use a btree" to it), which I understood to be: does a BTREE algorithm need to change to allow concurrency? The answer is no, provided that tasks or processes independent of the btree ones are used, e.g., a logical lock manager (one that locks values, not blocks), and that a transaction's updates are flushed together.
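To make that concrete, here is a minimal sketch in Python of what I mean by a lock manager that locks values rather than blocks, with a transaction's updates buffered and flushed together at commit. All the names (ValueLockManager, Txn, and so on) are invented for illustration, not taken from any product; treat it as a sketch of the idea, not an implementation:

    import threading
    from collections import defaultdict

    class ValueLockManager:
        # Locks logical values (e.g. key values), not physical blocks,
        # so the btree code stays entirely unaware of concurrency.
        # (A real manager would make locks reentrant per transaction.)
        def __init__(self):
            self._guard = threading.Lock()
            self._locks = defaultdict(threading.Lock)

        def acquire(self, txn, value):
            with self._guard:
                lock = self._locks[value]
            lock.acquire()              # growing phase: only acquire
            txn.held.append(lock)

        def release_all(self, txn):
            for lock in reversed(txn.held):
                lock.release()          # shrinking phase: only release
            txn.held.clear()

    class Txn:
        def __init__(self):
            self.held = []
            self.pending = []           # updates buffered until commit

        def update(self, mgr, value, change):
            mgr.acquire(self, value)
            self.pending.append((value, change))

        def commit(self, mgr, flush):
            flush(self.pending)         # the transaction's updates are
            mgr.release_all(self)       # flushed together, then unlocked

Holding every lock until commit makes this strict 2PL; how two transactions avoid deadlocking on each other's values is a separate question, which is where the ordering sketched below comes in.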
>> Query tasks obeyed a strict 2PL protocol and multiple table updates
>> were ordered to prevent deadlock.
>
>
> 2PL does make life easy, but you can't use it unless you know precisely
> what data is going to be touched by a transaction when it begins. When
> you have this information available, it's nice and it should be
> exploitable, but in a typical environment which also allows e.g.
> operational ad hoc reporting, that data is not known beforehand.
> ...
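To be concrete about the ordering I meant: it is only the usual trick of acquiring locks in one fixed global order, which, as you say, presumes the transaction's working set is known up front. A rough standalone sketch (Python again, table names invented):

    import threading

    # One lock per table; any fixed, agreed-upon order will do.
    table_locks = {t: threading.Lock() for t in ("orders", "stock", "audit")}

    def update_tables(touched, apply_change):
        # Every transaction sorts the tables it will touch and locks
        # them in that order.  No two transactions can then hold locks
        # in opposite orders, so no wait cycle (deadlock) can form.
        held = [table_locks[t] for t in sorted(touched)]
        for lock in held:
            lock.acquire()
        try:
            for t in sorted(touched):
                apply_change(t)
        finally:
            for lock in reversed(held):
                lock.release()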
>> There was a small minority of applications that weren't amenable to a
>> simpler system. Simpler systems should be the goal; they are almost
>> always stronger, safer and more economical.
>
>
> Yes, but when you have to handle a certain, relatively common OLTP
> workload, your architecture simply chokes and cannot be amended in an
> easy and principled manner. There is also nothing in your argument that
> isn't already considered by the existing literature: when you know your
> workload a priori, it's well known that a specialized architecture can
> lead to simpler and higher-performing systems; DBMSs built specifically
> for short write transactions with small working sets already utilize
> techniques similar to what you're advocating; special-purposing as a
> general principle is accepted by many; and software architecture has by
> no means been neglected by DBMS designers of the past.
> ...
A decent designer will know the workload a priori.
> It's just that we were talking about the problems that those same DBMS
> designers have bumped into, and considered worth solving. You on the
> other hand are not solving them, but simply assuming them away.
>
>> (BTW, I don't understand why some DBMS people are so interested in
>> parallel processors; they are mainly the hardware manufacturers'
>> reaction to the increasing gap between processor and memory speeds.
>> Not many real applications need parallel hardware.)
>
>
> Technically parallelism is necessary at least for latency hiding
> (memory, network and disk), avoidance of diminishing marginal returns in
> high speed serial operation (both CPU and I/O), redundancy and
> availability on OTC hardware which is designed for price at the cost of
> failing often, sharp and fast, and algorithmic cost amortization over
> large numbers of small, independent transactions. Application-wise,
> you'll need it for scalability to high throughputs in broad scale OLTP,
> for geographically dispersed operation, and for multiuser sharing. All
> of these are real, pressing issues in the breadwinner DBMS apps, like
> financial services, reservation and planning, and O&M.
>
> You should also note that the interest in parallelizability is not
> limited to explicitly parallel hardware. In a sense you can equate the
> study of parallelism with studying various kinds and levels of
> dependency between decomposed computations and computational substrates.
> This is why your average DBMS running on a single processor will already
> benefit greatly from the insight that peripherals operate in an
> inherently parallel fashion, and that this built-in parallelism can be
> exploited for higher performance. The same goes e.g. for treating set
> oriented operations as parallel processes which can be optimized under a
> common parallel framework.
Fair enough, but I see no reason why the btree algorithm itself needs to recognize latency; that can be left to the tasks around it.
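To illustrate, a toy sketch (Python, all names invented) of the split I described earlier: only a separate physical task ever sees disk latency, while the logical btree task just asks the cache and waits:

    import queue
    import threading

    page_cache = {}                     # page id -> page bytes
    cache_cond = threading.Condition()
    fetch_q = queue.Queue()             # page ids for the physical task

    def physical_task(read_page):
        # The only code that ever touches the disk; runs in its own
        # daemon thread, e.g.:
        #   threading.Thread(target=physical_task,
        #                    args=(my_read_fn,), daemon=True).start()
        while True:
            pid = fetch_q.get()
            data = read_page(pid)       # blocking disk read
            with cache_cond:
                page_cache[pid] = data
                cache_cond.notify_all()

    def get_page(pid):
        # What the logical (btree) task calls.  It never knows or cares
        # whether the page came from memory or disk; it just waits.
        with cache_cond:
            if pid not in page_cache:
                fetch_q.put(pid)
                while pid not in page_cache:
                    cache_cond.wait()
            return page_cache[pid]

Nothing in the btree code changes; latency is the physical task's problem. (Two logical tasks that miss on the same page may queue it twice; the second read is harmless, though a real system would dedupe in-flight fetches.)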
p