Re: how to build a database from scratch

From: DBMS_Plumber <paul_geoffrey_brown_at_yahoo.com>
Date: 8 Dec 2006 20:33:16 -0800
Message-ID: <1165638796.541256.175020_at_80g2000cwy.googlegroups.com>


paul c wrote:
> The topic continues to change, without much explanation!

The thread is titled "how to build a database from scratch". The OP asked:

"....i have never
seen any books talks about how to build a database from scratch. any one know this kind of book even exists?

most books and info are talking about modeling data and sql language. "

This whole thread sub-tree is a response to Joachim Pimiskern saying:

"Most relational database systems are based upon Bayer trees."

and helpfully pointing us to the write-up on wikipedia.

My own mild follow up to Joachim was intended to warn folk that a B-Tree built the way wikipedia tells you too will be thrown out and re-written from scratch because of DBMS requirements.

> It's a two-way street you know, you must
> offer some information like some of the deep thinkers here who I
> regularly profit from, in return for your challenges, rather than
> demanding that your premises be accepted as fact.

  Halloween Problem.

  Table < ID INT PK, A INT>
  B-Tree < A INT >

     UPDATE Table
            SET A = A+5
      WHERE A > 10;

  The naive b-tree algorithm will 'update' each tuple in the B-Tree where A > 10 by moving the record forward in the B-Tree, to a point beyond the current scan point. Then the scan progresses, and 're-discovers' the record, with it's new (higher) value of A. Take the naive algorithms on wikipedia, and the scan will never terminate.

  What to do? Several solutions. You can descent the tree to find the end-point of the scan, mark it, and then new inserts after the end-point won't be considered. You can keep a list of all the new entries you add and check each time. The point is that any solution to this problem involves modifying the basic B-Tree algorithms in non-obvious ways, and space / time / coding trade-offs which are quite well understood.

[snip]

> Perhaps you
> could check the handle you are responding to? (EG., are you sending
> messages to me or to this group?

I am responding to a thread in the theory group. The theory we are discussing in this thread is not relational theory, But then, the group is not called comp.databases.relational theory. What we are talking about is a body of of powerful, generalized abstractions.

At the very least, I hope that the discussion has conveyed something of the very real challenges that DBMS implementors face. I can't wait to see someone talk about how to perform query processing! Received on Sat Dec 09 2006 - 05:33:16 CET

Original text of this message