Re: how to build a database from scratch

From: paul c <toledobythesea_at_oohay.ac>
Date: Fri, 01 Dec 2006 14:26:16 GMT
Message-ID: <cYWbh.407027$R63.241582_at_pd7urf1no>


Joachim Pimiskern wrote:
> <ctx2002_at_gmail.com> wrote:
>
>> search on google, you see a lot of info and books talking about how to
>> build operating systems and compilers from scratch, but i have never
>> seen any books talks about how to build a database from scratch. any
>> one know this kind of book even exists?
>
>
> Most relational database systems are based
> upon Bayer trees.

That answer is so partial as to be misleading. Books about specific implementations have appeared from time to time but were not best-sellers. At one time, a very few university bookstores would stock one or two copies. The ones I knew of have since gone downhill and now concentrate on titles about stuff like Java (all it does is replicate in software what a fairly pedestrian conventional piece of hardware does).

Even though much of os/compiler theory applies, thinking about computers is generally caught up in what platforms allow (mostly interface methods whose theory is either non-existent or unexplained), which distracts from realizing that dbms's are more central to such an obvious use of computers than most hardware/software designers have realized. For example, most dbms's have had to invent their own concurrency mechanisms because the ones supplied by the os's were extremely lacking in foresight.

There is one book that offers well-thought-out criteria for determining whether a dbms exploits relational theory in a logical / consistent way. It is by Hugh Darwen and C.J. Date; see thethirdmanifesto.com for links.

In the past, I've seen others with different slants, such as pure set theory. The Darwen and Date book is, as such, a kind of recipe for building a dbms, but be warned: it assumes a reader who has a broad grasp of current theory. Some people say that C.J. Date's Introduction to Database Systems would still be valuable if everything but the lengthy references were removed. It offers comments with each reference, which can save quite a lot of time.

The technical underpinnings of what is possible are only to be found in scattered sources (usually academic but sometimes commercial) such as the ACM TODS and others, concerning data structures, data models, concurrency theory, access methods, query techniques, hardware architecture/logic and more hirsute mathematical sources, set theory and such. There were some quite good AI books by people like Nilsson back in the 1980s that in fact had a lot to do with database theory, but I'd guess most are out of print since AI has gone out of mainstream fashion. Nothing wrong with google, because it will eventually point to all these sources if you spend enough time on it.

The thing is, most of the commercial expositions provided by the several software monopolies hardly encourage a potential developer to define just what dbms means for his particular purpose, whereas a few months or years looking for sources will likely result in one becoming somewhat familiar with pretty much every topic in computer science.

Even if one takes the time for all that research, as a practical matter, I think the only way to come to one's own conclusions is to write several specific dbms apps, specific in the sense that they handle only one application and purposely omit many of the features found in commercial dbms. Then one will have a better idea of what one is up against.
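To give a rough idea of what such a single-purpose app might look like, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration (the TinyStore name, the JSON-lines log format, the parts example); it is not taken from any of the books mentioned. It keeps one table of records in an append-only file, rebuilds an in-memory index on open, and purposely omits transactions, concurrency control, a query language and secondary indexes.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical single-application store: one table, keyed by a string.

    The whole "database" is an append-only log of put/del entries plus an
    in-memory dict that is rebuilt by replaying the log when the file is opened.
    """

    def __init__(self, path):
        self.path = path
        self.index = {}  # key -> latest value
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as log:
                for line in log:
                    self._apply(json.loads(line))

    def _apply(self, entry):
        # Replay one log entry against the in-memory index.
        if entry["op"] == "put":
            self.index[entry["key"]] = entry["value"]
        else:  # "del"
            self.index.pop(entry["key"], None)

    def _append(self, entry):
        # Write to the log first, force it to disk, then update memory.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps(entry) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self._apply(entry)

    def put(self, key, value):
        self._append({"op": "put", "key": key, "value": value})

    def delete(self, key):
        self._append({"op": "del", "key": key})

    def get(self, key):
        return self.index.get(key)


def demo():
    # Store two records, delete one, then reopen the file to show that
    # the state survives a restart.
    path = os.path.join(tempfile.mkdtemp(), "parts.log")
    store = TinyStore(path)
    store.put("p1", {"name": "nut", "city": "London"})
    store.put("p2", {"name": "bolt", "city": "Paris"})
    store.delete("p1")
    reopened = TinyStore(path)
    return reopened.get("p1"), reopened.get("p2")
```

Even a toy like this runs into the questions a real dbms has to answer: what happens with two writers at once, how the log gets compacted, and how to ask for anything other than a lookup by key.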

I think what you really need, rather than one book, is a couple of alternative syllabi. Maybe the academics here can suggest some that are more comprehensive than my scattered suggestions.

p

Received on Fri Dec 01 2006 - 15:26:16 CET