Re: how to build a database from scratch

From: David Cressey <dcressey_at_verizon.net>
Date: Sat, 09 Dec 2006 14:17:28 GMT
Message-ID: <Yzzeh.720$_55.280_at_trndny09>


"DBMS_Plumber" <paul_geoffrey_brown_at_yahoo.com> wrote in message news:1165627470.476954.76390_at_16g2000cwy.googlegroups.com...

> Describe a B-Tree in language other than the language of data
> structures, algorithms, and procedural code. Describe its theoretical
> properties in terms other than those of complexity theory. We are not
> talking about how to implement 'a database' - but how to implement a
> bloody Database MANAGEMENT SYSTEM - to wit, a tool for implementing
> databases with.
>

...and this brings us back to the OP, who asked about books on building a database from scratch.
Did the OP really mean "how to build a database from scratch" or did he mean "how to build a DBMS from scratch"?
If the OP ever clarified this, I missed the post.

Let's say that the real question is "how do you build a database from scratch", without using or building a DBMS. Well, you could use B-trees without necessarily implementing them yourself. Two examples should suffice:

One is to use one of the B-tree libraries that are out there, and build it into your application. The library package will make the B-tree persistent by storing it in a file. It may store many trees in one file, or one tree to a file, or whatever. You can use the B-tree to provide rapid lookup of records or sets of records in one or more files. The records contain the data that presumably make up the "database" you wanted to build from scratch.
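To make the idea of an index over a record file concrete, here is a minimal sketch in Python. It is not a B-tree: a sorted in-memory list searched with `bisect` stands in for the B-tree library, since both keep keys in order and allow logarithmic lookup. The class name `IndexedFile`, its methods, and the one-record-per-line file layout are all assumptions made for illustration. Keys are assumed unique and records are assumed to contain no newlines.

```python
import bisect
import os


class IndexedFile:
    """Append-only record file plus a sorted key index.

    The sorted list is a stand-in for a B-tree (an assumption of this
    sketch): it maps each key to the byte offset of its record.
    Unlike a real B-tree library, the index here is not persisted.
    """

    def __init__(self, path):
        self.path = path
        self.keys = []     # keys, kept in sorted order
        self.offsets = []  # offsets[i] is the file offset of the record for keys[i]
        open(path, "ab").close()  # create the data file if it does not exist

    def put(self, key, record):
        """Append a record and register its offset under key."""
        with open(self.path, "ab") as f:
            f.seek(0, os.SEEK_END)
            offset = f.tell()
            f.write(record.encode("utf-8") + b"\n")
        i = bisect.bisect_left(self.keys, key)
        self.keys.insert(i, key)
        self.offsets.insert(i, offset)

    def get(self, key):
        """Return the record stored under key, or None if absent."""
        i = bisect.bisect_left(self.keys, key)
        if i == len(self.keys) or self.keys[i] != key:
            return None
        with open(self.path, "rb") as f:
            f.seek(self.offsets[i])
            return f.readline().rstrip(b"\n").decode("utf-8")

    def keys_between(self, lo, hi):
        """Return the keys k with lo <= k <= hi, in sorted order."""
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        return self.keys[i:j]
```

A real B-tree package would also store the index itself on disk and keep it balanced under inserts and deletes, which is exactly the work you are buying by using the library.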

A second way is to use an indexed file tool. When I built my first "database" (accepting the newsgroup definition of a database), I used VAX RMS files. DEC RMS files could be built as indexed files, and the indexes were, I believe, B-trees.

Another thing you might want to do is to combine data by restrict, project, join, intersect, and union. Actually, I mean the functional equivalent of these operators, although they might go by very different names in a given application development environment.
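As a rough illustration of what "functional equivalent" can mean in a general-purpose language, here is a sketch of the five operators in Python, treating a relation as a list of dicts with identical keys. The function names and the list-of-dicts representation are my assumptions, not any particular tool's API. Attribute values are assumed to be hashable and mutually comparable.

```python
def restrict(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [r for r in rows if predicate(r)]


def project(rows, attrs):
    """Keep only the named attributes, dropping duplicate rows."""
    seen = set()
    out = []
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(attrs, key)))
    return out


def join(left, right):
    """Natural join on the attribute names the two relations share."""
    if not left or not right:
        return []
    common = sorted(set(left[0]) & set(right[0]))
    index = {}  # join-key values -> matching rows from the right side
    for r in right:
        index.setdefault(tuple(r[a] for a in common), []).append(r)
    return [
        {**l, **r}
        for l in left
        for r in index.get(tuple(l[a] for a in common), [])
    ]


def _as_set(rows):
    """Represent each row as a hashable, order-independent tuple."""
    return {tuple(sorted(r.items())) for r in rows}


def union(a, b):
    """Rows that appear in either relation, without duplicates."""
    return [dict(t) for t in sorted(_as_set(a) | _as_set(b))]


def intersect(a, b):
    """Rows that appear in both relations."""
    return [dict(t) for t in sorted(_as_set(a) & _as_set(b))]
```

With `emp = [{"emp": "ann", "dept": 10}, {"emp": "bob", "dept": 20}]` and `dept = [{"dept": 10, "name": "ops"}]`, `join(emp, dept)` pairs ann with ops and drops bob, since no department row matches.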

When I built my "first database" (really a collection of files), I used DEC Datatrieve. For a tool developed in the 1970s, Datatrieve was surprisingly advanced. If I remember correctly, it did not have project or union when I started using it, but it wasn't hard to implement those two, when needed, in a traditional programming language.

I don't know what tools one would use today in place of Datatrieve, but every now and then there are discussions in this newsgroup of tools like "dataphor" and others. I'll leave discussion of those tools to those who have used them, or who want to learn about them. I just want to say that a tool that can do joins, intersections, and unions between two sets of data stored in different databases managed by mutually isolated DBMSes is a useful thing.
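For what it's worth, the cross-database join can be sketched without any special tool. The following Python uses two separate in-memory SQLite databases as stand-ins for two mutually isolated DBMSes. The table names, the sample rows, and the helper `cross_db_join` are hypothetical, and the helper assumes that the first column of each query result is the join key.

```python
import sqlite3

# Two independent databases; neither connection can see the other's tables.
customers_db = sqlite3.connect(":memory:")
orders_db = sqlite3.connect(":memory:")

customers_db.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
customers_db.executemany(
    "INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")]
)

orders_db.execute("CREATE TABLE orders (cust_id INTEGER, item TEXT)")
orders_db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "disk"), (2, "tape"), (1, "cable"), (3, "card")],
)


def cross_db_join(left_conn, left_sql, right_conn, right_sql):
    """Hash join of two query results from different connections.

    Assumes the first column of each result is the join key. The
    right-hand result is loaded into a dict, then probed once per
    left-hand row.
    """
    index = {}
    for row in right_conn.execute(right_sql):
        index.setdefault(row[0], []).append(row[1:])
    return [
        left + rest
        for left in left_conn.execute(left_sql)
        for rest in index.get(left[0], [])
    ]


result = cross_db_join(
    orders_db, "SELECT cust_id, item FROM orders ORDER BY item",
    customers_db, "SELECT id, name FROM customers",
)
```

Here `result` holds one tuple per order that has a matching customer. The order for customer 3 is dropped, as in an inner join. A real tool would also have to worry about pushing restrictions down to each DBMS rather than fetching whole tables.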

It's not clear to me that what you end up building is really a "database", regardless of how hilarious Gene thinks this is. It's also not clear to me that you are "starting from scratch".

While I'm on the subject, yes, I'll back off from defining a database as necessarily being shared. But I will say that a database MANAGEMENT SYSTEM that can't handle multiple concurrent transactions is not even a toy DBMS. And I'll also say that sharing is the raison d'etre of a real database. Of the numerous projects that used a database and never obtained the promised "bang for the buck" that is the subject of Dawn's lament, I'll suggest that a lot of them could have been implemented with files, equally well. Those provide a poor sample for an "empirical study" of the merits or demerits of using a DBMS.

Received on Sat Dec 09 2006 - 15:17:28 CET
