Re: Object databases beat joins (was: Re: ODMG Website?)

From: Carl Rosenberger <carl_at_db4o.com>
Date: Sat, 17 May 2003 19:59:17 +0200
Message-ID: <ba5sji$vih$00$1_at_news.t-online.com>


Mikito Harakiri wrote:
> > Imagine you have a database with 2 billion of hydrogen atoms, joined
> > together to one billion of hydrogen molecules. If you have one atom
> > and want to find the corresponding one, your index solution will have
> > to walk the complete tree consisting of 2 billion of index entries.
> > They may not fit into RAM, I am afraid. Alright then, if you have
> > a perfectly balanced tree on your hard disk, you will need at least
> > 31 (2 ^ 31 = 2147483648) reads to your disk and 31 compare operations.
> > An object database typically references objects directly. Access time
> > is constant:
> > Walk one pointer!
> > ...no matter if you have 10 atoms, 2 billion, or 42 fantastillions.
> > You can get away with one singel read operation.
> > That's the best performance you can get.
> >
> > In huge databases, index construction and maintenance can also become
> > a performance problem.
>
> Wrong arithmetics. A typical B-tree having 1G nodes will be 4 levels deep. 3
> levels would typically be cached. Therefore, we have 1 random IO, the same
> as in the OODB case where the foreigh key reference is expressed explicitly
> by a pointer.

How many comparison operations will you have on average?

I don't think the "we have 3 levels in the cache" argument will hold for the type of applications I was originally referring to: http://www.slac.stanford.edu/slac/media-info/20020412/database.html

How will your response times evolve with your B-tree approach, if you have more data? It will increase!

Direct pointer access performance is constant, flat, always the same: FAST!
...no matter how much data you store.

Agreed, my arithmetics on the number of read operations that you need with a B-Tree approach were wrong.

Some weirdos reading along are trying to be funny: "Yeah with clustering, we need zero access to the storage medium." In this case I guess, with good enough clustering a relational database can work without a storage medium. Gee, I wish I had such a Zen-database to carry around in the 5th dimension of my pocket.

For those of you (Adrian) arguing that a relational database can also implement foreign keys as pointers: In this case you are violating the no.1 relational principle that values (the logical representation) should be independant of the physical representation.

The advantage of using direct pointers in object databases, that works best for huge databases also works best for limited ressources and limited storage capacities: PDAs, mobile phones, embedded devices, JavaCard

An object database that uses direct pointers to allow navigating through objects will achieve the best performance with the least ressource consumption. Indices will not be necessary.

Kind regards,
Carl

--
Carl Rosenberger
db4o - database for objects - http://www.db4o.com
Received on Sat May 17 2003 - 19:59:17 CEST

Original text of this message