Re: Object databases beat joins (was: Re: ODMG Website?)
Date: Sat, 17 May 2003 19:59:17 +0200
Message-ID: <ba5sji$vih$00$1_at_news.t-online.com>
Mikito Harakiri wrote:
> > Imagine you have a database with 2 billion of hydrogen atoms, joined
> > together to one billion of hydrogen molecules. If you have one atom
> > and want to find the corresponding one, your index solution will have
> > to walk the complete tree consisting of 2 billion of index entries.
> > They may not fit into RAM, I am afraid. Alright then, if you have
> > a perfectly balanced tree on your hard disk, you will need at least
> > 31 (2 ^ 31 = 2147483648) reads to your disk and 31 compare operations.
> > An object database typically references objects directly. Access time
> > is constant:
> > Walk one pointer!
> > ...no matter if you have 10 atoms, 2 billion, or 42 fantastillions.
> > You can get away with one singel read operation.
> > That's the best performance you can get.
> >
> > In huge databases, index construction and maintenance can also become
> > a performance problem.
>
> Wrong arithmetics. A typical B-tree having 1G nodes will be 4 levels deep. 3
> levels would typically be cached. Therefore, we have 1 random IO, the same
> as in the OODB case where the foreigh key reference is expressed explicitly
> by a pointer.
How many comparison operations will you have on average?
I don't think the "we have 3 levels in the cache" argument will hold for the type of applications I was originally referring to: http://www.slac.stanford.edu/slac/media-info/20020412/database.html
How will your response times evolve with your B-tree approach, if you have more data? It will increase!
Direct pointer access performance is constant, flat, always the same:
FAST!
Agreed, my arithmetics on the number of read operations that you
need with a B-Tree approach were wrong.
...no matter how much data you store.
For those of you (Adrian) arguing that a relational database can also implement foreign keys as pointers: In this case you are violating the no.1 relational principle that values (the logical representation) should be independant of the physical representation.
The advantage of using direct pointers in object databases, that works best for huge databases also works best for limited ressources and limited storage capacities: PDAs, mobile phones, embedded devices, JavaCard
Kind regards,
Carl
-- Carl Rosenberger db4o - database for objects - http://www.db4o.comReceived on Sat May 17 2003 - 19:59:17 CEST