Re: Object databases beat joins (was: Re: ODMG Website?)

From: Mikito Harakiri <mikharakiri_at_ywho.com>
Date: Fri, 16 May 2003 09:25:14 -0700
Message-ID: <9o8xa.11$vh6.27_at_news.oracle.com>


"Bob Badour" <bbadour_at_golden.net> wrote in message news:ZIYwa.88$226.18740260_at_mantis.golden.net...
> "Mikito Harakiri" <mikharakiri_at_ywho.com> wrote in message
> news:ewVwa.12$MU1.103_at_news.oracle.com...
> > In general, we can't be certain how many layers of indirection are between
> > the data stored on disk and query output. We might think that blocks x and y
> > are collocated, but filer has striping. Also, storing records with the same
> > join key value in the same block might be good for that particular join
> > order, but may adversely affect other queries.
>
> Absolutely. And creating an index means there is redundant information that
> must be maintained. And adding physical pointers means there are redundant
> structural artifacts to maintain. etc.

Short memory on my part. You explained that to me once already: your idea is maintaining implicit pointers, not collocating matching parent and child records together. [Physical] clustering is a confusing name for it.

> Perhaps your optimizer is, but I am not convinced this has to be.

Agreed. The optimizer's decision to use an implicit pointer vs. an alternative doesn't seem to be a big deal. When I referred to optimizer complexity, I had the enormous difficulty of the task in mind: the combinatorial explosion of access paths, the impossibility of deriving intermediate cardinalities correctly, etc.
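
To give a feel for the combinatorial explosion, here is a rough sketch that counts join orders for n relations. It uses the textbook counts (n! left-deep orders, (2(n-1))!/(n-1)! bushy trees when left/right operand order matters) and ignores the choice of access method per table, which enlarges the search space further. The function names are made up for illustration.

```python
from math import factorial


def left_deep_orders(n: int) -> int:
    """Number of left-deep join orders for n relations: n!."""
    return factorial(n)


def bushy_orders(n: int) -> int:
    """Number of bushy join trees for n relations, counting the
    left/right operand order: (2(n-1))! / (n-1)!."""
    return factorial(2 * (n - 1)) // factorial(n - 1)


# Print how quickly the plan space grows with the number of relations.
for n in (2, 5, 10):
    print(n, left_deep_orders(n), bushy_orders(n))
```

Even before cardinality estimation enters the picture, exhaustive enumeration is already out of reach for a query joining a few dozen tables.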

> Tell me, what is the physical difference between clustering 25 binary
> relations on a common key vs. storing a 26-ary relation with all non-key
> columns nullable? It seems to me they should have identical cost estimation
> models.

Agreed. Evaluating the cost of an access path via pointer dereference is no harder than evaluating an access path via a join index.
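
A toy sketch of that point, under assumptions that are mine and not taken from any real optimizer: each parent row reaches its children either by following a stored pointer (one block read per child) or by probing a join index (a descent of `index_height` blocks, then one block read per child). Caching is ignored. Both estimates are the same kind of arithmetic over the same statistics.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    parent_rows: int             # rows in the outer (parent) relation
    children_per_parent: float   # average number of matching child rows
    index_height: int            # levels in the hypothetical join index


def cost_via_pointer(s: Stats) -> float:
    # One block read per child row reached through a stored pointer.
    return s.parent_rows * s.children_per_parent


def cost_via_index(s: Stats) -> float:
    # One index descent per parent row, then one block read per child.
    return s.parent_rows * (s.index_height + s.children_per_parent)


# Hypothetical statistics, chosen only to exercise the two formulas.
s = Stats(parent_rows=1000, children_per_parent=4, index_height=3)
print(cost_via_pointer(s), cost_via_index(s))
```

The two functions differ by a single term, so the estimation effort is comparable either way.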

Terminology, again. Or perhaps you can illustrate why access through an implicit pointer can be called clustering?

> > Finally, clustering is only important for sequential-read devices (aka
> > disks), and would progressively become less relevant as soon as
> > random-access persistent storage (solid state disks, etc) becomes more
> > common.
>
> It has potential benefits for any block-read device that operates with
> significant latency or at a significantly slower speed than the cpu.

My misunderstanding here, again. In a randomly accessed storage model, dereferencing an [implicit] pointer is one operation, while navigating a tree from the root to a leaf takes several.

Received on Fri May 16 2003 - 18:25:14 CEST
