Re: Reinventing the TransRelational Model?

From: falcon <shahbazc_at_gmail.com>
Date: 12 Apr 2006 16:45:05 -0700
Message-ID: <1144885505.430796.128280_at_g10g2000cwb.googlegroups.com>


My understanding of the UB Tree (that's the latest from Prof. Bayer as far as I know) is that he adds this zig-zag method on top of an ordinary B-Tree. The zig-zag method itself seems to be a scheme where certain attributes are decomposed into individual bits, those bits are then mixed together (first take the first bits from each attribute, then append the second bit from each attribute, etc.). There is, apparently some mathematical property which shows that these interlaced bits strings allow for for effecient searches. In other words, these interlaced bit strings allow fairly effecient multi-column indexes. Again, this is my very crude understand, I haven't studied this stuff in a year or two, so take this (barely coherent description) with a boulder of salt :)

paul c wrote:
> falcon wrote:
> > I have been trying to understand the TransRelational model myself. I
> > don't think simply having column based tables make the TR model.
> > Sybase IQ has column based table. KDB has column based tables (Dennis
> > Shasha calls them Arrables or array tables). MonetDB is not only based
> > on 'columns' but there is an actual open source implementation
> > (http://monetdb.cwi.nl).
> >
> > The most interesting research I have seen is something called DODO.
> > There are some papers:
> > --The Dodo Query Flattening System
> > (http://www.ub.utwente.nl/webdocs/ctit/1/0000010d.pdf )
> > --A general approach to query flattening
> > (http://db.cs.utwente.nl/Publications/PaperStore/db-utwente-42D3A95D.pdf
> > )
> > There are some others which I can't find right now.
> >
> > -Falcon
> >
>
> I too suspect whatever it does is not entirely specified in the patents.
> I vaguely remember something to the effect of the "method" not being
> affected by different implementations that "somebody skilled in the art"
> might think of. Apparently (I read somewhere) the patent lawyers avoid
> words like 'algorithm' because those can't be patented.
>
> Still from the patents, my limited understanding is that it certainly
> goes into more detail than just describing a column-based store. The
> zig-zag structure/method is certainly unusual but I don't know if it's
> novel. I also don't think that indexing all values is novel nor
> combining an index with links that replace the common row-based impl'ns.
> When I looked at trm from the point-of-view of a memory-based 'store',
> as opposed to the disk-based strawman that was critized, I could see no
> way to avoid some pretty heavy updating overhead, unless a lot more
> pointer plumbing was used, plus a fair amount of churning for
> projections on tables with lots of columns. My desire was to minimize
> storage as well as to avoid a garbage collection component. Maybe there
> are clever ways to deal with the updating overhead that are beyond me
> but I concluded that for ten million rows and hundred million values and
> a bunch of btrees, i could get away with less than 5 byte storage
> overhead per value, getting the 'adjacency' effect for free, albeit at
> increased cost to add or delete columns.
>
> (In the course of this I also had in mind to see if I could leverage the
> much-critized Intel segment feature, for example, if one's database is
> say, 200 MB in size, a pointer needs only 28 bits, not 32. From what I
> can understand, the current operating systems prevent one from using
> segments in appl'n code - what was originally a retrograde feature would
> be useful today if there was a little more foresight - too bad.)
>
> I also noticed that Mr. Bayer, co-inventor of the btree which was never
> patented, has applied for a patent on his clever new 'all-column' (my
> words) index - i forget the name but it might be 'z+tree'. Where would
> we be today if he'd patented the btree or if Mr. Hoare had patented
> quicksort?
>
> The ctree project seemed pretty derivative to me, for example it's
> occurred to lots of people to stick compression routines on top of
> indexes.
>
> pc
Received on Thu Apr 13 2006 - 01:45:05 CEST

Original text of this message