Re: TRM - Morbidity has set in, or not?

From: Frank Hamersley <terabitemightbe_at_bigpond.com>
Date: Tue, 16 May 2006 05:52:47 GMT
Message-ID: <PMdag.4787$S7.88_at_news-server.bigpond.net.au>


J M Davitt wrote:
> Paul Mansour wrote:

>> J M Davitt wrote regarding inverted indexes:
>>
>>> This has huge significance: all values in a date domain covering
>>> hundreds of years require fewer than 100,000 values.  Time-of-day
>>> precise to a second requires only 86,400 values.  Given these domains,
>>> adding records to a system would require no new values -- the domains
>>> can be established before the first data arrive
>>

[..]

> TRM, on the other hand, would maintain exactly one ordered set of
> values for the domain and everything referencing the same date
> would refer to the same value. Indices aren't really needed. Index
> maintenance - the dreaded B-tree "rotate the root" operation - would
> never occur. Sure, as birth dates are corrected and licenses are
> renewed, the value a given record refers to would change -- but the
> values remain undisturbed and there's no need for index maintenance.

That seems OK. Without presupposing the physical form of the mapping structures, it is akin to the surrogate-key decomposition into "domain" tables seen in MS Access et al. That is acceptable at the physical level, but theoretically quite suspect at the logical level, which is where it commonly appears.
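
To make the comparison concrete, here is a rough Python sketch of a pre-built date domain with records that hold only references into it. Everything named here (START, DOMAIN, ref_of, licences, the 200-year range) is my own invention for illustration; it says nothing about how TRM actually lays out its structures.

```python
from datetime import date, timedelta

# Hypothetical domain: every date in a fixed range, built once, in order.
START = date(1900, 1, 1)
DOMAIN = [START + timedelta(days=i) for i in range(365 * 200)]

def ref_of(d):
    """Return the position of d in the ordered domain (the record's reference)."""
    ref = (d - START).days
    if not 0 <= ref < len(DOMAIN):
        raise ValueError("date outside the pre-built domain")
    return ref

# Records hold references into the domain, not date values.
licences = {"A123": ref_of(date(2006, 5, 16))}

domain_size_before = len(DOMAIN)
# Renewing a licence re-points the record; the domain itself is untouched.
licences["A123"] = ref_of(date(2007, 5, 16))
```

The update changes which value the record refers to, while the set of values stays exactly as it was built.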

> There is, of course, a trade-off: the record reconstruction table
> has to be maintained. That's significant work and the techniques for
> doing it efficiently are, AFAIK, a closely-held secret. (Not every-
> thing's covered by the patent, you know. When you apply for a patent,
> you have to tell the world how you did it. Some of the most
> profitable industrial secrets are not patented.)

No slight intended, but they wouldn't be secrets if they were patented :-).

Does the TRM patent (or related writings) hint at how this record reconstruction (RR) occurs when performing queries framed at the logical level?

Thinking aloud, I am guessing that for some queries, e.g. "this_date = that_date", you don't even have to refer to the "domain" elements, because equality implies an identical physical "pointer" (a term of convenience; perhaps there is a better choice?). Similarly, "this_date > that_date" can be answered without dereferencing if the domain's physical pointers are themselves ordered. Therefore, if all the supported operations can be performed at the physical level, RR can be deferred to the last step of materialising the outcome.
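
A minimal Python sketch of that guess, assuming the "pointer" is nothing more than an index into a sorted list of distinct values. The names (DOMAIN, REF, rows, materialise) are hypothetical and the real TRM structures may look nothing like this.

```python
from datetime import date

# Hypothetical ordered domain: sorted distinct values; a "pointer" is an index.
DOMAIN = sorted({date(2006, 5, 14), date(2006, 5, 15), date(2006, 5, 16)})
REF = {value: i for i, value in enumerate(DOMAIN)}

# Rows store (this_date, that_date) as domain references only.
rows = [
    (REF[date(2006, 5, 16)], REF[date(2006, 5, 16)]),
    (REF[date(2006, 5, 16)], REF[date(2006, 5, 14)]),
    (REF[date(2006, 5, 14)], REF[date(2006, 5, 15)]),
]

# The domain is sorted and duplicate-free, so comparing references
# gives the same answer as comparing the values they refer to.
equal_rows = [r for r in rows if r[0] == r[1]]    # this_date = that_date
greater_rows = [r for r in rows if r[0] > r[1]]   # this_date > that_date

def materialise(row):
    """Record reconstruction, deferred until the result is produced."""
    return tuple(DOMAIN[ref] for ref in row)

result = [materialise(r) for r in greater_rows]
```

Both predicates are evaluated on integers alone; the domain is only consulted in materialise, at the very end.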

One question that pops up though: how do you represent a relation where no single attribute is a candidate key, e.g. [Competitor;Date;Score]?

Cheers, Frank
