Re: TRM - Morbidity has set in, or not?
Date: Sun, 14 May 2006 04:12:02 GMT
Message-ID: <m6y9g.3558$S7.1767_at_news-server.bigpond.net.au>
J M Davitt wrote:
[..]
> Like the "inverted hierarchy?" Yes, that is old enough to be well
> known. The TRM difference is that a value in a column appears
> exactly once -- no matter how many times it appears in the
> representation. A further point not made is that each value need
> appear only once in a domain. In other words, if there are many
> columns holding date values, with the same value appearing in not
> only one but many columns, the value needs to be stored only once in
> TRM.
>
> This has huge significance: all values in a date domain covering
> hundreds of years require fewer than 100,000 values. Time-of-day
> precise to a second requires only 86,400 values. Given these domains,
> adding records to a system would require no new values -- the domains
> can be established before the first data arrive. Social Security
> numbers? There are far fewer than 10^9 possible. License plate
> numbers? What, 36^6 or 36^8 -- times 50? That's not a big gulp.
> Names? Far fewer than one might think. If domains such as these
> are enumerated before the system requiring the database is turned on,
> it could conceivably operate for years without seeing a "new" value.
> The benefit in the physical layer -- which is where all commercial
> products now have trouble when "big data" come to the party --
> is that the space required for storing values becomes a mere tiny fraction
> of what modern systems require. 1/1,000,000 is not an unreasonable
> expectation.
But none of this is completely novel: some DBMSs use techniques like this for compression in the physical layer, and the user of the relation is not aware of it. AFAIK Teradata uses this in a limited form with fixed-length strings. I suspect, however, that they don't gather values universally for a domain (for example, every date within a single database).
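
To make the distinction concrete, here is a rough Python sketch of a domain-wide value store, where every column drawing on the same domain shares one copy of each distinct value. This is only an illustration of the general idea; it is not how TRM or Teradata actually implement it, and the names `Domain`, `encode`, `decode` and the sample `orders` rows are all made up for the example.

```python
from datetime import date


class Domain:
    """Hypothetical per-domain store: each distinct value is kept once."""

    def __init__(self):
        self._values = []  # code -> value
        self._codes = {}   # value -> code

    def encode(self, value):
        # Return the existing code, or assign the next one for a new value.
        code = self._codes.get(value)
        if code is None:
            code = len(self._values)
            self._values.append(value)
            self._codes[value] = code
        return code

    def decode(self, code):
        return self._values[code]

    def __len__(self):
        return len(self._values)


# One date domain shared by two different columns.
dates = Domain()
may14 = date(2006, 5, 14)
may15 = date(2006, 5, 15)

# Rows hold only small integer codes, not the date values themselves.
orders = [
    {"ordered": dates.encode(may14), "shipped": dates.encode(may15)},
    {"ordered": dates.encode(may15), "shipped": dates.encode(may15)},
]

distinct_stored = len(dates)                        # values held in the domain
references = sum(len(row) for row in orders)        # codes held in the rows
first_shipped = dates.decode(orders[0]["shipped"])  # look a value back up
```

Here four column entries refer to only two stored date values. A per-column dictionary would instead hold May 15 once for "ordered" and once again for "shipped", which is the limited form I suspect existing products use.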
[..]
Cheers, Frank. Received on Sun May 14 2006 - 06:12:02 CEST
