Re: database integrity

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 4 Jun 2005 12:28:35 -0700
Message-ID: <1117913309.990987.256580_at_g47g2000cwa.googlegroups.com>


> What makes it denormalized? Methought we had a temporal issue,
> speculation being committed as fact.

A fact isn't speculation because it's subject to change. A woman is known as Mary Smith while growing up, but might change her name to Mary Jones when she marries Mr. Jones on her 25th birthday. Would we say of her at age 21 that it is only speculation that her name is Mary Smith?

Likewise, we can look at the current regulations, and say that patent #1234, granted on 1/1/2000, is scheduled to expire on 1/1/2020. This isn't speculation about a condition that we claim will hold on a future date; it's a fact about when a future event is *currently* scheduled to take place.

But the real problem here is the fact that this data is *derived* data. The 1/1/2020 date is uniquely determined from two inputs: the 1/1/2000 date and the function that calculates expiry based on current legislation. The issue date is primary data; the expiry function is likewise primary; the expiration date is derived. The fact that we're storing data that can be computed from other data we're storing makes this denormalization. That one of the inputs that generates the derived date is a complex function and not a simple SQL datatype obscures this issue, but doesn't change its essential nature. Code is a type of data.

This doesn't become an issue of temporal data until and unless we have a business requirement to be able to answer questions such as, "on 1/1/2001, what was the date that patent #1234 was scheduled to expire?" In that case, we could store either past versions of the experation date, or past versions of the function. (The former would probably be easier given that we have only poor tools for supporting versions of functions.)

Marshall Received on Sat Jun 04 2005 - 21:28:35 CEST

Original text of this message