Re: Storing derived and derivable data

From: Kenneth Downs <knode.wants.this_at_see.sigblock>
Date: Fri, 05 May 2006 19:07:30 -0400
Message-Id: <ue2ti3-u0a.ln1_at_pluto.downsfam.net>


dawn wrote:

>
> Interesting perspective. I think of SQL Views as permitting derived
> data, including a means to include an attribute from a function that
> accesses a stored procedure, for example.

If there is anyplace where we could nail down the precise formulation of how views support derived data, then c.d.t. is the place.

In my book, you begin with the ability to specify derived data, that is the "theory" part of it. Views can be used to implement some derived data, but not all. They have two drawbacks in practice. First, they become unwieldy as they get complex. Second, they cannot create new rows.

Another drawback is more of a balance question. Materializing derivations within the transaction lengthens the transaction, takes more space, and increases in some cases the chances of contention. But then the reads are fast. With a view, the transactions are kept shorter, at the price of an ever-increasing read cost as the complexity increases.

Finally, SQL itself is a piss-poor language in which to store the authoritative fundamental definition of derived data, so the view statement itself can't be your method of definition. You need a better language in which to define the variations, and then you decide whether to use a view as an implementation method.

>
> I like most of what you wrote in your normalization + automation
> column, with only minor quibbles about the normalization issue. I,
> too, believe that we should treat our metadata, including business
> rules, as data.

I wish I were more persuasive on this point. It seems so self-evident to me that I often find myself at a loss to explain it.

> Those writing database-independent software and those
> writing IDEs typically do this, each reinventing the wheel, of course.
> Obviously our theory needs to include derived data, but my read on the
> RM is that it does, just not "in base tables." So, it separates
> "non-derived" and derived along set lines, rather than along attribute
> lines. That is a mistake, in my opinion.

If I understand what you are saying, you want to see the derived data become attributes of the key, or, columns. Me too. In super-plain English, a derived value is just another column in the table.

>
> Anecdote: In an environment where derived data can be added to a
> relation, a SQL-trained data modeler wanted to normalize the derived
> data into the mix, at which point I had a little go-round on that with
> the upshot that now the derived data are not related to the key, the
> whole key, and nothing but the key. There is no need to normalize
> derived data. I think that might be one of your points here too.

I don't understand what each of you were proposing. How did he want to normalize the derived data?

>
> I do know that asking some relational modeling folks how to handle
> atomated columns (yup, that term works great!) puts them way outside
> their comfort zone where there should never be any redundancy in data.

yeah, they have forgotten what normalization is for.

> Of course it is that kind of thinking that prompted people to take the
> whole bloomin' database and port it to another one, reshaping it,
> deriving stuff, etc, for reporting purposes. Talk about redundancy!

Ha! LOL!

-- 
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth_at_(Sec)ure(Dat)a(.com)
Received on Sat May 06 2006 - 01:07:30 CEST

Original text of this message