Re: Storing derived and derivable data

From: dawn <dawnwolthuis_at_gmail.com>
Date: 5 May 2006 20:47:21 -0700
Message-ID: <1146887241.025807.311180_at_i40g2000cwc.googlegroups.com>


Kenneth Downs wrote:
> dawn wrote:
>
> >
> > Interesting perspective. I think of SQL Views as permitting derived
> > data, including a means to include an attribute from a function that
> > accesses a stored procedure, for example.
>
> If there is anyplace where we could nail down the precise formulation of how
> views support derived data, then c.d.t. is the place.

love the optimism :-)

>
> In my book, you begin with the ability to specify derived data, that is the
> "theory" part of it. Views can be used to implement some derived data, but
> not all. They have two drawbacks in practice. First, they become unwieldy
> as they get complex.

agreed

> Second, they cannot create new rows.

Hmm. I guess I do think about derived data as columns rather than rows. Other than aggregate rows, I'm having trouble coming up with derived rows -- can you give an example?

> Another drawback is more of a balance question. Materializing derivations
> within the transaction lengthens the transaction,

Yes.

> takes more space,

a whole lot less than if you port a lot of your data elsewhere for reporting

> and
> increases in some cases the chances of contention. But then the reads are
> fast. With a view, the transactions are kept shorter, at the price of an
> ever-increasing read cost as the complexity increases.
>
> Finally, SQL itself is a piss-poor language in which to store the
> authoritative fundamental definition of derived data,

I'm sure that's true. I did work on a project to define the language used to specify derived data before, but only at the very start where we could argue the approaches. Too much fun.

> so the view statement
> itself can't be your method of definition. You need a better language in
> which to define the variations, and then you decide whether to use a view
> as an implementation method.

Of course I don't use SQL views to implement anythiing (when not using a SQL DBMS)

> >
> > I like most of what you wrote in your normalization + automation
> > column, with only minor quibbles about the normalization issue. I,
> > too, believe that we should treat our metadata, including business
> > rules, as data.
>
> I wish I were more persuasive on this point. It seems so self-evident to me
> that I often find myself at a loss to explain it.
>
> > Those writing database-independent software and those
> > writing IDEs typically do this, each reinventing the wheel, of course.
> > Obviously our theory needs to include derived data, but my read on the
> > RM is that it does, just not "in base tables." So, it separates
> > "non-derived" and derived along set lines, rather than along attribute
> > lines. That is a mistake, in my opinion.
>
> If I understand what you are saying, you want to see the derived data become
> attributes of the key, or, columns. Me too. In super-plain English, a
> derived value is just another column in the table.

Yes. That is precisely how they are implemented in a database environment I'm trying not to mention.

> > Anecdote: In an environment where derived data can be added to a
> > relation, a SQL-trained data modeler wanted to normalize the derived
> > data into the mix, at which point I had a little go-round on that with
> > the upshot that now the derived data are not related to the key, the
> > whole key, and nothing but the key. There is no need to normalize
> > derived data. I think that might be one of your points here too.
>
> I don't understand what each of you were proposing. How did he want to
> normalize the derived data?

The specific case I recall is a "file" with a composite key, each of which being a dimension that one might use for slicing and dicing. The file didn't have much base data in this file that would be a relationship file in an ERD among the three parts of the key. I wanted to put derived data related to just one part of the key (each part) in the file. This turned the file into a virtual cube. The DBA said it was not normalized with virtual fields that were not related to the whole key. (He was a she, by the way)

> > I do know that asking some relational modeling folks how to handle
> > atomated columns (yup, that term works great!) puts them way outside
> > their comfort zone where there should never be any redundancy in data.
>
> yeah, they have forgotten what normalization is for.
>
> > Of course it is that kind of thinking that prompted people to take the
> > whole bloomin' database and port it to another one, reshaping it,
> > deriving stuff, etc, for reporting purposes. Talk about redundancy!
>
> Ha! LOL!

Good to have you back. Cheers! --dawn Received on Sat May 06 2006 - 05:47:21 CEST

Original text of this message