Path: dp-news.maxwell.syr.edu!spool.maxwell.syr.edu!news-spur1.maxwell.syr.edu!news.maxwell.syr.edu!postnews.google.com!i39g2000cwa.googlegroups.com!not-for-mail
From: "dawn" <dawnwolthuis@gmail.com>
Newsgroups: comp.databases.theory
Subject: Re: Storing derived and derivable data
Date: 5 May 2006 05:12:19 -0700
Organization: http://groups.google.com
Lines: 70
Message-ID: <1146831139.454696.295070@i39g2000cwa.googlegroups.com>
References: <1145622076.958951.174100@t31g2000cwb.googlegroups.com>
   <injqi3-q7o.ln1@pluto.downsfam.net>
NNTP-Posting-Host: 207.199.203.250
Mime-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
X-Trace: posting.google.com 1146831146 27764 127.0.0.1 (5 May 2006 12:12:26 GMT)
X-Complaints-To: groups-abuse@google.com
NNTP-Posting-Date: Fri, 5 May 2006 12:12:26 +0000 (UTC)
In-Reply-To: <injqi3-q7o.ln1@pluto.downsfam.net>
User-Agent: G2/0.2
X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.3) Gecko/20060426 Firefox/1.5.0.3,gzip(gfe),gzip(gfe)
Complaints-To: groups-abuse@google.com
Injection-Info: i39g2000cwa.googlegroups.com; posting-host=207.199.203.250;
   posting-account=fLxfag0AAADzmFPc_IBdM-stGY4WCEYa
Xref: dp-news.maxwell.syr.edu comp.databases.theory:39219


Kenneth Downs wrote:
> dawn wrote:
> > Is there database theory that includes identification of
> >
> > 1. the fact that values for an attribute either were or could have been
> > derived?
> > 2. how values for an attribute were derived?
> > 3. how values for an attribute could have been derived?
> >
>
> Dawn, hello!  Long time no see.

Welcome back.  Seems like you were out for the academic calendar.

> I saw your post and decided it was time to end my long exile, you have
> asked the questions that are at the heart of my company and project.
>
> In order to even begin to deal with this question, you have to deal with
> the 800lb gorilla called "Normalization", which forbids derived data.

Interesting perspective.  I think of SQL Views as permitting derived
data, including a means to include an attribute from a function that
accesses a stored procedure, for example.

> I
> wrote an essay on that some months ago, which is here:
>
> http://docs.secdat.com/index.php?gppn=Normalization+and+Automation
>
> The states basically that derived data is part of life, and since
> normalization can't deal with it, normalization is not enough to guide
> serious application development.  But you also can't throw it away,
> because it gives good things.

Great stuff!  I like the name you (or others?) gave this concept.
There is a distinction between derived columns/attributes and what you
called automated columns.  Very good.  That is what I'm asking about --
stored derived data, which I will now call "automated" (even if I call
them attributes instead of columns) instead of "stored derived" based
on

http://docs.secdat.com/index.php?gppn=Review+of+Automations

I like most of what you wrote in your normalization + automation
column, with only minor quibbles about the normalization issue.  I,
too, believe that we should treat our metadata, including business
rules, as data.  Those writing database-independent software and those
writing IDEs typically do this, each reinventing the wheel, of course.
Obviously our theory needs to include derived data, but my read on the
RM is that it does, just not "in base tables."  So, it separates
"non-derived" and derived along set lines, rather than along attribute
lines.  That is a mistake, in my opinion.

Anecdote: In an environment where derived data can be added to a
relation, a SQL-trained data modeler wanted to normalize the derived
data into the mix, at which point I had a little go-round on that with
the upshot that now the derived data are not related to the key, the
whole key, and nothing but the key.  There is no need to normalize
derived data.  I think that might be one of your points here too.

I do know that asking some relational modeling folks how to handle
atomated columns (yup, that term works great!) puts them way outside
their comfort zone where there should never be any redundancy in data.
Of course it is that kind of thinking that prompted people to take the
whole bloomin' database and port it to another one, reshaping it,
deriving stuff, etc, for reporting purposes.  Talk about redundancy!

Cheers!  --dawn

