Re: Storing derived and derivable data

From: Kenneth Downs <ken_at_secdat.com>
Date: Fri, 05 May 2006 00:00:05 -0400
Message-ID: <f7vqi3-aoo.ln1_at_pluto.downsfam.net>


dawn wrote:
> Is there database theory that includes identification of
>
> 1. the fact that values for an attribute either were or could have been
> derived?
> 2. how values for an attribute were derived?
> 3. how values for an attribute could have been derived?

As for the theory, I have never seen one, so I made up my own. It is described here:

http://docs.secdat.com/index.php?gppn=Review+of+Automations

There are no proofs, just desriptions. This means it is not a theory but only the suggestion of a theory. But the code is downloadable and it runs :)

>
> For example, if a system is to be written that accepts US zip codes and
> populates city and state based on the zip, storing all three values,
> must the code for the derivation and the fact that this is derived data
> be known only through the code?

My system would specify that this way:

table zips {

    column zip9 { primary_key: Y; }
    column { city; }
    foreign_key { states; }
}

table anytable {
   foreign_key { zips; }
   column city {
       automation_id: FETCH;
       auto_formula: zips.city;

   }
   column state {
       automation_id: FETCH;
       auto_formula: zips.state;

   }

}

In the Andromeda implementation any attempt to write to anytable.city or anytable.state would give the error "Direct writes not allowed to derived column -city-" (or state as the case may be).

If you wanted your program to know which columns were safe to write to, you look in the data dictionary. The Andromeda UI uses this to make columns read-only so things are clear to the user.

>
> If there are materialized attributes, such as a student GPA, where the
> data should never be collected by any application and should only be
> derived, is there a way to specify or even ensure that it is derived
> (then materialized) data? Is there any notation that works with
> derived, but stored, data any differently than any other attributes?

In Andromeda:

table students {

    column student_id { primary_key: Y; }     column credits_taken {

       automation_id: SUM;
       auto_formula: schedule.credits_taken;
    }
    column credits_earned {
       automation_id: SUM;
       auto_formula: schedule.credits_earned;
    }
    column cum_gpa {
      chain calc {
        test { compare: _at_credits_earned = 0; return: 0; }
        test { return: _at_credits_earned / @credits_taken; }
      }

    }

}

table schedule {

   foreign_key students { primary_key: Y; }
   foreign_key courses  { primary_key: Y; }
   column credits_taken {
     automation_id: FETCH;
     auto_formula: courses.credits;

   }
   column credits_earned { description: Numerical Grade; } }

>
> I also have not seen anything in conceptual modeling techniques,
> including ORM (I'm not an expert on that), to collect information about
> what is or can be derived from what in the conceptual model. Does
> anyone have suggestions in that area? Thanks. --dawn
>

My own experience is that it is best to leave the design of the tables, including the derivations, to human beings, since the operations requires judgement. Once you've got the designs, feed them to the builder to build/upgrade your database. Concentrate on automating the deterministic stuff, but don't bother trying to automate the stuff that requires judgement.

As for ORM, data and code are fundamentally different, trying to match your data tier to code tier by smashing one into the shape of the other smells bad on general principles. I have no reason to pursue it, I've found dictionary-based library techniques produce far more code in less time that accomplishes far more. Received on Fri May 05 2006 - 06:00:05 CEST

Original text of this message