Re: Normalization and Derived Information
Date: Sun, 10 Oct 2004 17:22:23 -0400
Message-ID: <f69ckc.2j9.ln_at_mercury.downsfam.net>
Bill H wrote:
> Kenneth:
>
> "Kenneth Downs" <firstinit.lastname_at_lastnameplusfam.net> wrote in message
> news:8nhbkc.5d7.ln_at_mercury.downsfam.net...
>> Bill H wrote: >> >> > There are some excellent reasons for storing calculated values;
> especially
>> > in accounting and other money handling applications. In fact, one >> > might even go so far as to state it is advisable to maintain duplicate >> > data in order to effectuate financial error control. >> > >> >> Yes, my final fallback is the audit trail. Seeing each column in a chain
> of
>> calculations gives the auditors the warm-fuzzies, they hate it when you >> tell them, "Oh, the gross profit calc is OK, trust me." If I can make >> the customer happy and guarantee correctness, I'll do it and leave the >> quetion >> of theory to another day. In this particular response to Mr. Celko
> however
>> I am pretending that I never want to materialize anything, and trying to >> pursue that through to its final endpoint so that I can discover anything >> I've missed by materializing all these years.
>
> You make an interesting observation. Notice how denormalization has
> occured: the totals for a list of accounting transactions are kept in
> another table.
>
> We can derive the total by adding the list of transactions or we can
> simply
> look at the total field in another table. The audit trail could be
> defined as the list of transactions to support the total but the total
> could be
> derived from the sum of the transactions. On the other hand, if the total
> weren't derived in this manner auditing is enhanced.
>
> Bill
Well, there are four kinds of derivations I've defined for my own use, aggregates are only one of them. I define them as:
- GET or FETCH. eg: copying price from the items table into the order table. Probably the most common form, and most practitioners probably unaware they are denormalizing.
- EXTENSION. extended price = price * qty. A derivation within the row. Also very common.
- CASCADE. eg: Copying a value from a header to every line of the detail table, such as discount percent. Useful if you want to materialize and auditable proration.
- AGGREGATE. Any of sum, count, min, max or avg from the order details, such as "DETAIL_TOTAL".
A summary table, such as a sum of dollar sales by customer and item would require the extra table you describe, so an AGGREGATE may require a new table, but maybe it does not.
Also, you do not have to denormalize to make use of these, a data dictionary can store formulas of this form and generate view definitions. The problem with doing them as views is that you end up with some staggeringly complex view that become impossible to inspect visually, and you can end up with circular view references if you are not careful.
-- Kenneth Downs Use first initial plus last name at last name plus literal "fam.net" to email meReceived on Sun Oct 10 2004 - 23:22:23 CEST