Re: Normalize until neat, Automate until Complete

From: Tony Andrews <andrewst_at_onetel.com>
Date: 25 Nov 2004 05:18:15 -0800
Message-ID: <1101388695.097038.261900_at_f14g2000cwb.googlegroups.com>


Kenneth Downs wrote:
> There are these kinds of column automation, in the order that I find
> them useful:

Most of these should be implemented by the DBMS - and many can be, even with today's SQL products. Where they cannot be, you may consider the pros and cons of a DIY approach, but you need to be aware (as I am sure you are) of the implications for update performance and data integrity.

> 1) Fetch. Retrieval of a column from parent table to child
> table. Triggered by insert/update of foreign key in
> child table. e.g., items.price -> order_details.price.

This is the exception: the implication is that the order_details.price will remain the same even if the items.price is subsequently updated; that is something you would need to take care of yourself (or use a different database design where the items.price was in a history table of some sort).
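
For concreteness, the DIY version of that fetch is usually a trigger along the lines of the sketch below. This is Oracle syntax, and the item_id key columns are my guess at the schema - only items.price and order_details.price come from your example:

    CREATE OR REPLACE TRIGGER order_details_fetch_price
      BEFORE INSERT OR UPDATE OF item_id ON order_details
      FOR EACH ROW
    BEGIN
      -- copy the price as it stands right now; later changes to
      -- items.price deliberately do not propagate to existing rows
      SELECT i.price
        INTO :NEW.price
        FROM items i
       WHERE i.item_id = :NEW.item_id;
    END;
    /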

> 2) Extend. Calculation of a column based on other columns
> within the row. e.g., extended_price = price * qty.

A view can do this - is it really worth STORING that derived value? If you really think it is, then a materialized view can do it (automatically, without any trigger code).
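
For example, sticking with your price and qty columns (the other names here are made up), the non-stored version is just:

    CREATE VIEW order_details_ext AS
    SELECT od.order_id,
           od.item_id,
           od.price,
           od.qty,
           od.price * od.qty AS extended_price   -- derived at query time, never stored
      FROM order_details od;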

> 3) Aggregate. Any of sum, count, min, max, or avg of a child
> column. In reality, sum and count are easy to implement
> safely, avg can be done by composition, min and max
> present the most trouble at implementation time, so much so
> that right now I am staying away from them.

Materialized views can take care of this - declaratively, without any trigger code.
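
In Oracle, for instance, your sum/count case can be declared along these lines - a sketch only, assuming order_id, price and qty columns on order_details; fast refresh also needs a materialized view log on the child table:

    -- one-off prerequisite for fast (incremental) refresh
    CREATE MATERIALIZED VIEW LOG ON order_details
      WITH ROWID, SEQUENCE (order_id, price, qty) INCLUDING NEW VALUES;

    CREATE MATERIALIZED VIEW order_totals
      REFRESH FAST ON COMMIT
    AS
    SELECT od.order_id,
           COUNT(*)                 AS line_count,
           COUNT(od.price * od.qty) AS cnt_ext,    -- Oracle needs this to fast-refresh the SUM
           SUM(od.price * od.qty)   AS order_total
      FROM order_details od
     GROUP BY od.order_id;

And avg then falls out as order_total / line_count, which is your "by composition" point.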

> 4) Distribution. Copy of a column from parent table to child
> table. Distinguished from a fetch because it is triggered
> by a change in the parent column. I actually do not implement
> this one, as together with Aggregates it can produce update
> loops, and that is where the trouble *really* starts once you
> get into the realities of implementation.

Views (join at runtime) or materialized views (stored join) can take care of this - again, without any trigger code.
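
Again a sketch, assuming an item_id foreign key and a description column on items - the runtime-join version is simply:

    CREATE VIEW order_details_joined AS
    SELECT od.order_id,
           od.item_id,
           od.qty,
           i.description AS item_description   -- the "distributed" parent column:
                                               -- read at query time, never copied
      FROM order_details od
      JOIN items i ON i.item_id = od.item_id;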

For me, the significant difference in your approach is that YOU take responsibility for the integrity of redundantly stored data; I prefer that if redundancy is really required for performance, then the DBMS should take care of it via materialized views - just as it does with the redundant copies of data in indexes.

> This is the real clincher, that you can denormalize if you have a
> rational method for doing so, and if you can protect against anomalies
> that normalization protects against. My rationale is:
>
> 1) Normalize user-entered data
> 2) Provide the general constraint that automated columns cannot be
>    modified by users in INSERT/UPDATE statements
> 3) Provide a systematic framework for extending the normalized database,
>    I contend that this framework is an *automation* framework, and that
>    the definitions above allow you to move forward.

I agree that if you really MUST denormalise then you need a good framework for doing it, along the lines you suggest. However, I would much prefer to leverage the power of the DBMS to do this for you.
