Re: Newbie question about db normalization theory: redundant keys OK?

From: David Cressey <cressey73_at_verizon.net>
Date: Thu, 13 Dec 2007 12:14:12 GMT
Message-ID: <om98j.3199$CJ.1332_at_trndny02>


"raylopez99" <raylopez99_at_yahoo.com> wrote in message news:15b312ea-1f66-4f22-abbb-63581e0eca73_at_x69g2000hsx.googlegroups.com...
> With a few hours of theory under my belt, I'd like to ask if there's
> ever a time that you don't want a completely normalized dB, that is, a
> normalized database being a dB that has no redundant information (my
> understanding of what a normalized database is).
>
> Or, is there ever a time that you want redundant keys (that is, the
> same keys in many different tables, that obviously are not linked (in
> a relationship) between two tables?). Having redundant attributes and/
> or keys seems to me a very lazy way of designing a database that
> doesn't require lots of initial thought, but of course you have to pay
> for it by meticulously "synching" all redundant keys to one another
> everytime there is a change in one of the redundant keys, so the keys
> don't drift and have different values.
>
> But is there ever a time you want to do this?
>
> THanks in advance
>
> RL

The answer is yes, there are times when a design is a good one, even if less than fully normalized. For each normalization form, there is a known set of anomalies that come up when you insert, update, or delete data in that form. If you are willing and able to program around those anomalies, and if the design yields benefits that justify that effort, it can be the right thing to do. Learning when to normalize is more subtle than learning how to normalize.

There is a particular form of database design, called "star schema" that yields good results when used in a data mart or data warehouse situation. A star schema mimics a multidimensional database in relational (or SQL) form. A star schema follows design rules of its own, and those rules sometimes contradict the rules of normalization. The up side of star schema is that it's very easy to use with report generators, or with OLAP tools like Cognos or Business Objects. The down side of star schema is that the process of keeping the data current involves some fairly intricate programming, and heavy use of computer resources.

Star schema, and other unnormalized or denormalized designs almost always cost more than they are worth when used in a high transaction operational setting, like OLTP.

Unfortunately, most deviations from normalization occur due to blunders, and not due to well considered design decisions. Many deviations from normalization occur because the designer is unfamiliar with some of the normal forms. Back when I was building databases, I only really knew 1NF, 2NF, and 3NF. Update anomalies due to deviations from BCNF and beyond were rare, but my design process would not have obviated them.

Another major cause of deviations from normalization is failure to understand the data. In particular, the functional dependendencies inherent in the data are not discovered during data analysis, and the design unknowingly violates normalization rules. By the time this is discovered, there is usually a large body of application code that is dependent of the bad design.

Sometimes, denormalized design is the reult of sheer pigheadedness. Received on Thu Dec 13 2007 - 13:14:12 CET

Original text of this message