Re: Newbie question about db normalization theory: redundant keys OK?
Date: Fri, 14 Dec 2007 08:50:14 -0800 (PST)
On Dec 14, 2:26 am, raylopez99 <raylope..._at_yahoo.com> wrote:
> On Dec 13, 2:10 pm, mAsterdam <mAster..._at_vrijdag.org> wrote:
> > raylopez99 schreef:
> > > With a few hours of theory under my belt, I'd like to ask if there's
> > > ever a time that you don't want a completely normalized dB, that is, a
> > > normalized database being a dB that has no redundant information (my
> > > understanding of what a normalized database is).
> > This question is very well put.
> > First, about your understanding.
> > The goal of normalizing is not to get rid of redundancies but to get
> > rid of update anomalies. If, when in the real world your database is
> > modeling one thing changes and you have to change several pieces of
> > data in your database to keep it up to date, this means you have an
> > update anomaly.
> > Several (mostly increasingly strict) normal forms exist to check if
> > your set of relations suffers from some anomaly with recepies of
> > changes to it to make it comply. Just google for normal form if you
> > didn't already.
> Thanks; I did Google it,and learned a bit about 1NF, 2NF, 3NF here:http://defiant.yk.psu.edu/~lxn/IST_210/normal_form_definitions.html
>> While I didn't really understand the differences between the three types of NF <<
It gets a little tricky and a Newsgroup is not a good place to get the details. The idea of 1NF is that all the columns (attributes) are expressed as scalar values -- no sequences, arrays, linked lists or fancy stuff. This is a starting point. Maybe we should have had a "Zero Normal Form" for file systems which allow complicated structures :)
Something in 2NF has two or more "simple facts" hidden inside it. A simple example would be (teacher, student, dept) which hides the (teacher, dept) facts -- "Mr. Celko is in the CS department" and (student, dept) facts -- "Ms. Lohan is a CS student". Chris Date has done a good job of discussing the idea of RDBMS rows as facts and he is worth reading.
Something in 3NF has a key (one or more columns, no problem!) that determine all the non-key attributes. For example, (longitude, latitude, landmark_name) have a key (longitude, latitude). Notice that the pair is atomic (from the Greek "a-toma" meaning indivisible or literally "without parts"); the whole key is required to find your landmark.
After that, things get tricky. Ever hear of the "Eight Queens Problem"? Given a chess board and eight queens, put them on the board so that no queen can capture another. Classic programming problem. This can be modeled as (board_row, board_column, queen_nbr), but any two of the three determines the third column.
Stop that at once! You are not doing RDBMS at all and have no data integrity. That is another whole rant.
Star schemas work for Data Warehouses because the data coming into them came from normalized sources and it NEVER changes once in the DW. It is read-only and can suffer only an INSERT anomaly -- usually from trying to load the same data twice. Received on Fri Dec 14 2007 - 17:50:14 CET