Re: Newbie question about db normalization theory: redundant keys OK?

From: David Cressey <cressey73_at_verizon.net>
Date: Sat, 15 Dec 2007 11:50:47 GMT
Message-ID: <rcP8j.57$Vg1.23_at_trndny04>


"raylopez99" <raylopez99_at_yahoo.com> wrote in message news:6b05e167-81c3-4fe6-b577-47f30c4a1350_at_d27g2000prf.googlegroups.com...
> On Dec 14, 12:30 pm, -CELKO- <jcelko..._at_earthlink.net> wrote:
> > >> If teachers are permitted to change their names, then names are not
appropriate keys for teachers. <<
> >
> > Agreed; this was a skeleton table to demonstrate to the OP that tables
> > can have redundant or even overlapping keys to preserve data
> > integrity. Tony just gets easily side tracked and I am afraid I tend
> > to fall for it myself. Sorry for the thread drift ..
> >
> > Instead of a teacher's name, we can tattoo a permanent bar code on the
> > foreheads of the faculty, using their tax identification number :)
>
> Celko what do you think of 1NF, 2NF, 3NF as programming exercises?
> (see here for my understanding of these concepts:
>
http://www.utexas.edu/its-archive/windows/database/datamodeling/rm/rm7.html
> ) Obviously for 'cascades' of UPDATE changes and the like, 3NF is the
> easiest to code, but getting a dB into 3NF form if it's in 2NF or 1NF
> form is a bit of work, agreed? The U of Texas example was easy to
> follow but in the real world I doubt it's so cut and dry.
>
> Keep in mind I have a few days experience in this matter, but I think
> I'm on the right track with the above statement.
>

I can't generalize, but in my own experience, the difference between 2NF and 3NF had a real, practical consequence on my data design within weeks of when I learned it. I learned 1NF, 2NF, and 3NF on my way to learning about databases, back in about 1984.

I had just learned how to use indexed files, and I was fortunate to have a mentor about database matters. He gave me a magazine article about how to use normalization to good adavantage in an indexed files environment. It was an eye opener. I went back and looked at my design, and all of my files were in 2NF, but one was not in 3NF. I'm sorry I can't recall the details 23 years later. The general environment was using enrollment data from customer training courses to prepare management reports for training center managers.

The one that was not in 3NF was causing me headaches in writing programs to keep the data up to date! A quick redesign and the problem went away!

I'm using this as an illustration of the lemma: "Theory IS practical", from Date.

One thing you might look into with regard to the defs of 2NF and 3NF in the U Texas material is whether their definitions are strictly correct. In particular, the use of GUIDs might make a table appear to be in 2NF if you ignore other candidate keys (besides the primary key), but I don't think ignoring other candidate keys is legitimate. It certainly isn't legitimate in BCNF.

Now that I've realized that you plan on implementing in MS Access, I'll say you can probably ignore my comments regarding star schemas. I expect that, by the time you get to the point where a star schema would be of advantage to you, you'll already have migrated from MS Access to some other product.

The discussion concerning "teacher name" versus some artificial identifier in here is of little theoretical importance, but enormous practical importance. The mere cascading of keys on update doesn't solve some problems that arise in the real world. In particular, data exists outside the database, and the relationship between data that is under control of the DBMS and data that is beyond the reach of the DBMS can be important.

"It says in the database that Ms Jones teaches psych 101. But Fred McGillicudy got a grade of "D" in psych 101, and the note certifying the grade was signed by Ms Smith, not Ms Jones. I went back to the database, and there's a Ms Smith in there, but she has never taught Psych 101. The history records all show that Ms Jones taught Psych 101 in the semester when Mr McGillicudy took that course. What's going on?"

What's going on here is that Ms Smith taught psych 101, gave Fred a D, signed the the note certifying the grade, got married, changed her name to Ms Jones, the DBA dutifully changed her name to Ms Jones, and the DBMS dutifully changed all the foreign key references from Ms Smith to Ms Jones. A new Ms Smith has been hired, to teach art appreciation, and her name has been dutifully entered into the database.

But a fact has been lost, namely the fact that when Ms Jones taught the class that Fred got a D in, her name was Smith.

Use mutable keys, and you face this problem, unless you never deal with data that's outside the database. If you don't live in an ivory tower, that's not realistic.

> RL
Received on Sat Dec 15 2007 - 12:50:47 CET

Original text of this message