4NF is Where It Is At! [WAS Re: 1:1 relationships]

From: Barry <BarryJJ_at_ATTGlobal.Net>
Date: Wed, 24 Jan 2001 22:55:43 -0500
Message-ID: <3a6fa5e6_3_at_news1.prserv.net>


Joćo Martiniano wrote:

> Hi.
> Thanks for the replys.
>
> I guess that it does make sense putting all the attributes in one table,
> though... on the other hand it doesn't feel right. I read in an older post
> (12/12/2000, by Jan Lenders), this:
>
> (i hope it is not bad netiquette pasting fragments of other people's
> posts; if it is, my apologies)
>
> "> > Some programmers will insist you build just one table to keep things

> > > 'simple'. Later they'll find some unintended use for the bogus
> > > relationships, sooner or later it's bound to happen. The single,
 simple
> > > table won't seem so simple.

Absolutely correct.

I had the experience of building a small model for a couple of programmers once that they declared was "too complicated". I asked them to offer what they thought was appropriate. Then I questioned unsatisfied requirements they had given me ... and watched as they "oh, yea, well if we change this ..." eventually transformed their model into my original offering. The table structure was *exactly* the same ... but I think they had a couple of "indicator" fields instead of my "type" fields or some such, which is why their model was better than mine.

I agreed to implement their model ... but would they mind if I renamed a couple of things to satisfy our naming conventions ... and then went and ran my original table creation scripts :-)

> > When data modelling you shouldn't listen to programmers, you should
> > listen to users. :-)

Certainly give more credence to the users than the programmers. Users have a vested interest (hopefully) in getting their problem solved. Programmers have an interest - although I would *not* characterise it as a "vested" interest - in simplifications that might make it easier for them to get the code out faster.

But users don't always know what they want with enough clarity *either*. Both groups tend to focus on the most common cases with which they are familiar.

For users, that's OK. Humans are wonderfully adaptable to the exceptions when they arise.

For programmers, that's OK, they can patch their programs and put the new version into production.

But restructure a data base is *difficult*, coz it can easily impact lots more than one program or function. That's why *we* have to get the data model right - as best we can!? - the first time. And that means we have to dig into the "exceptions" up front.

> I agree; if there is a reason why two entities appear in the conceptual
> model in a 1-to-1 relationship the end user must have considered them
> as being different things.
> As a data modeler it is your job to know *why* this is true."

And this is where I've concluded 4th Normal Form is actually the most important of all. At the risk of being flamed for overly-simplifying it, 4NF asks: do all non-key attributes have the *same* relationship to the key? If not, then split the non-key attributes apart.

Perhaps you will know the person before you know the car? Or the car will change on a different timeframe than the other Person details?

I was once working with a state agency on a land information system. They insisted that a change-of-ownership (COO) *always* had to be accompanied by a property valuation ... although valuations were also expected at least every 5 years as well to support taxing requirements. So the COO table need valuation details *and* we needed a separate table for the other valuations.

"Was there *never* the possibility for selling a property without an accompanying valuation?" I asked, looking for the reason to just link the two. "Nope, absolutely never, against the law ... ummm, unless I meant, like, when the government itself sold property??" Duhh.

Sometimes you've just got to poke ... and probe ... and push ... and *that's* the skill to be acquired with experience. Writing CREATE TABLE statements is easy. And Normalisation, *especially* 4NF, has served me well since I first grasped it ~20 years ago!

Note that by modelling to 4NF, you can easily add history to a data base, by simply timestamping the reation. (Well, it's doubtless easier for the data modeller than the programmer!) After all, if things don't have the same relationship to the key, then they must change values at different times. That, in turn, is the path to "INSERT only" data bases which, for some sensitive applications, adds a nice auditability dimension to the thing.

(It also leads to yearning that Temporal data bases appear sooner, not later, but ...)

Another non-4NF flag for me is seeing NULLS ALLOWED columns. Obviously I don't know the value in that column at the same time that I know others, so I split it out ... and whether I know the value is *always* determined by whether a row exists *only*, and *not* by whether the row exists sometimes and whether a column is NULL or not in other cases depending ... I think it brings a certain "elegance" to a model. But yes, I still make exceptions to this rule but not *too* many.

> Well, i would like to hear more opinions, because i'm not a DB specialist,
> and i'm not very sure of anything.

Good attitude to take. It presumably means you are open to the important ideas and concepts you need to consider. And for that reason I've been more ... long-winded?! I'll get off my "soapbox" now ... for now ...

HTH ... Barry J. Received on Thu Jan 25 2001 - 04:55:43 CET

Original text of this message