Re: Automatic Normalisation

From: Walter Mitty <wamitty_at_verizon.net>
Date: Thu, 23 Oct 2008 13:25:52 GMT
Message-ID: <AX_Lk.3220$r_3.2370_at_nwrddc02.gnilink.net>


"xyzzy" <google_at_m-streeter.demon.co.uk> wrote in message news:47fa5b61-b170-4509-a5aa-688bdfbd3b8f_at_c36g2000prc.googlegroups.com...
> The rules governing whether a given relvar coupled with dependency
> rules are in a normal form are well defined.
>
> Unfortunately, many of the FDs are actually business rules in the
> designer's head, not easily expressed in the form FD A -> B etc. The
> easiest way to normalise data is (still) using pen and paper AFAIK!
> Possibly sketching an ERD or something.
>
> If you can list the attributes you need to track, and if you can
> identify the attributes' inter-dependencies, it should be possible to
> deduce any desired NF mechanically from the rules.
>
> I have searched the forum (and the FAQ, which appears to have been a
> non-starter) without success. I Googled 'automatic normalisation' and
> it looks like some people (mainly academics) have had a go at this.
> My question is, does anybody in this forum know of, of have any
> experience with software that guides the designer to the correct
> solution. Are any of them worth trying out?
>
> Alternatively, if there isn't anything, why can't it be done?
>
> Thanks

What Bob said. In addition...

The term "normalization" is itself a loaded term, although it doesn't seem that way to most people who use it.

"Normalization" really refers to taking data that is not fully normalized, and transforming the schema so that it conforms to a higher normal form, while at the same time retaining the same information that the orginal schema retained. But the question ought to arise: why did you start with a schema that was less than fully normalized in the first place? The answers to this question are varied and complex.

In the 1980s, there were a lot of databases that were designed to replace file and record processing or even manual processing and data storage. In those cases, the schema might have come out of an attempt to imitate the existing solution, instead of a process of analysis and design from scratch. Manual systems and file and record systems are notoriously lacking in normalization. Either the designers were oblivious to normalization rules, or the benefits of adhering to them might have been seen as not worth the effort.

In the 1990s, there were a lot of databases designed to replace badly designed databases. Some of those badly designed databases suffered because of deviations from some normal forms. Normalizing the database schema made perfect sense in those cases.

You mentioned ER diagrams. Strictly speaking, normalization is irrelevant to an ER model. Relationships are identified, but not implemented, in ER modeling. For example, you can have a many-to-many relationship between students and courses by just indicating a single line between the "students" entity and the "courses" entity, and indicating that there can be many intatnces in both ends. It isn't until you convert to relations, or relational tables with foreign keys, that the design can be compared strictly to the normal forms. Some people include junction tables and foreign keys in their ER diagrams, but that is really relational modeling disguised in ER diagram conventions.

There are automatic techniques that will transform a "well formed" ER model into a relational model that is at least in 3NF. Several of the tools that allow managing ER models and "physical models" can carry out this tranformation automatically. By "well formed", I mean that each attribute has been connected to the correct entity or relationship. Determining correctness in this context involves looking at FDs in much the same way that a normalizer does, so some of the automation I'm talking about is more illusion than reality.

Finally, there are reasons why some designers depart from full normalization, while at the same time producing an acceptable, disciplined design. That's another topic. Received on Thu Oct 23 2008 - 15:25:52 CEST

Original text of this message