Re: MV Keys

From: dawn <dawnwolthuis_at_gmail.com>
Date: 1 Mar 2006 11:05:57 -0800
Message-ID: <1141239957.788010.239620_at_j33g2000cwa.googlegroups.com>


Jon Heggland wrote:
> In article <1141180942.125742.288830_at_j33g2000cwa.googlegroups.com>,
> marshall.spight_at_gmail.com says...
> > Jon Heggland wrote:
> > > marshall.spight_at_gmail.com says...
> > > >
> > > > Instead of doing this, we allow attributes of list type.
> > > > List! Special cased for character lists only, though,
> > > > perhaps so we can better convince ourselves that
> > > > we didn't just throw in a non-1NF construct. :-)
> > >
> > > What definition of 1NF do you use? I'm asking in earnest; there are
> > > quite a few different definitions floating around.
> >
> > I guess I use the definition that says 1NF means no
> > attributes of compound types. No list or set typed
> > attributes, that is.
>
> So compound == list or set? What about ordered pairs? Or fixed-size
> arrays? Where do you draw the line, and why? Is it the variable size
> that is important? Is VARCHAR compound (disregarding "convention"), but
> CHAR not?
>
> > (Only conventionally we have to
> > pretend that varchar is not a compound type if we
> > want to use this definition, because, practically
> > speaking, you can't live without strings.)
>
> Yes. But are strings/varchar then completely unproblematic? And if so,
> what makes other kinds of lists (or sets) theoretically problematic?
>
> > 1NF is not something that I think too much, though.
>
> Fair enough. Perhaps I should ask Dawn instead. Dawn, what definition of
> 1NF do you use? Why do *you* think list attributes breaks the RM?

I use it the same way that Codd uses the term "normalize" in his 1970 paper. He refers to removing "non-simple domains" which is not exactly tightly defined, but is used related to "repeating groups."

> > Can anyone name any update anomalies that are
> > solved by 1NF? It seems like most normal forms
> > have associated update anomalies that disappear
> > under the normal form.
>
> 1NF is really the odd man out of the common normal forms (2NF to 6NF,
> including BCNF).

Yes.

> The others have formal definitions, based on functional
> dependencies. 1NF is mostly informally defined (except for Date's
> definition, which however renders 1NF pointless and irrelevant---but
> then, so is 2NF, and arguably 3NF),

Yes, and this damages the def of all NF's because they are all defined first as requiring 1NF.

> and there is little agreement even
> on the informal definition. I have searched about a dozen textbooks on
> data modelling for 1NF definitions, and there are basically three camps:
>
> 1. "1NF means only atomic attributes." This of course begs the question:
> What does "atomic" mean, and why are strings and dates considered
> atomic? I have seen only one book with this definition that actually
> makes a point of this crucial question---and it has no good answer.
>
> 2. "1NF means no repeating groups of columns." I.e. no tables like this:
> Order(OrderID, CustomerID, Item1, Amount1, Item2, Amount2, Item3,
> Amount3, Item4, Amount4). The possible anomalies here are apparent, but
> the definition is still informal: It depends on the requirements of the
> database. If an order always has exactly four items, the design above is
> not that bad. And what about FootballMatch(MatchID, ..., Team1, Team2)?
> This definition also has the unfortunate property that a relvar can be
> in 5NF, but not in 1NF.

Although all NF's are defined as first requiring 1NF, at least with Codd and his camps.

> 3. "1NF is subsumed in the definition of a relation." Date's position,
> which means it is really pointless to talk about 1NF in any case. If a
> "relation" is not in 1NF, it isn't a relation, and normalisation doesn't
> apply.

I agree with Date tossing out normalization as defined by Codd, but not with his attempt to redefine 1NF to be meaningless. It is difficult to move foward if people still have the mistaken impression that database theory requires that we remove repeating groups.

> It is claimed by some that 1NF is required for the RM to "work". This is
> trivially the case if you use definition #3; it is obviously *not* the
> case if you use definition #2. It is more uncertain if you use
> definition #1, because nobody can define "atomic". My position is that
> there are no theoretical objections against "non-atomic" attributes
> (whatever it means), just pragmatical.
>
> Codd in his 1970 paper presents definition #1; I don't know where #2
> (which is surprisingly common!) comes from. Codd uses relation-valued
> attributes (RVAs) as examples of non-atomic attributes, but doesn't
> mention other kinds of non-atomic domains, as far as I can tell.
> Moreover, he presents normalisation [to 1NF] as something done for
> convenience and simplicity, since then a relation can be represented in
> storage by a 2D array. He does not claim that RVAs are theoretically
> unsound, afaik, though he says in a later paper that 1NF is "mandatory".
> If anyone has greater insight in Codd's opinions, I'd be glad to hear
> about it.
>
> In conclusion, I think definitions #1 and #2 are good database design
> advice---but informal and situation-dependent, and thus not worthy of
> the "normalisation" label.

The redefinitions of the term "normalize" make it difficult to communicate too. The industry no longer thinks we must normalize data and many, like me, don't even think that should be a goal. Functional dependencies are another matter. If we want to redefine the term normalize, we should be clear that no longer requires 1NF (formally named "normalization" by Codd). Gotta run, but this is near and dear to me right now, as evidenced in my "Is Codd Dead?" article. Cheers! --dawn

> To use lists, sets or relations as attributes
> introduces bias and asymmetry in the database design, as shown by Date
> in his writings on RVAs: Essentially, you create a hierarchical database
> if you use them (which, BTW, may help explain why some people think list
> attributes are a good idea:). This might be called 1NF anomalies,
> Marshall.
> --
> Jon
Received on Wed Mar 01 2006 - 20:05:57 CET

Original text of this message