Re: Normalisation

From: Jon Heggland <>
Date: Tue, 5 Jul 2005 12:41:16 +0200

In article <anfye.157493$El.71359_at_pd7tw1no>, says...
> i may look foolish for jumping into this thread so late, but here's what
> i think anyway. it just seems to me that the RM itself defines very
> little which leaves many things it doesn't define. so my knee-jerk
> reaction is to say that what's not in the RM shouldn't be implemented in
> the RM part of a RDBMS. i'd say it must be understood that when people
> talk of atomic things, they mean atomic as far as the RM is concerned,
> not necessarily as far as some other definition or concept is concerned,
> such as a postal address or infinity. so there has to be some
> 'conceptual separation' that is preserved once we try to 'get physical'
> (i'm tempted to underline the word 'separation', but i don't know how!)

I think this is about the same sentiment as saying that domains are orthogonal to the RM.

> based on this, i'd say that an 'atomic datatype/domain' is one whose
> values the RM is able to relate (ie. perform whatever operations are
> defined to be relational) to values of the domain or other domains
> without knowledge of the structure and operations of the domains
> involved. in this sense, the domain is a kind of 'black box' to the RM.

Yes. The only thing an RDBMS requires of a domain is that it must be possible to ascertain whether two values are equal (i.e. the same). It doesn't need to know exactly how this is done.
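
To make that concrete, here is a rough Python sketch of projection, which has to eliminate duplicate tuples and so needs equality but nothing else. The representation (tuples as dicts, a relation as a list) and the name project are my own assumptions for illustration, not anything the RM prescribes.

```python
def project(rows, attrs):
    """Project dict-tuples onto attrs, dropping duplicates using only ==."""
    result = []
    for row in rows:
        candidate = {a: row[a] for a in attrs}
        # The values are black boxes: the only question asked is "are they equal?"
        if not any(candidate == kept for kept in result):
            result.append(candidate)
    return result


rows = [
    {"id": 1, "city": "Oslo"},
    {"id": 2, "city": "Oslo"},
    {"id": 3, "city": "Bergen"},
]
cities = project(rows, ["city"])
```

Nothing in project inspects the structure of a value, so it behaves the same for integers, strings, or any user-defined domain that supplies ==.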

> in practice, there must be some agreed protocol between the RM and the
> party that does have such knowledge. that party could be machine code
> or human interaction, although i think most people would expect it to be
> code.

I'd say that "knowledge" is encoded in the definition of a domain and the operators that operate on values from that domain. These can be user-defined or system-defined, and the RM as such does not care. Even if a particular RDBMS out of the box knows how to compare two integers, that knowledge isn't part of the RM as such.

> i'd say this
> means that the RM cannot itself even decide equality. it should
> theoretically always have to inquire of some outside party whether two
> values are 'equal'.

Yes. That outside party is the definition of the equality operator for the domain in question.
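
As an illustration, assuming a hypothetical user-defined PostalAddress domain whose author decides that case and spacing are insignificant, a sketch in Python might look like this. The class, its canonical form, and restrict_equal are all invented for the example.

```python
class PostalAddress:
    """Hypothetical user-defined domain; only its author knows what 'equal' means."""

    def __init__(self, text):
        self.text = text

    def _canonical(self):
        # Assumption for this sketch: case and runs of whitespace do not matter.
        return " ".join(self.text.lower().split())

    def __eq__(self, other):
        if not isinstance(other, PostalAddress):
            return NotImplemented
        return self._canonical() == other._canonical()


def restrict_equal(rows, attr, value):
    """Generic restriction: defers entirely to the domain's own == operator."""
    return [row for row in rows if row[attr] == value]


people = [
    {"name": "a", "addr": PostalAddress("1 Main St")},
    {"name": "b", "addr": PostalAddress("1  MAIN st")},
    {"name": "c", "addr": PostalAddress("2 Main St")},
]
matches = restrict_equal(people, "addr", PostalAddress("1 main st"))
```

restrict_equal never looks inside an address; swapping in a different definition of equality changes the result without touching the relational code.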

> from an implementation point-of-view, this could mean that domain
> support might make use of the RM without the original RDBMS even
> 'knowing' about it.

I'm not quite sure how to parse this, but I choose to interpret it as "RDBMSs should be able to support arbitrary user-defined domains---and it shouldn't be hard to do either". :)

> without being deeply familiar with any so-called RDBMS's, i'll stick my
> neck out and state that most implementations have probably made the
> mistake of not separating the relational operators from those that are
> peculiar to a domain.

Well, I'd say the mistake (or one of them) is poor domain support. I'd also say that relational operators *are* peculiar to a domain---the domain of relations. (Or to be more precise, "relation" is a type generator, and the relational operators are generic operators.)
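
A loose Python sketch of what I mean by "type generator" and "generic operator" follows. relation_type, the HEADING attribute, and union are hypothetical names of my own; a real system would do this in its type system rather than with run-time checks.

```python
def relation_type(**heading):
    """Hypothetical type generator: build a relation type for a given heading."""

    class Relation:
        HEADING = dict(heading)  # attribute name -> domain (a Python class)

        def __init__(self, rows=()):
            self.rows = []
            for row in rows:
                self._check(row)
                # A relation is a set: keep a tuple only if no equal one exists.
                if not any(row == kept for kept in self.rows):
                    self.rows.append(dict(row))

        def _check(self, row):
            if set(row) != set(self.HEADING):
                raise TypeError("tuple does not match heading")
            for attr, domain in self.HEADING.items():
                if not isinstance(row[attr], domain):
                    raise TypeError(f"{attr} is not a {domain.__name__}")

        def __eq__(self, other):
            if getattr(other, "HEADING", None) != self.HEADING:
                return NotImplemented
            return (len(self.rows) == len(other.rows)
                    and all(any(a == b for b in other.rows) for a in self.rows))

    return Relation


def union(r, s):
    """Generic operator: works for any generated relation type."""
    if r.HEADING != s.HEADING:
        raise TypeError("union needs operands of the same relation type")
    return type(r)(r.rows + s.rows)
```

For instance, Person = relation_type(name=str, age=int) generates one relation type, and union accepts any two values of that type while rejecting a mix of Person with a relation over a different heading.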

> > What is a non-atomic datatype/domain?
> wouldn't it be one that the RM part of the RDBMS is able to encode or
> decipher or somehow apply without help?

Wouldn't that be the same as a system-defined domain? "Integer" would fit that definition in most (all?) implementations, and most people who use the word "atomic" certainly consider integers atomic.

> > What bad happens if you allow a non-atomic datatype in a relation?
> i think this question really means what bad thing happens if a relation
> has a datatype that the RM doesn't have some protocol for presenting to
> another party.

I'd say that just means that all domains must be well-defined. In the case of relation-valued attributes, the RM/RDBMS certainly has a protocol for handling relations.
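
For example, an ungroup operator can flatten a relation-valued attribute using nothing but tuple construction and equality. The Python below is only a sketch under my own assumptions (tuples as dicts, relations as lists, and the attribute names name, phones and phone invented for the example).

```python
def ungroup(rows, rva):
    """Flatten relation-valued attribute rva: one output tuple per nested tuple."""
    result = []
    for row in rows:
        outer = {a: v for a, v in row.items() if a != rva}
        for inner in row[rva]:
            candidate = {**outer, **inner}
            # Duplicate elimination again relies only on equality.
            if candidate not in result:
                result.append(candidate)
    return result


contacts = [
    {"name": "Ann", "phones": [{"phone": "111"}, {"phone": "222"}]},
    {"name": "Bob", "phones": []},
]
flat = ungroup(contacts, "phones")
```

A tuple whose nested relation is empty, like Bob's here, contributes nothing to the result.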

Received on Tue Jul 05 2005 - 12:41:16 CEST