Re: Normalisation

From: Jan Hidders <>
Date: Tue, 05 Jul 2005 19:51:11 GMT
Message-ID: <PwBye.138032$>

Jon Heggland wrote:
> In article <oRfye.137238$>,
> says...

>>>>Roughly, I would say that it means that all your data manipulation can 
>>>>be expressed as operations over a signature that consists only of (1) 
>>>>domains described by an abstract data type and (2) relations that 
>>>>contain in their fields only values from these domains.
>>>I can't make heads or tails of this. Operations over a signature?
>>Yes, that means that they can be described in a programming language in 
>>which your access to the database is restriced to the functions in the 
>>signature. These operations would be the (1) the operations that are 
>>associated with the domains and (2) operations that retrieve the current 
>>instance of a relvar.

> How does a set domain (e.g. the domain of sets of integers) violate
> this? For definiteness, let's associate the normal set operations with
> it---union, intersection, subset, cardinality and so on.

By itself it doesn't. Until you start introducing operations that treat it as non-atomic. Then it does.

> (By the way, is you phrase "domains described by an abstract data type"
> significant? If so, can you explain what it means? Are you talking about
> type generators like set, array, list and so on?)

It's only there to emphasize that the only allowed access is through the functions defined for it and that you cannot access the internal implementation / representation. I believe that in programming language circles this is not always assumed.

>>>>An unnest operation, for example, cannot be described in that way.
>>>What if I define an unnest operator that "unnests" strings in a 
>>>corresponding manner?
>>It does not satisfy the definition. 

> Does that mean that I should not be allowed to define such an operator?

No, just that the fields that can be unnested with that operator cannot be considered atomic anymore.

> Or that string are atomic and should not be normalised unless and until
> I define an unnest_string operator, in which case all my relvars with
> strings suddenly are no longer in 1NF?

Yes. Although the "should not be normalized" is a bit odd since the whole point is that the defintion says that in that case they already are normalized.

> Does it matter whether my unnest_string is system defined (by the DBMS)
> or user-defined?

Yes. Note btw. that if user-defined functions are restricted to the domains (their input and output types are only domains) then you cannot define such an operation as a user-defined function.

>>>Anyway, let me rephrase my initial question.
>>>What is an atomic datatype/domain?
>>The problem with that question is that it makes an incorrect assumption. 
>>Atomicity is not a property of a domain per se,

> That is news to me. All database textbooks I have used use the phrase
> "atomic value" to define 1NF. Which, while obviously different from
> "atomic domain", can hardly be interpreted very differently.

Absolutely, the point of the definition is to make more precise what it means that the values in a domain are treated as atomic values.

> How does that help with normalisation?

Since it defines atomicity it tells you when you are in 1NF or not. Just to be clear on this, I regard this discussion separate from the question whether you actually *should* be in 1NF or not. All I'm arguing here is that it is not a meaningless concept and can be defined reasonably well.

> And how can you say a priori that
> a relvar with a set-valued attribute is not in 1NF, if that depends on
> the operators of a particular DBMS?

You cannot. Isn't that a big problem? I don't think so. To understand why it is important to get a good feeling of which operations break the atomicity of certain domains. Also consider the following. Suppose a new user-defined domain is added. If there is no operation defined on it that exposes its internals by mapping it to a relation, then even an unnest operation cannot break its atomicity.

  • Jan Hidders
Received on Tue Jul 05 2005 - 21:51:11 CEST

Original text of this message