Re: RM and definition of relations/tuples

From: paul c <toledobythesea_at_oohay.ac>
Date: Wed, 23 Nov 2005 18:06:22 GMT
Message-ID: <yc2hf.582465$1i.48242_at_pd7tw2no>


Martin Zapf wrote:
> I have a question to the Relational Model and the definition of
> relations and tuples.
>
>
> I learned the following definitions:
>
> A relation schema R is a set of Attributes R={A_1,A_2,...,A_n}
> Each Attribute A has a domain dom(A)
>
> Here comes the problem, there are two definitions for relations/tuples
>
> 1. defintion:
>
> A relation r for schema R is a mathematical relation (cartesian
> product) over the
> domains from the attributes of R.
> So r:=dom(A_1)xdom(A_2)x...xdom(A_n)
> A tuple is an element of r.
>
> 2. definition
>
> A relation r for schem R is a set of tuples.
> A tuple t is a function
> t: R -> Union (dom(A_1),dom(A_2),...,dom(A_1))
> t maps each Attribute of R to an value of its domain.
> So a relation is a set of functions, which are called tuples.
>
>
> I noticed, that the difference between this two definition is that
> definition 1 forces a certain order for the Attributes and the values
> for them in the tuples.
> The 2. definition doesnt need any order for the attributes.
>
>
> Im quite confussed, is there a "better" definition or should you always
> use both?

I don't quite understand definition 2, however that may be just my own simple-mindedness. As for definition 1, I don't see that it 'forces a certain order', eg., surely "r:=dom(A_1)xdom(A_2)x...xdom(A_n)" doesn't suggest a different relation than "R:=dom(A_n)x ... x dom(A_2) x dom(A_1)". If an order were implied wouldn't that suggest that the facts might be different if we chose to read the relation in a different order? That could be quite a mess if different audiences had different spoken tongues. Just because there are different ways to write an answer doesn't mean two different ways stand for two different answers.

Instead of merely calling it an 'element of R', some people (such as Darwen and Date) define tuple as a set of ordered triples, eg., {<Aname1, Adomain1, value1>, <Aname2, Adomain2, value2> ...} , so it is clear that no matter what order the tuples are written, two tuples with different ordering of the triples could stand for the same fact.

(D & D are a little more precise than I am, good thing too - they define a heading and a body and stipulate exactly one triple per tuple for each attribute of R and then prefer to say a tuple is an element of the body of R. It seems that they manage to avoid bringing the word "function" per se into the definition, although I guess one might say that "exactly one triple per tuple for each attribute" expresses a function. Note I'm not suggesting that functions don't have a place, just that D&D manage to keep the vocabulary of their definitions small.)

Don't know if that helps. If I may veer a little off-topic, what I like about their definition is that it allows a relation that has no attributes, although I read somewhere that Codd saw no use for such a relation. Such a relation can have only two values, TRUE and FALSE, if one chooses, and one tuple if the value is TRUE, namely the empty set. This seems useful to me, because it allows D&D to reduce Codd's operators to a much smaller set (eg. join and intersection and product are fundamentally the same operator) and also builds in the ability to answer simple questions like "is this table empty?" without inventing strange new keywords.

Another consequence is that an attribute must have a name (which Codd apparently didn't conclude in his original paper) as well as a domain. I've also read somewhere that the SQL designers didn't think attribute names were very necessary and that this has caused quite a few complications, but not knowing much about SQL, I can't talk about this.

p Received on Wed Nov 23 2005 - 19:06:22 CET

Original text of this message