Re: The Practical Benefits of the Relational Model

From: Paul <pbrazier_at_cosmos-uk.co.uk>
Date: 23 Oct 2002 06:13:12 -0700
Message-ID: <51d64140.0210230513.616d3d13_at_posting.google.com>


"Paul Vernon" <paul.vernon_at_ukk.ibmm.comm> wrote in message news:<ap3vcg$1lqo$1_at_sp15at20.hursley.ibm.com>...
> Two tuples can only be identical if they have the same set of <attribute name,
> datatype> pairs.

OK I think the confusing part to me is the difference between the mathematical and the relational jargon. A mathematical relation is a subset of the cross product of various domains, so in a sense the equivalent of a relation's columns are indexed by the natural numbers i.e. the tuple ("Fred", 23, "British") is unambiguous - "Fred" is value 1, 23 is value 2, British is value 3. You could write the tuple as a set of ordered pairs:
{(1, "Fred"), (2, 23), (3, "British")}

In relational theory the column order we are defining our own "column indexing" set to refer to the columns so we could think of the tuple as the set:
{(Name, "Fred"), (Age, 23), (Nationality, "British")}

But I don't think to the database it matters whether the columns are explicitly called Name, Age and Nationality or implicitly called 1, 2 & 3. It might be purer to use the relational way actually because it doesn't require an order on the "column indexing" set.

> If you don't match on attribute names then with these relvars
>
> CREATE TABLE R (A INTEGER, B INTEGER);
> CREATE TABLE S (C INTEGER, D INTEGER);
> CREATE TABLE T (C INTEGER, D INTEGER);
>
> does this relation value
> R := VALUES(1, 2)
>
> equal this relation value?
> S := VALUES (1, 2)
>
> and does it equal this relation value?
> T := VALUES (2, 1)
I think this last one is the problem: the syntax is assuming an order to the columns which doesn't exist unless you write something like: INSERT INTO T (C,D) VALUES (2,1)  
> > Maybe there is a need to distinguish between equality and identity
> > here i.e. two relations can be "equal" for a database snapshot but not
> > "identical" or tautologically equivalent for all possible values of
> > the database?
>
> No.

You could think of a database as a theory itself with axioms etc. in the same way as set theory or something. Kind of a meta-theory where each predicate or tuple in each relation is an axiom in our new system.

So the tuple ("Fred", "UK") in the relation "Birthplaces" is an axiom "Fred was born in the UK". Similarly ("Fred", "UK") in the relation "CountryOfResidence" is an axiom "Fred lives in the UK". These two relation could even have the same domains and column names. To us the relation names tell us the different meanings we can get from them - the verb in the axioms.

The base tables you could think of as containing tuples which are axioms.
Tuples in views are theories derivable from the axioms. Normalisation just means that the axioms are independent of each other.
Consistency of the axioms means that you couldn't have contradictory data in your database e.g. as in the LOVES and HATES example on the dbdebunk website. Because otherwise you could prove that true = false and thus everything collapses.

The database could never stop me putting in ("Fred", "USA") into the Birthplaces relation even if it were untrue in reality because this is an axiom in our system.

But it could stop me putting "Adam hates Eve" and "Adam loves Eve" in relations if I add the constraint that these two predicates are inconsistent (are constraints predicates just as tuples are? - maybe a different type of predicate e.g. one with a "for all" quantifier?).

So you have your low-level inference rules from the relational theory but also high-level inference rules that have to be explicitly stored in the database as constraints.

Consider thsese tables:

CREATE TABLE BirthPlaces (Name namedomain, Country countrydomain) CREATE TABLE BirthPlaces2 (Name namedomain, Country countrydomain) CREATE TABLE BirthPlaces3 (Country countrydomain, Name namedomain) CREATE TABLE Resident (Name namedomain, Country countrydomain)

INSERT INTO BirthPlaces (Name, Country) VALUES ('Fred', 'UK') INSERT INTO BirthPlaces2 (Name, Country) VALUES ('Fred', 'UK') INSERT INTO BirthPlaces3 (Country, Name) VALUES ('UK', 'Fred') INSERT INTO Resident (Name, Country) VALUES ('Fred', 'UK')

We can guess that the first three relations are identical but the fourth just happens to be the same at the moment. But would the DBMS think that all four are equal because they have the same set of <attribute name, datatype> pairs? Or would it think they all are different because they all have different names and there are no constraints defined to say they are always equal?

I think this comes round to another existing thread about column names being unique - you could regard the column names as just "Birthplaces.Name", "BirthPlaces.Country" (and the relation nameless) instead of "Name", "Country" (and the relation called BirthPlaces). This would then start to free you from the table-column hierarchy (looks kind of like OO notation - very suspicious :)) Then all columns are uniquely named in the database and you can never have two different relations that according to set theory are actually identical.

Paul. Received on Wed Oct 23 2002 - 15:13:12 CEST

Original text of this message