Re: Informal Survey #1 -- joins on foreign keys

From: Rob <rmpsfdbs_at_gmail.com>
Date: Sat, 1 Oct 2011 14:07:31 -0700 (PDT)
Message-ID: <0d04e155-0599-4401-834c-7c8ea4f19f74_at_dm9g2000vbb.googlegroups.com>


On Oct 1, 11:09 am, Rob <rmpsf..._at_gmail.com> wrote:
> First, let me qualify my interests. I am primarily interesested in the
> mechanics of
> relational database management, not the semantics. I am interested in
> the evaluation
> of sql queries without regard to their meaning.
>
Mechanically, there is nothing to it. A join between two relations requires only
that the join attributes have identical domains. According to Codd (original 1970
paper), "We shall call a domain of relation R a foreign key if it is not the
primary key of R but its elements are values of the primary key of some
relation S." This is not exactly the same as saying that "the domain of
the foreign key is the domain of the primary key" -- given the time- varying
nature of relations, it seems to imply that the referenced value actually
exists at the time R is defined. That is somewhat bizarre.

Codd's definition also implies that if relations R and T are joined on foreign keys (one
in each of R and T), then both foreign keys have the identical domain. So again, given
the time-varying nature of relations, it must be the case that the the domains of the
foreign keys are indeed the domain of the primary key, not the sets of existing values. (There
is room for disagreement here, but this is an incidental curiosity. I think Codd's definition
of a foreign key domain is literally too restrictive.)

So, the only way foreignkey-foreignkey joins make sense is if both joining relations R and T
are defined with a foreign keys referencing the same relation S on the same pirmary key.
(This is more restrictive than saying each has the same domain as the primary key of
the referenced relation.)

If R and T are the same relation, then the join reveals a sibling relationship among
tuples of a single type. That is meaningful and might be of use in some applications.

But if R and T are different relations, then what relationship does the join reveal? It
looks a little like a "cousin" relationship, but R and T are different types whereas
cousins would be the same type. So what in general would it mean if r in R and t
in T get joined?

This is why I am asking the question in the mathematical [algebra] realm or in the
semantics realm (of E-R or facts or ?). I simply cannot figure out what the semantic
meaning is for a foreignkey-foreignkey join. Yet two responders say they've used them.
Why? What question got answered?

I've designed hundreds of databases. After contemplating it long and hard, I seriously
doubt I ever used a foreignkey-foreign key join in the queries I authored for use with
these databases. Not because it was disallowed, but because it lacks meaning. So if others have used these ("no big deal", "Obviously, I have"), then what
was the meaning in the context of the databases where it was used?

Any help here is geniunely appreciated.

Rob

NOTE1: If R and T both reference S, then a R-S-T select query might get optimized
to exclude S if no non-key attributes of S are included in either the Select
subclause or the Where subclause. In that case, the optimizer would be mechanically transforming the R-S-T query into a R-T query. That doesn't address the higher-level issue of what an R-T query means.

NOTE2: (With apologies to those who discredit E-R modeling.) If R and T both reference S, then in the E-R schema graph, there will be an edge between R and S and between T and S, but not between R and T. Edges correspond to causal relationships and joins between connected entities reveal these relationships. Joins between unconnected
relationships correspond to coincidental relationships ("Give me all suppliers and projects located in the same city.") But detection of coincidences based on non-key attributes (i.e., city) is not quite the same as coincidences based on foreign keys. Puzzling.

NOTE3: (to Cimode) If I am designing an optimer, I could include optimization of queries that include foreignkey-foreignkey joins. But if such queries never get asked, why bother? Your inference that I don't understand meaning is unfounded. I just can't find meaning in these particular foreignkey-foreignkey joins. Received on Sat Oct 01 2011 - 23:07:31 CEST

Original text of this message