Re: Informal Survey #1 -- joins on foreign keys

From: <compdb_at_hotmail.com>
Date: Sat, 5 Nov 2011 17:07:26 -0700 (PDT)

On Oct 1, 10:09 am, Rob <rmpsf..._at_gmail.com> wrote:
> is a join
> between two relations on a foreign key in each ever meaningful?

On Oct 2, 10:58 am, Rob <rmpsf..._at_gmail.com> wrote:
> What I'm really trying to get at is whether a foreignkey-foreignkey
> join makes sense algebraically, and if so, do operations in higher-
> level
> abstractions (like E-R, facts) translate to them?

Rob,

I will enter this thread as I do most: you really don't understand the relational model.

1.
(I quote Erwin for his versions of things I am saying.) On Oct 3, 4:27 am, Erwin <e.sm..._at_myonline.be> wrote:
> Relvars are associated with an
> external predicate, which "documents" the "meaning" that is
> represented by [the tuples in] it.

Codd 1970: "The meaning of [predicate logic predicate expression] COMPONENT(X, Y, Z) [given ordered-tuple set mathematics 'relation' COMPONENT] is that part X is an immediate component (or subassembly) of part Y, and Z units of part X are needed to assemble one unit of part Y". Such a parameterized statement is a 'predicate'. Codd meant that for modern relation (his "relationship") "the predicate for COMPONENT is that part SUB-PART is an immediate component (or subassembly) of part SUPER-PART, and QUANTITY units of part SUB-PART are needed to assemble one unit of part SUPER-PART". And that we might want to write "COMPONENT(X, Y, Z)" for "<SUB-PART X, SUPER-PART Y, QUANTITY Z> IN COMPONENT". When you give values for X, Y and Z or SUB.PART, SUPER.PART and QUANTITY you get a 'proposition', which is a statement that either holds or does not hold in a given world situation. ('Has truth value' 'TRUE' or 'FALSE'.)

The (relational) extension of a predicate is the relation whose body is the set of tuples that make it into a proposition that holds in a given world situation. So we say (informally) that a relation is the extension of a predicate. A relation's attributes are the parameters of its predicate.

2.
Each query relation expression has a predicate as follows. (This correspondence is typically described as "informal" (and spoken of vaguely if at all) but that is incorrect.)

On Oct 3, 4:27 am, Erwin <e.sm..._at_myonline.be> wrote:
> For example, if we have relvars R1 and R2 with predicates P(R1) and
> P(R2), respectively, then the external predicate associated with R1
> INTERSECT R2 is "P(R1) AND P(R2)".

The predicate of a relation expression that is a name of a relation variable or constant is its predicate. The predicate of a relation expression that is a JOIN is the AND of the predicates its operands; of a UNION is the OR; of a MINUS is the AND NOT; of a RESTRICT X=Y or of an ADD X AS Y is the AND X=Y; and of a PROJECTALLBUT X is the EXISTS X. The value of a query relation expression that is a relation name is its named relation's predicate's extension. Each relation operator is defined so that if its operand values are the extensions of their predicates then its result is the extension of its predicate. (As can be shown. Codd and (ADD) Hall et all defined them so this would be the case.) So by induction every query relation expression value is the extension of its predicate.

So the predicate and result value of a query relation expression is independent of whether any normalization has been done or constraints have been defined.

3.
Given the named relations' predicates and possible world situations, of the syntactically typed database values certain ('valid') values could arise and other ('invalid') ones definitely do not. The DBMS evaluates a dba-given overall database constraint expression and allows an assignment if and only if a proposed database value is valid. (This involves further relation operators for equality and nesting/aggregation.) So for values that the user got from (correctly) evaluating the predicates on every tuple for the world situation (all valid by definition), by this policy none is excluded and all are included.

On Oct 3, 4:27 am, Erwin <e.sm..._at_myonline.be> wrote:
> You've got things the wrong way round.

The constraint expressions tell the user something if and only if the user doesn't know all the possible world situations. But a query predicate only ever depends on its relation expression and the named relations' predicates and its result value only further on the named relations' values. Neither depend on constraint expressions. (Of course, they are correlated with them.) Constraint expressions constrain updates. They do not constrain valid database values or affect query predicates or result values. If the user never made a mistake (incorrectly including/excluding tuples contrary to their named relations' predicates and the world situation) they would not need constraint expressions.

Of course, the constraint expressions are also telling or confirming to the user important truths about the possible world situations in terms of the named relations' predicates; and can help the user understand those predicates and the world; and can help the user to rephrase queries and the DBMS to optimize queries. A constraint expression just expresses a truth.

4.
Modern normalization/dependency theory says that there is a foreign key on attribute set K from a target D into a target F (domestic/ foreign) in a database when in all possible world situations ie in all valid database states K forms a key in F and all the subtuple values for K in D are also in F. (We can also speak of it being a foreign key in a given situation.)

So there is a foreign key constraint on K from D to F if and only if certain expressions express certain things that hold in every world situation. This could happen for any two database named relations; it's just a relationship that happens to always hold for pairs of values of D and F in that database. And just like every constraint expression's proposition, the only way in which this has anything to do with a query predicate or result value is that the possible world situations and the named relations' predicates collectively determine them all.

5.
The "relationship" with predicate "there is a foreign key on attribute set {...} from named relation R and T to named relation S" is a fact that either holds or does not hold. As a predicate, it is a proposition. It is a relationship on nothing. Its extension ie relation is DEE. In practice we don't mean a fact when we say "relationship". (Historically, TRUE and FALSE aren't even wffs.)

You seem interested in the constraint-oriented "relationship" FK on K, D and F (in that order in wffs) with predicate "there is a foreign key on attribute set K from named relation D to named relation F". Note that D and F denote the names of named relations, not relation values. Note also that a foreign key relationship holds on a K, D and F when a certain thing is the case for all valid database values. You seem interested in constraint propositions on expression "S JOIN T" that can be inferred from certain constraint propositions on S and T when EXISTS X, SR, ST: FK(X, R, SR) AND FK(X, T, ST). Ie when the proposition "there is set X such that a foreign key on attribute set X from named relations R and T to named relations SR and ST respectively" holds.

We can have foreign keys on arbitrary expressions. So we can have FKe with predicate "there is a foreign key on attribute set K from expression D to expression F". It happens to be the case that that FKe(k, "R", "SR") AND FKe(k, "T", "ST") implies FKe(k, "R JOIN T", "SR"). Also that FKe(k, "R JOIN T", "ST") . Also that R{K} SUBSETOF (R JOIN T){K}, and T{K} SUBSETOF (R JOIN T){K}. And lots of other things. (But not "relationships".) Though I suspect you are interested in the first two. (Do you care whether both foreign keys are to be to the same target? They needn't.) Note that FKe is the relationship; FKe(k, "R JOIN T", "S") is a fact that holds when Fe(k, "R", "S") AND Fe(k, "T", "S") holds.

But a JOIN does not "reveal" any of this. Evaluating S JOIN T for a particular world situation just tells you certain things are true of that world situation per the predicates of S and T. The predicates and facts above don't even involve the same world as the JOIN and named relation predicates. It is the properties of FK and JOIN that have relevant consequences.

Nothing "reveals" a relationship. It is meaningless to talk about "the" relationship between some attributes/parameters. A predicate or an extension/relation each tell you something about a relationship independent of a world situation. You are not talking about a particular relationship unless you have both its predicate and a world situation or its predicate and its extension. Do not confuse relationships, predicates, extensions/relations and propositions. Don't even use the word 'relationship'.)

6.
So every JOIN just says that you want the tuples that make two other predicates true at the same time. Regardless of constraint expressions. Every relation expression is "meaningful". And you are probably actually interested in constraint (proposition) inference.

So most of the sentences in your posts don't make sense. You are not clear and you don't seem to have the relational understanding or habits of thinking and writing to characterize your problem.

Working through any example you choose might help make this clear to you. You might be interested in message
http://groups.google.com/group/comp.databases.theory/msg/1963ce6c0d2a603a?hl=en . Also give up Codd, especially 40 years ago. (Which you misread, although you're right about a certain unnecessary restriction.) Read recent Date and Darwen.

7.
On Oct 3, 11:05 am, Rob <rmpsf..._at_gmail.com> wrote:
> On Oct 3, 4:27 am, Erwin <e.sm..._at_myonline.be> wrote:
>> [...]
> abstractions can map to this questionable construction.

Your reply to Erwin basically totally misunderstands him. And his post was entirely on-topic.

On Oct 2, 9:07 am, paul c <toledobythe..._at_gmail.com> wrote:
> On Oct 1, 2:07 pm, Rob <rmpsf..._at_gmail.com> wrote:

> There must be thousands of db's that have Shipments, Invoices and
> Receivables tables with a foreign key referencing a Customer table.
> Obviously there will be people who will want to join two or more of
> those tables to compare Shipment_Value to Invoice_Amount or
> Receivable_Amount.
>
> Of course there are probably thousands, maybe millions, of people who
> have designed hundreds of databases that used no such join. And,
> probably thousands of people who have designed dozens of databases
> that used only such joins.

I believe Paul's point was not about gut reactions but that of course such JOINs are meaningful.

On Oct 1, 2:07 pm, Rob <rmpsf..._at_gmail.com> wrote:
> Any help here is genuinely appreciated.

I believe you mean this, but I don't believe you know what it means. Read and reread carefully. Think and write and rethink and rewrite carefully. Don't disagree. Ask people to clarify what they mean. Every time you disagreed you were wrong, and every time you agreed you misunderstood. When world views collide, resolve contradictions, doubt confirmations.

Thank you for the opportunity to clarify my thoughts and their expression. (Ten minutes well spent.) (Joke.) (The ten minutes.)

philip Received on Sat Nov 05 2011 - 19:07:26 CDT

Original text of this message