Re: O'Reilly interview with Date

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 10 Aug 2005 10:02:15 -0700
Message-ID: <1123693335.220521.200510_at_g47g2000cwa.googlegroups.com>


David Cressey wrote:

>

> In this connection, I want to express a contrarian view about "natural
> joins".
> I think making joins "natural" based on common column names is a mistake.
> It's overloading the column name.
> Natural joins should be based on common domain names. Better yet, they
> should be based on declared REFERENCES constraints (perhaps unenforced).

Well, it's a tradeoff, isn't it? On the one hand, using natural join is notationally more convenient; it's shorter. On the other hard, it means you're sort of glossing over the namespace unification issue; if you didn't happen to notice that two tables have 2 column names in common instead of the one you meant to join on, your operation stil succeeds but it doesn't do what you want. This is especially an issue in maintenance; the second matching column pair could have been added after the join was written.

I don't think the "overloading" argument is quite right; you necessarily have to unify the column namespaces when you join, unless you don't want to enforce any column naming discipline at all (that is, allow columns with no names, or illegal names, or two columns with the same name) which I don't think is tenable.

The namespace unification issue matches quite well with the predicate logic model as well; when you do some logical operation on two propositions, you necessarily have to unify their predicate's namespaces.

The idea of joining only on FK or some references constraint is attractive, but I'm not sure if it's general enough. Sometimes you want to join Tables A, B, and C, where B and C both reference the primary key of A. What would that do? Or what if a table references a primary key twice?

Marshall Received on Wed Aug 10 2005 - 19:02:15 CEST

Original text of this message