Re: O'Reilly interview with Date

From: Tony Andrews <andrewst_at_onetel.com>
Date: 12 Aug 2005 08:47:40 -0700
Message-ID: <1123861660.527338.95120_at_o13g2000cwo.googlegroups.com>


Kenneth Downs wrote:
> First is the dogmatic answer. The dogmatic answer is that well-designed
> tables will always join by keys. The key (or keys) defines the only
> meaningful linking of the two tables in normal operations. Therefore you
> specify the key because that is how they join, it is actually unnatural and
> downright incorrect to specify columns.
>
> So for me personally if a small tight group can control the table design,
> then you get well-behaved systems and we can afford, in those situations,
> to be dogmatic.
>
> But SQL, in giving us the necessary flexibility to JOIN outside of a key,
> actually makes it seem through years of use that columns are the natural
> way to join tables, when in fact they are not, they are only an expedient.

I think you have it backwards here: it is the fact that SQL (or rather current commercial SQL DBMSs) only support simple inter-table constraints (i.e. foreign keys) and not complex inter-table constraints (like "CHECK (emp.sal BETWEEN dept.min_sal AND dept.max_sal)") that leads you to believe that these are the only valid way to join. It is the most common ("natural") way to join, sure, but there are others.

> > What if you have a non-standard join? For example start_date and
> > end_date columns and you want to join where some date is between these
> > two?
>
> This is not actually non-standard. It only appears so because there is no
> real support for intervals and so we all code intervals manually. But if
> start_date and end_date columns were defined as an interval, then comparing
> the interval to a single date implies "between", just as a unique
> constraint on the interval implies "does not overlap", and the interval
> definition itself generates the constraint end_date >= start_date.

You have posted similar ideas before. You are "making up your own rules" here, and while it may be useful in some ways, it is not coherent IMHO. Sure, you can define an operator DATE_IN_INTERVAL(date,interval) that returns TRUE if the date falls within the interval and FALSE if not - but that wouldn't be the basis for a "natural join" that could be inferred by the DBMS. The date is not "equal to" the interval.

And the idea that a UNIQUE constraint on an interval would mean "no overlap" is totally unacceptable! Suppose you have a requirement to store intervals that MAY overlap but MUST NOT be duplicated - how will you enforce that? UNIQUE should mean unique, no more and no less! Received on Fri Aug 12 2005 - 17:47:40 CEST

Original text of this message