Re: a union is always a join!

From: <compdb_at_hotmail.com>
Date: Thu, 2 Apr 2009 21:43:42 -0700 (PDT)
Message-ID: <8ad0a713-8395-432c-b58c-856a7941ac5e_at_d2g2000pra.googlegroups.com>


On Mar 24, 10:05 am, rp..._at_pcwin518.campus.tue.nl (rpost) wrote:
> >basics are relevant to my reply.
>
> Absolutely, and it is appreciated, but sometimes you stop there
> and do not address the actual issue.

I would appreciate it if you would quote an example, because I am always addressing the issue, though maybe I am not being clear
(or maybe it's the misunderstandings I'm about to address).

> Not quite. The distinction is between a priori knowledge
> and a posteriori knowledge.

You seem to think that a priori info (operators or other constant info) is "derived". You don't seem to know that a priori info is *base* info (ie base relation variables with keys and constraints that just happen not to vary);
or that queries and views only rearrange base data (ie they don't add to it);
or that whether info is functionally dependent on other info is independent of whether it is constant or not.

Your thinking that a priori info is "derived" seems to come from only ever having seen operators appear in queries. The fact that 2+2=4 is not in the varying base relations. It's not derivable from them. Operators are base relations. Assuming + on left and right arguments
corresponds to relation PLUS on attributes LEFT and RIGHT (with the result in attribute SUM): X+Y is

   SUM FROM (

      TUPLE FROM (
         PLUS JOIN RELATION{TUPLE{LEFT: X, RIGHT: Y}}
   ))
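
A rough Python sketch of the same expression, in case it helps to see it run. Everything here is assumed for illustration: relations are lists of dicts, PLUS is only a finite slice of the real (infinite) relation, and the helper names natural_join, tuple_from and plus are hypothetical. Python's own + is used only to avoid typing out the tuples of PLUS by hand; conceptually PLUS is constant base data.

```python
from itertools import product

# Hypothetical finite slice of the constant base relation PLUS:
# one tuple per (LEFT, RIGHT) pair, with the result in SUM.
PLUS = [
    {"LEFT": l, "RIGHT": r, "SUM": l + r}
    for l, r in product(range(10), repeat=2)
]

def natural_join(r, s):
    """Natural join: combine tuples that agree on all shared attributes."""
    out = []
    for t in r:
        for u in s:
            common = t.keys() & u.keys()
            if all(t[a] == u[a] for a in common):
                out.append({**t, **u})
    return out

def tuple_from(rel):
    """TUPLE FROM: the single tuple of a one-tuple relation (error otherwise)."""
    (t,) = rel
    return t

def plus(x, y):
    """SUM FROM (TUPLE FROM (PLUS JOIN RELATION{TUPLE{LEFT: X, RIGHT: Y}}))."""
    singleton = [{"LEFT": x, "RIGHT": y}]
    return tuple_from(natural_join(PLUS, singleton))["SUM"]
```

Here plus(2, 2) looks the answer up in PLUS through a join; the query adds nothing that was not already in the base relation.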

There is no benefit to restricting some base relations (ie operators or other constant info) to queries. That's just the unnecessarily restricted way it's been done in SQL, and it comes from not seeing that operators and relations are equivalent (and predefined operators are base relations). Which, ironically, was understood in PRTV, which IBM idiotically passed over in favour of SEQUEL.

> Usually, relations that represent
> the former are immutable, relations that represent the latter aren't.
> One working with a database must understand how its relations
> are to be interpreted, and this distinction is important in that.
> It is good database design practice to try and keep them apart,
> by keeping derived information out of the base relations,
> putting it into queries and views. Of course this distinction
> can also be made even when D&D's algebra is used. You deny
> that the user needs to be aware of it.

I agree that users need to know what the data represents, including when it can change, what they can do to it and how it is constrained.
So I have never said that if info is functionally dependent the user shouldn't or needn't know.
I can see why you misunderstand me if you keep reading my "operators or other constant info" as "derived", and if you keep reading your own "derived" as "redundant".

You are confused over the term "derived". On the one hand you use it to refer to results of queries and views (in the sense of a relational expression result). On the other hand you consider operators to add info that isn't in the base relations.
But that means that the info is not derived (in the sense of logical consequence)
from the base relations alone, so it is not redundant. So you can't use the fact that you labelled it "derived" to justify calling it redundant.
You are thinking, "this attribute is the result of an operator so it's functionally dependent" without realizing that you can only say that if you treat the operator as a relation with keys.
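
To make that last point concrete, here is a small Python sketch, again with relations as lists of dicts. The function satisfies_fd and the finite slice of PLUS are hypothetical, made up for this illustration. The dependency of SUM on {LEFT, RIGHT} is a statement about the relation PLUS and its key; it is checked the same way as for any other relation.

```python
def satisfies_fd(rel, determinant, dependent):
    """True if, in rel, equal determinant values always give equal dependent values."""
    seen = {}
    for t in rel:
        key = tuple(t[a] for a in determinant)
        val = tuple(t[a] for a in dependent)
        # Remember the first dependent value seen for this key;
        # a different value later on violates the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True

# Hypothetical finite slice of the operator relation PLUS.
PLUS = [
    {"LEFT": l, "RIGHT": r, "SUM": l + r}
    for l in range(5)
    for r in range(5)
]

# {LEFT, RIGHT} is a key of PLUS, so SUM depends on it.
key_determines_sum = satisfies_fd(PLUS, ["LEFT", "RIGHT"], ["SUM"])

# SUM alone is not a key: 0+1 and 1+0 share a SUM but differ in LEFT.
sum_determines_left = satisfies_fd(PLUS, ["SUM"], ["LEFT"])
```

The first check holds and the second does not, and neither answer depends on whether PLUS is constant or varying.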

I agree that base info that is functionally dependent on non-key attributes should be further normalized. But you are mistaken to think that
a priori vs a posteriori info is the issue. Either can involve some redundant info or not, just like any relation.
So I disagree that any notational or design distinctions, or any implementation awareness,
should be made on the basis of a priori vs a posteriori.

> >and that it is relevant to the user how any of these
> >are implemented.
>
> That, too, e.g. performance characteristics of operations
> may be important.

I agree performance is important to the user and that performance, like the rest of user-visible behaviour, is implemented somehow.
However, the user should only think about performance as part of the interpretation and not the implementation. Eg for performance,
time becomes a base relation variable updated by queries. Whether the implementation involves any tables, calculation, indexing, optimization or anything else is orthogonal to whether info is constant or changing. Operators and relations are interchangeable.

> >Other
> >than what the observing (changing and constant) relations
> >represent, to the *user* it's all the same. And treating
> >them the same eases programming.
>
> It eases programming until you wonder
> why the program doesn't work.

If you still think this after this message, I would appreciate you rephrasing this less cryptically.

philip

Received on Fri Apr 03 2009 - 06:43:42 CEST