Re: a union is always a join!

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sat, 14 Mar 2009 17:39:14 -0400
Message-ID: <9uVul.22264$Ws1.16416_at_nlpi064.nbdc.sbc.com>


"paul c" <toledobythesea_at_oohay.ac> wrote in message news:wJDul.17957$PH1.5324_at_edtnps82...

> Brian Selzer wrote:
>> "paul c" <toledobythesea_at_oohay.ac> wrote in message 
>> news:ZXlul.17783$PH1.16918_at_edtnps82...
>>> Brian Selzer wrote:

>>>> "Walter Mitty" <wamitty_at_verizon.net> wrote in message
>>>> news:apltl.2309$%u5.1252_at_nwrddc01.gnilink.net...
>>>>> "Brian Selzer" <brian_at_selzer-software.com> wrote in message 
>>>>> news:eY2tl.9205$%54.7793_at_nlpi070.nbdc.sbc.com...
>>>>>> "paul c" <toledobythesea_at_oohay.ac> wrote in message 
>>>>>> news:beZsl.15959$Db2.2243_at_edtnps83...
>>>>>>> Walter Mitty wrote:
>>>>>>>
>>>>>>>> ...  I'm also going to suggest that what Brain S. calls
>>>>>>>> "oversimplification" is almost exactly what others call
>>>>>>>> "abstraction".  I'm also going to suggest that without abstraction
>>>>>>>> you don't get any independence, and without independence, you don't
>>>>>>>> get much of any bang for the buck.  That may be of zero theoretical
>>>>>>>> importance, but it's of interest to me.
>>>>>>>> ...
>>>>>>> Walter, I'm with the many people who think physical and logical 
>>>>>>> independence are of high importance, both theoretically and 
>>>>>>> practically. But I'd say many of the nuances and implications of 
>>>>>>> those haven't been explored much in print.  Brain S as you call him 
>>>>>>> regularly enters the realm of mysticism.  I point this out not to 
>>>>>>> correct him, but to warn newcomers here that he is not exactly 
>>>>>>> swimming in the main stream of relational theory (to be fair, not 
>>>>>>> many are, because the theory is often confused with past practice). 
>>>>>>> I have a number of mystic acquaintances and I like them all, partly 
>>>>>>> because they don't involve themselves in db theory and there is much 
>>>>>>> in life for which mysticism offers the only comfortable clues.
>>>>>>>
>>>>>> Mysticism.  If accepting that the universe of discourse contains 
>>>>>> things and that at different times a thing can differ in appearance 
>>>>>> yet still be the same thing means that I'm a mystic, then I'm guilty 
>>>>>> as charged.
>>>>> What difference does it make whether it's the same thing or a 
>>>>> different thing?

>>>> If an employee worked 50 hours on a project and his labor rate is $20
>>>> per hour, then it cost $1000 to have him work on the project, right?
>>>> WRONG! The employee's labor rate /is/ $20 per hour, but that doesn't
>>>> mean that it /had been/ $20 per hour during the time that he worked on
>>>> the project. At that time his labor rate might have been $18 per hour
>>>> or may even have changed part way through the project. So the record
>>>> of cost must not contain just which project, which employee and how
>>>> many hours, but also at which labor rate or rates the work was
>>>> performed. But the employee is still the same employee even though his
>>>> labor rate changed from $18 to $20. Other cost records may exist for
>>>> projects that he worked on after the rate increase, and one should
>>>> expect that a query of which projects he worked on would return all of
>>>> the projects, regardless of the labor rate.
>>>>

>>>> So something can appear different at different times yet still be the
>>>> same thing.
>>>>

>>>> This poses a problem because keys are not necessarily permanent
>>>> identifiers. (I'm having trouble articulating my thought here because
>>>> there is more than one usage of the term, "key." I'm disinclined to
>>>> use "key value" because under an interpretation, a key value is a
>>>> mapping to a particular thing in the universe, that thing being the
>>>> output of the valuation function for the set of symbols for the
>>>> components in a tuple of the set of attributes that is the candidate
>>>> key, and it's possible for that same set of symbols to map to different
>>>> things at different times, or for different sets of symbols to map to
>>>> the same thing at different times. But it's unwieldy to say "sets of
>>>> symbols for the components in a tuple of the set of attributes that is
>>>> the candidate key" instead of just "keys.") The problem stems from how
>>>> things in the universe of discourse are identified, and that the scope
>>>> of the definition of a candidate key is any database and not all
>>>> databases. While a key may uniquely identify something in the context
>>>> of its containing database, that doesn't necessarily mean that that
>>>> same key uniquely identifies that same something at all databases in
>>>> which it appears.
>>> I wish, at least once, you would give an answer that was shorter than 
>>> the question.
>>
>> Ask me a question that has a simple answer, and I'll simply answer it.
>>
>>
>
> That's a cute riposte in that it grants my wish as far as my last question 
> is concerned.  But how about the simple answer to Walter M's question 
> (which is "none", ie., the attributes that are chosen for relations 
> determine the consequences)?

As my voluminous reply indicated, I don't think that it is "none."

> The example of the employee whose hourly cost changes is bogus because it 
> confuses employee cost with project hourly costs, obviously the latter 
> would be an attribute of some project relation in any workable system.
>

But it is clear that at each interval during which the employee was working on the project, the employee's hourly cost and the project's hourly cost (at least as far as the employee was concerned) were identical. That fact cannot be denied even though the database doesn't maintain an explicit record of the employee's rate changes.
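
The labor-rate point can be made concrete with a minimal Python sketch. Everything in it is an illustrative assumption rather than something from the thread: the names WorkEntry, project_cost and projects_for, the employee and project identifiers, and in particular the 30/20 split of the 50 hours across the two rates. It only shows that a cost record which carries the rate in effect when the work was done gives a different answer from one computed with the employee's current rate, while the employee is identified the same way throughout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkEntry:
    """One cost record: who, which project, how many hours, at what rate."""
    employee_id: str
    project: str
    hours: float
    rate: float  # hourly rate in effect when these hours were worked


def project_cost(entries, project):
    """Sum hours * rate-at-the-time over all entries for one project."""
    return sum(e.hours * e.rate for e in entries if e.project == project)


def projects_for(entries, employee_id):
    """All projects an employee worked on, regardless of rate."""
    return sorted({e.project for e in entries if e.employee_id == employee_id})


# Hypothetical data: the rate changes from $18 to $20 partway through Alpha.
entries = [
    WorkEntry("E1", "Alpha", 30, 18.0),
    WorkEntry("E1", "Alpha", 20, 20.0),
    WorkEntry("E1", "Beta", 10, 20.0),
]

current_rate = {"E1": 20.0}

naive_cost = 50 * current_rate["E1"]           # uses today's rate for all hours
recorded_cost = project_cost(entries, "Alpha")  # uses the rate at the time

print(naive_cost, recorded_cost, projects_for(entries, "E1"))
```

Under these assumed figures the naive calculation gives 1000.0 while the recorded entries give 940.0, and the query for employee E1 still returns both projects.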

>
> One of the flaws of the mystic persuasion as far as db's are concerned and 
> as we see it in your posts, is that it denies, in what usually appears to 
> me to be in a willful and haphazard way, that mechanical db's, so far in 
> history, don't actually reflect reality, only an abstraction of reality. 
> This has got to be understood in any mention of 'interpretation'.  At some 
> point maybe you will come to see that.

Abstraction is a good thing. I don't deny it. The universe of discourse, or as Codd put it, "the micro-world that the database is supposed to represent," for most if not all databases is itself an abstraction of just a subset of reality. But what you appear to be trying to do is apply mechanisms that only work for static mathematical objects to things that can change over time. That's not abstraction: that's just illogical.

There is a huge difference between a relation for an operator defined on a domain of mathematical objects and a relation defined on a domain of things that can change over time. In particular, there can only ever be one extension of the relation for the operator, whereas there are as many possible extensions of the other as there are legal combinations of tuples. The relation for the operator is true at all possible worlds at all times under all interpretations, so the mechanism of its interpretation is moot since the outcome is always the same. But for things that can change over time, the mechanism of interpretation becomes critical because whether or not a tuple appears in a relation depends solely upon whether the assertion it represents has been assigned a positive truth value under an interpretation.
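
As a rough illustration of that contrast, the following Python sketch builds the single extension of an operator relation and then enumerates the legal extensions of a relation over changeable things. The tiny domain, the relation EMP_RATE(employee, rate), and the key constraint on employee are all assumptions chosen to keep the enumeration small; none of them comes from the thread.

```python
from itertools import combinations, product

# Relation for an operator: addition over a tiny domain of numbers.
# Its extension is fixed by the operator itself; there is exactly one.
DOMAIN = range(3)
PLUS = frozenset((a, b, a + b) for a, b in product(DOMAIN, DOMAIN))

# Relation over things that can change: EMP_RATE(employee, rate).
# Hypothetical value sets, with employee assumed to be the key.
EMPLOYEES = ["E1", "E2"]
RATES = [18, 20]


def satisfies_key(extension):
    """True if no employee appears in more than one tuple."""
    employees = [emp for emp, _rate in extension]
    return len(employees) == len(set(employees))


def legal_extensions():
    """Yield every subset of candidate tuples that satisfies the key."""
    candidates = list(product(EMPLOYEES, RATES))
    for size in range(len(candidates) + 1):
        for combo in combinations(candidates, size):
            if satisfies_key(combo):
                yield frozenset(combo)


extensions = list(legal_extensions())
print(len(extensions))
```

Which of those legal extensions is the actual one at a given time is not something the definition of the relation settles; that is where the interpretation comes in.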

> I wouldn't criticize if you could describe a formal model that could 
> embody the very extraneous notions you bring up, but the usual assumption 
> of any reader here is that the RM is the starting point but your starting 
> point doesn't which makes it very hard for any reader to guess what the 
> dickens your context is.  Nothing wrong with additional abstractions 
> beyond Codd's, as long as the perpetrators recognize that they need to 
> explain them to the rest of us.

There really isn't room here for a detailed explanation, but perhaps what follows will at least clarify what my context is.

The way I see it, the Relational Model is equivalent to a formal logical system based on a first-order modal tense logic. Modal because the set of all domain constraints, relation constraints and database constraints together specifies the set of all possible databases, which is the equivalent of the set of all possible worlds, and tense because a database is the equivalent of an assertion that states not just what is the case but rather what has been the case since the last update, and a transition is the equivalent of an assertion that states in the context of what has been the case (or more precisely, what had been the case during the interval from the last update up to this point) what is different and exactly how.

The simple terms of a formal language of that system include, like any formal first-order language, a set of individual names, a set of individual variables, and a set of relation names of various degrees. An atomic formula is of the form P(x1,...,xn) where P is a relation name and (x1,...,xn) is a sequence of zero or more individual variables. Complex formulae are formed by combining atomic formulae with logical operators, connectives and quantifiers. Constraints are sentences (closed formulae) that together specify which models are legal under the intended interpretation. A model is an extension of each formula in each possible world, a mapping of each term to something in the universe of discourse, and a mapping of each formula in each extension to a truth value, which as a consequence states which member of the set of all possible worlds is the actual world. Constraints fall into four categories: a set of named constraints partitions the set of individual names; another set of constraints specifies the set of all legal extensions for each formula; a third set specifies the legal combinations of extensions that together constitute the set of all possible worlds; and a fourth set defines which possible worlds are accessible from another. Under the Unique Name and Closed World Assumptions, these sets of constraints are the equivalents of domain definitions, relation constraints, database constraints and transition constraints in the Relational Model.
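
The four categories of constraint can be sketched as predicates in Python. This is only an illustration under stated assumptions: the relations EMP(employee, rate) and WORKS_ON(employee, project), the specific rules (positive integer rates, employee as key, every WORKS_ON employee present in EMP, rates never decreasing), and all function names are hypothetical and do not come from the thread. A database is modelled as a dict from relation name to a set of tuples.

```python
def domain_ok(db):
    """Domain constraint (assumed): every rate is a positive integer."""
    return all(isinstance(rate, int) and rate > 0 for _emp, rate in db["EMP"])


def relation_ok(db):
    """Relation constraint (assumed): employee is a key of EMP."""
    employees = [emp for emp, _rate in db["EMP"]]
    return len(employees) == len(set(employees))


def database_ok(db):
    """Database constraint (assumed): WORKS_ON refers only to known employees."""
    known = {emp for emp, _rate in db["EMP"]}
    return all(emp in known for emp, _project in db["WORKS_ON"])


def is_possible_world(db):
    """A database is a possible world if it meets the first three categories."""
    return domain_ok(db) and relation_ok(db) and database_ok(db)


def accessible(old, new):
    """Transition constraint (assumed): a retained employee's rate never drops."""
    if not (is_possible_world(old) and is_possible_world(new)):
        return False
    old_rates, new_rates = dict(old["EMP"]), dict(new["EMP"])
    return all(new_rates[emp] >= rate
               for emp, rate in old_rates.items() if emp in new_rates)


before = {"EMP": {("E1", 18)}, "WORKS_ON": {("E1", "Alpha")}}
after = {"EMP": {("E1", 20)}, "WORKS_ON": {("E1", "Alpha")}}
illegal = {"EMP": {("E1", 18), ("E1", 20)}, "WORKS_ON": set()}

print(accessible(before, after), accessible(after, before),
      is_possible_world(illegal))
```

In this sketch is_possible_world plays the role of membership in the set of all possible databases, and accessible plays the role of the accessibility relation between possible worlds.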

If you're interested in other abstractions beyond Codd's, you might want to investigate Edward Zalta's theory of abstract objects. In particular, his paper "The Modal Object Calculus and its Interpretation," published in /Advances in Intensional Logic/, 1996, describes in detail the mechanism of interpretation--including the assignment of meaning to terms in the formal language and the assignment of truth values to formulae.

Received on Sat Mar 14 2009 - 22:39:14 CET
