Re: choice of character for relational division

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Sun, 01 Apr 2007 17:04:40 GMT
Message-ID: <ICRPh.18131$PV3.187786_at_ursa-nb00s0.nbnet.nb.ca>


Marshall wrote:

> On Apr 1, 7:28 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>

>>Marshall wrote:
>>
>>>On Mar 31, 2:47 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>>>One needs to have set equality and set containment operators. If one has
>>>>those, the query seems simple. What does a special operator for it buy one?
>>
>>>Do we need all of those? We need set equality for RVAs if we
>>>want to join on RVAs (which we do) but do we need a top level
>>>equality? I mean, I would expect it, but do we need it? Do we
>>>ever ask in logic whether these propositions are the same as
>>>those propositions? Doesn't seem so.
>>
>>But we do ask whether one set is equal to another set. Are you
>>constructing a calculus or an algebra?

>
> An algebra. (However the constraint language is a calculus.
> I am unclear if this is an issue. Like JOG's niggle over
> directed vs. undirected edges, I am unsure whether the thing
> to do is just "get over it.")

I would tend to use a single language for both derivation and integrity constraints. Why force anyone to learn two equivalent languages?

>>>If we have equality (=) and join (&), we don't need containment:
>>
>>>  A & B = A  <=>  A ⊆ B.
>>
>>Are you constructing an internal language or an external language.

>
> Both. Although the visible end product is an external language,
> I am pursuing the method of designing an internal language
> as the foundation for the external language, a la the highly
> recommened "Concepts Techniques and Models of
> Computer Programming", by Van Roy and Haridi.
>
> So for me, the question of "do we need X" has two forms:
> one for the internal language and one for the external language.
> The question for the internal language is of vital importance,
> because it defines the computational model, and that is
> something that needs to be as close to perfection as possible
> in the first release. You don't want to be changing the
> computational model in later releases! Contrast with the
> external language, which is built from abstraction facilities
> over the internal language that may even be visible and
> manipulable by the programmer.
>
>
>
>>I agree it is unecessary internally, but I know from past experience
>>that users are drawn to use containment when it is available to them.
>>(At least, english speaking users are.)

>
> Fair enough.
>
>
>
>>ie. SQL users internalize the IN keyword long before they comprehend the
>>EXISTS, ANY and ALL keywords.
>>
>>Then again, perhaps it is better from a pedagogic standpoint to remove
>>that crutch.

>
> Pedagogy is not a primary goal of mine. However the end design
> should be as easy to learn and as easy to use as is consistent with
> good design principles, so it's not clear if that point means
> anything.
>
>
>
>>>>Can you describe the full operation? Maybe show an inverse for a join
>>>>that is not a cartesian product?
>>
>>>SuppliesParts
>>>S P
>>>7 1
>>>7 2
>>>8 1
>>
>>>Parts
>>>P Desc
>>>1 that little thing with the knob
>>>2 the other thing we used the other day
>>
>>>Using $ for relational division
>>
>>>SuppliesParts $ Parts = {(S=7)}
>>
>>>This implies the existence of a relational remainder, in this case
>>>{(S=8, P=1)}
>>
>>If we define
>>SDP = Supplies $ Parts
>>then Supplies = SDP X Parts
>>granted the X is also join because SDP and Parts have no common attributes.
>>
>>Hmmmm....

>
> Yeah, this freaked me out once I finally understood what
> Vadim was saying.

So, a join is the inverse of a divide, but we don't necessarily have an inverse of a join. For that, we would have to specify the set of join attributes, and the result would necessarily omit any unmatched tuples that were in the original relations. Hmmmm... off the top of my head I wonder if such a thing would even have a unique solution?

>>>Also:
>>
>>>ItsLateAndICantThinkOfAnExample
>>>A B
>>>1 1
>>>1 2
>>>1 3
>>>2 11
>>
>>>ItsLateAndICantThinkOfAnExample $ (Total = sum(B)) =
>>>{(A=1, Total=6), (A=2, Total=11)}
>>
>>Consider:
>>
>>ItsEarlyAndICan
>>A    B    C
>>1    1    b
>>1    2    a
>>1    3    c
>>2    11   d
>>
>>What does "ItsEarlyAndICan $ (Total = sum(B))" evaluate to?

>
> ItsEarlyAndICan $ (Total = sum(B))
> A C Total
> 1 b 1
> 1 a 2
> 1 c 3
> 2 d 11
>
> In an aggregation expression A over a relation R, let us partition the
> attributes of R into three sets:
> 1) Those that are used as attributes to group by
> 2) Those that are used as parameters to aggregate functions
> 3) Those that are discarded
>
> In SQL GROUP BY, 1 and 2 are explicit in the expression, and 3 is
> defined by its omission. This is convenient.
>
> In relational division, 2 and 3 are explicit in the expression, and
> 1 is defined by its omission. This is inconvenient, but necessary
> to be consistent with being the inverse of join. :-(

How would one have to alter the expression above to discard C?

> So I would expect to add to the external language a construct
> similar to SQL's GROUP BY that was defined in terms of
> relational division. (Yet unlike in SQL was algebraic.)
>
> Amusingly, we convert the division into a GROUP BY by
> multiplying the dividend. Adding attributes to the dividend
> removes them from the result. We can multiply by either
> the projection of the divisor over the attributes we wish
> to remove from the result, or by the entire domain of
> the attributes we wish to remove.
>
> Or, in a third possibility, we can add those attributes as
> parameters to the aggregate function. So if we wanted
> to take ItsEarly and compute "sum(C) group by A" we would
> use an aggregate function sum', which took B as a parameter
> but didn't inspect that parameter.
>
> def sum'(B, C) = sum(B)
>
> ItsEarly $ (Total = sum'(B, C))
> A Total
> 1 6
> 2 11

That seems quite yucky from an aesthetic point of view.

> Lately I am running in to that construct: an ignored function
> parameter.
>
> f:(a, b) | forall b: forall b': f(a, b) = f(a, b')
>
> Does this have a name?

Um, in programming I would call it "highly error-correlated". HEC for short, maybe? Received on Sun Apr 01 2007 - 19:04:40 CEST

Original text of this message