Re: choice of character for relational division

From: Marshall <marshall.spight_at_gmail.com>
Date: 1 Apr 2007 09:07:07 -0700
Message-ID: <1175443627.642281.162900_at_e65g2000hsc.googlegroups.com>


On Apr 1, 7:28 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> Marshall wrote:
> > On Mar 31, 2:47 pm, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
> >>One needs to have set equality and set containment operators. If one has
> >>those, the query seems simple. What does a special operator for it buy one?
>
> > Do we need all of those? We need set equality for RVAs if we
> > want to join on RVAs (which we do) but do we need a top level
> > equality? I mean, I would expect it, but do we need it? Do we
> > ever ask in logic whether these propositions are the same as
> > those propositions? Doesn't seem so.
>
> But we do ask whether one set is equal to another set. Are you
> constructing a calculus or an algebra?

An algebra. (However the constraint language is a calculus. I am unclear if this is an issue. Like JOG's niggle over directed vs. undirected edges, I am unsure whether the thing to do is just "get over it.")

> > If we have equality (=) and join (&), we don't need containment:
>
> > A & B = A <=> A ⊆ B.
>
> Are you constructing an internal language or an external language.

Both. Although the visible end product is an external language, I am pursuing the method of designing an internal language as the foundation for the external language, a la the highly recommened "Concepts Techniques and Models of Computer Programming", by Van Roy and Haridi.

So for me, the question of "do we need X" has two forms: one for the internal language and one for the external language. The question for the internal language is of vital importance, because it defines the computational model, and that is something that needs to be as close to perfection as possible in the first release. You don't want to be changing the computational model in later releases! Contrast with the external language, which is built from abstraction facilities over the internal language that may even be visible and manipulable by the programmer.

> I agree it is unecessary internally, but I know from past experience
> that users are drawn to use containment when it is available to them.
> (At least, english speaking users are.)

Fair enough.

> ie. SQL users internalize the IN keyword long before they comprehend the
> EXISTS, ANY and ALL keywords.
>
> Then again, perhaps it is better from a pedagogic standpoint to remove
> that crutch.

Pedagogy is not a primary goal of mine. However the end design should be as easy to learn and as easy to use as is consistent with good design principles, so it's not clear if that point means anything.

> >>Can you describe the full operation? Maybe show an inverse for a join
> >>that is not a cartesian product?
>
> > SuppliesParts
> > S P
> > 7 1
> > 7 2
> > 8 1
>
> > Parts
> > P Desc
> > 1 that little thing with the knob
> > 2 the other thing we used the other day
>
> > Using $ for relational division
>
> > SuppliesParts $ Parts = {(S=7)}
>
> > This implies the existence of a relational remainder, in this case
> > {(S=8, P=1)}
>
> If we define
> SDP = Supplies $ Parts
> then Supplies = SDP X Parts
> granted the X is also join because SDP and Parts have no common attributes.
>
> Hmmmm....

Yeah, this freaked me out once I finally understood what Vadim was saying.

> > Also:
>
> > ItsLateAndICantThinkOfAnExample
> > A B
> > 1 1
> > 1 2
> > 1 3
> > 2 11
>
> > ItsLateAndICantThinkOfAnExample $ (Total = sum(B)) =
> > {(A=1, Total=6), (A=2, Total=11)}
>
> Consider:
>
> ItsEarlyAndICan
> A B C
> 1 1 b
> 1 2 a
> 1 3 c
> 2 11 d
>
> What does "ItsEarlyAndICan $ (Total = sum(B))" evaluate to?

ItsEarlyAndICan $ (Total = sum(B))
A C Total
1 b 1
1 a 2
1 c 3
2 d 11

In an aggregation expression A over a relation R, let us partition the attributes of R into three sets:

1) Those that are used as attributes to group by
2) Those that are used as parameters to aggregate functions
3) Those that are discarded

In SQL GROUP BY, 1 and 2 are explicit in the expression, and 3 is defined by its omission. This is convenient.

In relational division, 2 and 3 are explicit in the expression, and 1 is defined by its omission. This is inconvenient, but necessary to be consistent with being the inverse of join. :-(

So I would expect to add to the external language a construct similar to SQL's GROUP BY that was defined in terms of relational division. (Yet unlike in SQL was algebraic.)

Amusingly, we convert the division into a GROUP BY by multiplying the dividend. Adding attributes to the dividend removes them from the result. We can multiply by either the projection of the divisor over the attributes we wish to remove from the result, or by the entire domain of the attributes we wish to remove.

Or, in a third possibility, we can add those attributes as parameters to the aggregate function. So if we wanted to take ItsEarly and compute "sum(C) group by A" we would use an aggregate function sum', which took B as a parameter but didn't inspect that parameter.

def sum'(B, C) = sum(B)

ItsEarly $ (Total = sum'(B, C))
A Total
1 6
2 11

Lately I am running in to that construct: an ignored function parameter.

f:(a, b) | forall b: forall b': f(a, b) = f(a, b')

Does this have a name?

Marshall Received on Sun Apr 01 2007 - 18:07:07 CEST

Original text of this message