Re: choice of character for relational division

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Mon, 02 Apr 2007 18:31:59 GMT
Message-ID: <z_bQh.18565$PV3.192293_at_ursa-nb00s0.nbnet.nb.ca>


Marshall wrote:

> On Apr 1, 10:04 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>

>>Marshall wrote:
>>
>>
>>>>But we do ask whether one set is equal to another set. Are you
>>>>constructing a calculus or an algebra?
>>
>>>An algebra. (However the constraint language is a calculus.
>>>I am unclear if this is an issue. Like JOG's niggle over
>>>directed vs. undirected edges, I am unsure whether the thing
>>>to do is just "get over it.")
>>
>>I would tend to use a single language for both derivation and integrity
>>constraints. Why force anyone to learn two equivalent languages?

>
>
> A fair question.
>
> First let me say that despite some serious searching, I've been
> unable to come up with a firm definition of either "calculus" or
> "algebra."
>
> The basic language is algebraic. Things are wonderfully
> expressive this way, and nicely composable.
>
> Not sure I'll get his right, but let's look at an expression
> "find all the parts that are supplied by suppliers in Los Angeles."
>
> Algebra:
>
> SELECT PartNo from Parts natural join Suppliers where City = "Los
> Angeles"
>
> or in my notation:
>
> Suppliers & city = "Los Angeles" & SupplierParts | PartNo
>
> Calculus:
>
> forall (SupplierNo, City) in Suppliers:
> forall (SupplierNo', PartNo) in SupplierParts:
> SupplierNo = SupplierNo'
>
> (How do you project in the calculus?)
>
> Set Builder:
>
> {(PartNo) | (SupplierNo, City) ∈ Suppliers, (SupplierNo, PartNo) ∈
> SupplierParts }
>
> The algebra seems the clearest to me, although set builder
> is pretty good too.
>
> The calculus for the constraint language is really just the algebra
> with forall + exits, ranging over relation attributes, added in. It
> seems the easiest way to write constraints.
>
> Anyway, I dunno for sure, but I'll try it and if it doesn't work
> I'll try something else. (That's sorta my motto lately.)

Are you saying your language will use set algebra plus quantification for everything?

>>So, a join is the inverse of a divide, but we don't necessarily have an
>>inverse of a join. For that, we would have to specify the set of join
>>attributes, and the result would necessarily omit any unmatched tuples
>>that were in the original relations. Hmmmm... off the top of my head I
>>wonder if such a thing would even have a unique solution?

>
>
> The "inverse" here is the inverse the way div is the inverse of
> int multiply. There are remainders. And yes, the solution is not
> unique in general because of the specification of attributes
> issue. However there is a unique minimal solution: the one that
> uses the fewest attributes.
>
>
>
>
>>>>ItsEarlyAndICan
>>>>A    B    C
>>>>1    1    b
>>>>1    2    a
>>>>1    3    c
>>>>2    11   d
>>
>>>>What does "ItsEarlyAndICan $ (Total = sum(B))" evaluate to?
>>
>>>ItsEarlyAndICan $ (Total = sum(B))
>>>A C Total
>>>1 b 1
>>>1 a 2
>>>1 c 3
>>>2 d 11
>>
>>>In an aggregation expression A over a relation R, let us partition the
>>>attributes of R into three sets:
>>>1) Those that are used as attributes to group by
>>>2) Those that are used as parameters to aggregate functions
>>>3) Those that are discarded
>>
>>>In SQL GROUP BY, 1 and 2 are explicit in the expression, and 3 is
>>>defined by its omission. This is convenient.
>>
>>>In relational division, 2 and 3 are explicit in the expression, and
>>>1 is defined by its omission. This is inconvenient, but necessary
>>>to be consistent with being the inverse of join. :-(
>>
>>How would one have to alter the expression above to discard C?

>
>
> Any of the three ways I list immediately below:
>
> 1) Multiply the dividend by the projection of the divisor over
> the attribute you want to discard (multiply here is cartesian
> product)
> 2) Multiply the dividend by the domain of the attribute you
> want to discard

I don't get it. How does multiplying {A, B, C} by {C} get rid of C?

ie. (ItsEarlyAndICan & {C}) $ (Total = sum(b))

Did you meant to say "multiply the divisor by the projection of the dividend over the attribute you want to discard" ?

> 3) Add a "Highly Error Correlated" parameter to the aggregate
> function. (That is, a parameter that isn't used in calculating
> the aggregate.)
>
> This is more an issue for the internal language than the
> external one; I don't expect hardly anyone to be interested
> in these issues for practical reasons; I'd expect people to
> use the GROUP BY construct that behaves in a more
> convenient and intuitive way.
>
> In fact, I'm not even altogether sure that it's worth putting
> this into the internal language. But it's interesting to
> do the math.
>
>

>>>So I would expect to add to the external language a construct
>>>similar to SQL's GROUP BY that was defined in terms of
>>>relational division. (Yet unlike in SQL was algebraic.)
>>
>>>Amusingly, we convert the division into a GROUP BY by
>>>multiplying the dividend. Adding attributes to the dividend
>>>removes them from the result. We can multiply by either
>>>the projection of the divisor over the attributes we wish
>>>to remove from the result, or by the entire domain of
>>>the attributes we wish to remove.
>>
>>>Or, in a third possibility, we can add those attributes as
>>>parameters to the aggregate function. So if we wanted
>>>to take ItsEarly and compute "sum(C) group by A" we would
>>>use an aggregate function sum', which took B as a parameter
>>>but didn't inspect that parameter.
>>
>>>def sum'(B, C) = sum(B)
>>
>>>ItsEarly $ (Total = sum'(B, C))
>>>A Total
>>>1 6
>>>2 11
>>
>>That seems quite yucky from an aesthetic point of view.

>
> Which one is yucky? The third one? All of them?

The third one. The others simply did not make sense to me. I understood the third one which I found kludgy or hackish -- in short yucky.

>>>Lately I am running in to that construct: an ignored function
>>>parameter.
>>
>>>f:(a, b) | forall b: forall b': f(a, b) = f(a, b')
>>
>>>Does this have a name?
>>
>>Um, in programming I would call it "highly error-correlated". HEC for
>>short, maybe?

>
> Heh.

Seriously. From what I recall, unused parameters and unused local variables correlate very highly with bugs. Received on Mon Apr 02 2007 - 20:31:59 CEST

Original text of this message