Re: choice of character for relational division
Date: 2 Apr 2007 12:00:43 -0700
Message-ID: <1175540443.603337.243900_at_d57g2000hsg.googlegroups.com>
On Apr 2, 11:31 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
> Marshall wrote:
> > On Apr 1, 10:04 am, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>
>
> Are you saying your language will use set algebra plus quantification
> for everything?
My ambitions for the constraint system are extremely
high, probably unrealistically so.
> >>So, a join is the inverse of a divide, but we don't necessarily have an
> >>inverse of a join. For that, we would have to specify the set of join
> >>attributes, and the result would necessarily omit any unmatched tuples
> >>that were in the original relations. Hmmmm... off the top of my head I
> >>wonder if such a thing would even have a unique solution?
>
> > The "inverse" here is the inverse the way div is the inverse of
> > int multiply. There are remainders. And yes, the solution is not
> > unique in general because of the specification of attributes
> > issue. However there is a unique minimal solution: the one that
> > uses the fewest attributes.
>
> >>>>ItsEarlyAndICan
> >>>>A B C
> >>>>1 1 b
> >>>>1 2 a
> >>>>1 3 c
> >>>>2 11 d
>
> >>>>What does "ItsEarlyAndICan $ (Total = sum(B))" evaluate to?
>
> >>>ItsEarlyAndICan $ (Total = sum(B))
> >>>A C Total
> >>>1 b 1
> >>>1 a 2
> >>>1 c 3
> >>>2 d 11
>
> >>>In an aggregation expression A over a relation R, let us partition the
> >>>attributes of R into three sets:
> >>>1) Those that are used as attributes to group by
> >>>2) Those that are used as parameters to aggregate functions
> >>>3) Those that are discarded
>
> >>>In SQL GROUP BY, 1 and 2 are explicit in the expression, and 3 is
> >>>defined by its omission. This is convenient.
>
> >>>In relational division, 2 and 3 are explicit in the expression, and
> >>>1 is defined by its omission. This is inconvenient, but necessary
> >>>to be consistent with being the inverse of join. :-(
>
> >>How would one have to alter the expression above to discard C?
>
> > Any of the three ways I list immediately below:
>
> > 1) Multiply the dividend by the projection of the divisor over
> > the attribute you want to discard (multiply here is cartesian
> > product)
> > 2) Multiply the dividend by the domain of the attribute you
> > want to discard
>
> I don't get it. How does multiplying {A, B, C} by {C} get rid of C?
> ie. (ItsEarlyAndICan & {C}) $ (Total = sum(b))
>
> Did you meant to say "multiply the divisor by the projection of the
> dividend over the attribute you want to discard" ?
Yeah.
>
>
> > 3) Add a "Highly Error Correlated" parameter to the aggregate
> > function. (That is, a parameter that isn't used in calculating
> > the aggregate.)
>
> > This is more an issue for the internal language than the
> > external one; I don't expect hardly anyone to be interested
> > in these issues for practical reasons; I'd expect people to
> > use the GROUP BY construct that behaves in a more
> > convenient and intuitive way.
>
> > In fact, I'm not even altogether sure that it's worth putting
> > this into the internal language. But it's interesting to
> > do the math.
>
> >>>So I would expect to add to the external language a construct
> >>>similar to SQL's GROUP BY that was defined in terms of
> >>>relational division. (Yet unlike in SQL was algebraic.)
>
> >>>Amusingly, we convert the division into a GROUP BY by
> >>>multiplying the dividend. Adding attributes to the dividend
> >>>removes them from the result. We can multiply by either
> >>>the projection of the divisor over the attributes we wish
> >>>to remove from the result, or by the entire domain of
> >>>the attributes we wish to remove.
>
> >>>Or, in a third possibility, we can add those attributes as
> >>>parameters to the aggregate function. So if we wanted
> >>>to take ItsEarly and compute "sum(C) group by A" we would
> >>>use an aggregate function sum', which took B as a parameter
> >>>but didn't inspect that parameter.
>
> >>>def sum'(B, C) = sum(B)
>
> >>>ItsEarly $ (Total = sum'(B, C))
> >>>A Total
> >>>1 6
> >>>2 11
>
> >>That seems quite yucky from an aesthetic point of view.
>
> > Which one is yucky? The third one? All of them?
>
> The third one. The others simply did not make sense to me. I understood
> the third one which I found kludgy or hackish -- in short yucky.
Okay.
> >>>Lately I am running in to that construct: an ignored function
> >>>parameter.
>
> >>>f:(a, b) | forall b: forall b': f(a, b) = f(a, b')
>
> >>>Does this have a name?
>
> >>Um, in programming I would call it "highly error-correlated". HEC for
> >>short, maybe?
>
> > Heh.
>
> Seriously. From what I recall, unused parameters and unused local
> variables correlate very highly with bugs.
No, I completely buy it. But remember I'm talking language
semantics here, not techniques for writing code. The division
part I don't expect will get direct use, however it is useful to
understand
its mathematical properties.
Marshall Received on Mon Apr 02 2007 - 21:00:43 CEST