Re: Aggregation is Inverse of Join

From: Walt <wamitty_at_verizon.net>
Date: Mon, 15 Jan 2007 16:41:24 GMT
Message-ID: <U8Oqh.12298$Ch1.3281_at_trndny04>


"Marshall" <marshall.spight_at_gmail.com> wrote in message news:1168808901.723906.156680_at_s34g2000cwa.googlegroups.com...
> A while back, we got another brilliant insight from Vadim that
> aggregation including group-by is the inverse operation of join.
>
> As a possibly amusing side note on the human condition, here
> is how my understanding of his insight went:
>
> Vadim: aggregation is inverse of join.
> Marshall: what?
> Vadim: you heard me.
> Marshall: um, okay, I kind of see, but here are a bunch
> of objections.
> Vadim: those don't hold up because blah blah blah.
> [five months pass]
> Marshall: I get it!
>
>
> Anyway, the basic idea is that sum() can be seen to be a many:one
> relation that, when we divide a relation by it, with a little renaming,
> will aggregate a column and group by the other columns.
>
> A:{a, b, i}
> sum:{i *-> total}
>
> (Relation A has attribute a, b, and i.
> Sum is an aggregate function taking bag-of-i to total.)
>
> The expression:
>
> A / sum
>
> is the same as:
>
> select a, b, sum(i) from A
>
> Interestingly, if we *multiply* the divisor by an attribute, we alter
> the grouping.
>
> A / (b * sum)
>
> is the same as:
>
> select a, sum(i) from A
>
> We can consider multiplication by an attribute as being either a
> join with the projection of the universal relation U over the
> attribute,
> or a join with the projection of the dividend over the attribute.
> Or possibly a join with the domain of the attribute. (Join in this
> case being a cross product.)
>
> There are renaming issues involved in making this work, but
> they are not very interesting so I'm not going to discuss them
> here.
>
> Anyway, I've been trying to organize my thoughts on this issue
> for a while, and I haven't had much success, so I'm just going
> to blurt out a bunch of things and we'll see where it goes.
>
> The idea of a many-to-one function is interesting. It's sort of
> the reverse of a multifunction. Multifunctions are sometimes
> expressed as generators in programming languages; perhaps
> we can imagine a language where many-to-one functions are
> expressed as aggregates.
>
> Vadim initially described the function as a bag -> singleton
> relation, however I note that only works for functions whose
> binary form is both commutative and associative. (Such as
> sum, which is aggregated addition, which is comm. and assoc.)
> If the binary function isn't both commutative and associative,
> then we must consider the order of the input, which is to say
> the function is list -> singleton.
>
> This is also reminiscent of the situation with
> one-to-many functions: we may have a generator
> that generates duplicate or ordered outputs, in which
> case we have to preserve order, or it might generate
> unique outputs, in which case we can consider it independent
> of order. For example, a generator that produces an
> infinite stream of ones could not be considered a set,
> because it would only have one member. However
> a generator that produces the natural numbers could
> be considered in sequence or out. Further I note many
> places where computation is ordered; eg. recursive
> functions. Thus I conclude that the most general form
> of multifunctions is list-to-single, or single-to-list.
>
> I also observe that I don't have a good name for many-to-one
> functions. One-to-many are generally called multifunctions;
> once I clued in to aggregation-as-division I've been calling
> many-to-one functions aggregates, but these will not generally
> be seen to be the same.
>
> Well, I've run out of steam. More later.
>
>
> Marshall
>
> PS. I told you I wasn't organized.
>

Interesting. It's going to take me weeks to get it.

In the meantime, what is the inverse of Project? Received on Mon Jan 15 2007 - 17:41:24 CET

Original text of this message