Re: GROUP BY

From: Brian Selzer <brian_at_selzer-software.com>
Date: Sat, 19 May 2007 10:38:41 -0400
Message-ID: <SZD3i.1032$C96.283_at_newssvr23.news.prodigy.net>


"Marshall" <marshall.spight_at_gmail.com> wrote in message news:1179540817.981337.87990_at_h2g2000hsg.googlegroups.com...
> On May 18, 11:32 am, Marshall <marshall.spi..._at_gmail.com> wrote:
>>
>> Discussion welcome.
>
> Ha ha! <cough>
>
> Okay, it appears I made (at least) two errors.
>
> . The functions mentioned in 4) may only use as arguments
> the attributes listed in *2) or* 3).
>
> . select b from R group by b
>
> should have been left as-is; you can't drop the "group by b"
> without adding "distinct".
>
> I blame Paris Hilton, the SQL-92 committee, or George Bush.

Isn't it strange that George Bush gets blamed for everything that goes wrong in the world? Isn't it ironic that Saddam Hussein was murdering 30,000 civilians outright each year to maintain his reign of terror, and stole so much Oil-for-Food money (to the tune of $10 billion) that 60,000 children under the age of five were starving to death each year, yet George Bush is still blamed for the 15,000 civilians who have been killed (mostly by terrorists) each year since the invasion? Isn't it also ironic that George Bush is vilified both for invading Iraq, where roughly 90,000 civilians were meeting untimely deaths each year, and for failing to invade Darfur, where roughly 100,000 civilians are meeting untimely deaths each year?

> If none of those wash, I'll grudgingly accept personal
> responsibility. Anyway, apparently I wasn't saying anything
> the least bit surprising.
>
> So: what *about* that extended extend?
>
> Extend is a particular form of join, yes? So perhaps we
> ought to apply generators with join?
>
> I note that there is an issue with regard to the uniqueness
> of values produced by a generator. It's not clear to me if
> that ought to be required or not. If it's required, it's not clear
> to me if it's enforceable, since I don't see how to prove uniqueness.
>
> If the outputs of the generator aren't unique, then we have
> a function from a tuple to a bag, which also matches the
> fact that aggregate functions are functions from a bag to
> a tuple.
>

I don't agree with this. An aggregate function iterates over a relation, taking a projection of each tuple to find the values required to calculate the result. While it is convenient to visualize the collection of values for an attribute in a relation as a bag, it should not be ignored that any particular combination of values in that bag depends solely on the existence of a particular *set* of tuples.
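
A minimal sketch of that point in Python, using a made-up relation R(student, points) that appears nowhere in this thread. Projecting onto points as a set collapses duplicates, whereas an aggregate takes one value per tuple, so the bag it sees is fixed entirely by the set of tuples:

```python
from collections import Counter

# Hypothetical relation R(student, points): a *set* of tuples (illustration only).
R = {("ann", 4), ("bob", 2), ("cat", 4)}

# Relational projection onto points: duplicates collapse, leaving {2, 4}.
proj_set = {points for (_, points) in R}

# What the aggregate actually consumes: one value per tuple, duplicates kept.
proj_bag = Counter(points for (_, points) in R)

# Summing per tuple and summing the projected set give different answers.
total_from_tuples = sum(points for (_, points) in R)
total_from_set = sum(proj_set)

print(proj_bag, total_from_tuples, total_from_set)
```

The two totals differ because the value 4 is contributed by two distinct tuples. Remove either tuple from R and the bag changes with it.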

> Ex: a function n_ones(n), which takes a natural number and
> generates the number 1 that many times. This is a subset of
> the inverse of sum().
>
> n_ones(3) = {| 1, 1, 1 |}
>
> sum( {| 1, 1, 1 |} ) = 3
>
> So generators are the inverses of aggregates, and join is the inverse
> of group by, and generators are applied with join and aggregates
> are applied with group-by.
>
>
> Marshall
>
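
The quoted n_ones example can be sketched in Python, assuming a bag is modeled as a collections.Counter (a modeling choice made here, not anything from the quoted post). The helper bag_sum and the list other_bags are likewise hypothetical names for illustration:

```python
from collections import Counter

def n_ones(n):
    """Bag containing the number 1 exactly n times (empty bag for n == 0)."""
    return Counter({1: n}) if n > 0 else Counter()

def bag_sum(bag):
    """Sum over a bag: each value counted as many times as it occurs."""
    return sum(value * count for value, count in bag.items())

# sum undoes n_ones: sum(n_ones(n)) == n for every natural n tried here.
round_trips = all(bag_sum(n_ones(n)) == n for n in range(10))

# But other bags also sum to 3, so n_ones picks out only one of many preimages.
other_bags = [Counter({3: 1}), Counter({1: 1, 2: 1})]
also_three = [bag_sum(b) for b in other_bags]

print(round_trips, also_three)
```

So sum(n_ones(n)) always gives back n, but n_ones(sum(b)) gives back b only when b happens to consist solely of ones.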

Frankly, I don't see how there can even be an inverse for an aggregate. When you aggregate many distinct values into one, information is lost. How can you possibly reverse the process without that information? For example, if you have 30 students in a class where 'C' is the average grade, how can you tell who aced the course or who failed?

Received on Sat May 19 2007 - 16:38:41 CEST