Re: "Aggregate" operators

From: Erwin <e.smout_at_myonline.be>
Date: Tue, 9 May 2017 13:54:45 -0700 (PDT)
Message-ID: <3af8638d-04b3-40b2-9cb9-88a0e467c2ed_at_googlegroups.com>


Op dinsdag 25 april 2017 11:04:25 UTC+2 schreef Norbert_Paul:
> Erwin wrote:
> > Looking for a *formal* definition of the term "aggregate" operator.
> >
> > The term is quite familiar and the one in common use to denote SQL operators such as MIN, MAX,
> > AVG that occur exclusively in connection with some GROUP BY.
> >
> > However, "occur exclusively in connection with SQL GROUP BY" is rather non-mathematical as a
> > formal definition and so, is anyone around here aware of something more rigorous as a
> > *definition* of when some operator is or is not an "aggregate" operator ?
>
> Formally "GROUP BY a, b,..." is a partition of the table, where "table" is informal for
> "set of tuples".
> https://en.wikipedia.org/wiki/Partition_of_a_set
> So the intermediate result of a GROUP BY-query is a set of those subtables of the table
> where the specified attributes have same values.
> The operators in the SELECT clause hence operate on whole non-empty columns of which
> MAX, MIN, etc. give the corresponding value.
> The attributes specified in the GROUP BY clause have equal values for each sub-table
> in the partition and hence are well-defined. Therefore they can be used without "aggregate".
>
> Where did you read "occur exclusively in connection with SQL GROUP BY"? You can use aggregate
> operators without GROUP-BY. This can then be considered as if they acted on the trivial partition
> {T} of the table T. (Formally, this could have been defined as the result of a GROUP BY-clause with
> an empty attribute list).

I believe the SQL standard even *defines* the behaviour of MIN, MAX, ... "without" a GROUP BY by relying on that very approach of tacitly assuming that an [empty] GROUP BY was nevertheless present. So arguably, there's always a GROUP BY, even if only implied.

The original motivation for the question was a discussion about whether there are any formal grounds for disqualifying, say, aggregated string concatenation as an "aggregate operator" because of its being ordering-dependent. Or, say, disqualifying AVERAGE as an "aggregate operator" on some grounds. Received on Tue May 09 2017 - 22:54:45 CEST

Original text of this message