Re: Group By Questions..

From: Frank Hubeny <fhubeny_at_ntsource.com>
Date: Sat, 02 Oct 1999 00:01:32 -0500
Message-ID: <37F591AC.C3AEDB8A_at_ntsource.com>


The order of the columns in the group by clause does affect the final result because a sort is performed on the columns in the group by clause based upon their order in the clause.

Jeff Hunter used an example to attempt to show this was not the case. But it is easier to show that something is not true with an example than to show that something is true. Nonetheless, I find examples more valuable than opinions. They provide a basis for getting to a verifiable result.

To continue with his example, I added a row that would make the sort on column C different than predicted from the third select statement.

SQL> select * from jh;

        A B C D
--------- --------- --------- ---------

        1         2         3         4
        4         2         3         4
        3         4         3         2
        1         4         3         2
        1         3         2         4

SQL> select b,c,d, avg(a) from jh group by b,c,d;

        B C D AVG(A)
--------- --------- --------- ---------

        2         3         4       2.5
        3         2         4         1
        4         3         2         2

SQL> select b,c,d, avg(a) from jh group by b,d,c;

        B C D AVG(A)
--------- --------- --------- ---------

        2         3         4       2.5
        3         2         4         1
        4         3         2         2

SQL> select b,c,d, avg(a) from jh group by c,b,d;

        B C D AVG(A)
--------- --------- --------- ---------

        3         2         4         1
        2         3         4       2.5
        4         3         2         2

Tom Verhagen claimed that it "does affect the outcome if B and C and D is NOT a primary or unique key". If I understood this correctly, one can rephrase this to read "if B and C and D is a primary or unique key, then it does not affect the outcome". This can be disproved by an example.

The next table is based upon the previous one except that it has a primary key on B, C and D. The results can be made different by selecting combinations that will make the sorts different depending on which column is sorted first and second.

SQL> select * from jhp;

        A B C D
--------- --------- --------- ---------

        1         2         3         1
        4         3         2         4
        3         2         2         2

SQL> select b,c,d, avg(a) from jhp group by b,c,d;

        B C D AVG(A)
--------- --------- --------- ---------

        2         2         2         3
        2         3         1         1
        3         2         4         4

SQL> select b,c,d, avg(a) from jhp group by b,d,c;

        B C D AVG(A)
--------- --------- --------- ---------

        2         3         1         1
        2         2         2         3
        3         2         4         4

SQL> select b,c,d, avg(a) from jhp group by c,b,d;

        B C D AVG(A)
--------- --------- --------- ---------

        2         2         2         3
        3         2         4         4
        2         3         1         1

The above examples do not help answer the concern with performance.

However, if a white paper suggested that such and such an ordering was a better performer with or without some index or other, I would still want to test that result using tkprof or autotrace on a suitable example just to make sure. This would lend credibility to my claims when I repeat those results to my client, manager or professor.

Frank Hubeny

Jimmy wrote:

> Hello all,
>
> Assume I have a table with column A, B, C and D.
>
> I wonder the following SQL statements will produce the same results:
>
> i) select B, C, D, avg(A) from AAA group by B, C, D;
> ii) select B, C, D, avg(A) from AAA group by B, D, C;
> iii) select B, C, D, avg(A) from AAA group by C, B, D;
> ....
>
> i.e. does the order of group by clause affect the final result (or
> performance, if any) of the SQL statement? Anyone knows white papers or
> documents discussing this issue?
>
> Thanks,
> Jimmy
Received on Sat Oct 02 1999 - 07:01:32 CEST

Original text of this message