Re: Access Paths: Will these produce equivalent results?
Date: Fri, 09 May 2003 22:24:04 GMT
Message-ID: <8UVua.15613$h42.5565_at_twister.nyc.rr.com>
"techy lname" <techy789_at_yahoo.com> wrote in message
news:3b990a48.0305091041.157ec3e8_at_posting.google.com...
> We are migrating an MS SQLServer database to Oracle, and have come
> across a query that runs significantly faster on SQLServer than
> Oracle. It is an MS Access generated sql that looks like:
>
> select colA, colB, sum(colC), sum(colD)
> from tab1
> where colE = 'somelit'
> group by colA, colB
> having colA = 'somelit' and colB = 'somelit'
>
> When comparing access plans between Oracle and SQLServer, Oracle
> first performs the roll up, then filters the results. However,
> SQLServer applies the "having " predicate to tab1 *first*, then
> performs the summations. In fact for SQLServer, the access path for
> the above query is identical for this one:
>
> select colA, colB, sum(colC), sum(colD)
> from tab1
> where colE = 'somelit' and colA = 'somelit' and colB = 'somelit'
> group by colA, colB
>
> My question is, is this correct behavior? Is SQLServer guaranteed to
> produce the same results, no matter what's in the tables?
This is a correct transformation. SQL is (in)famous for providing multiple ways to skin a proverbial cat. Let's write the original query equivalently without using GROUP BY and HAVING as
SELECT colA,
colB, (SELECT SUM(colC) FROM tab1 WHERE colA = D.colA AND colB = D.colB) AS total_colC, (SELECT SUM(colD) FROM tab1 WHERE colA = D.colA AND colB = D.colB) AS total_colD FROM (SELECT DISTINCT colA, colB FROM tab1 WHERE colE = 'somelit') AS D -- GROUP BYWHERE colA = 'somelit' AND colB = 'somelit' -- HAVING
The WHERE clause
WHERE colA = 'somelit' AND colB = 'somelit'
will restrict D to at most 1 row. If we place this restriction within the derived table D, since A AND (B AND C) is equivalent to A AND B AND C, we'll achieve the same effect, i.e.,
SELECT colA,
colB, (SELECT SUM(colC) FROM tab1 WHERE colA = D.colA AND colB = D.colB) AS total_colC, (SELECT SUM(colD) FROM tab1 WHERE colA = D.colA AND colB = D.colB) AS total_colD FROM (SELECT DISTINCT colA, colB FROM tab1 WHERE colE = 'somelit' AND colA = 'somelit' AND colB = 'somelit') AS D -- GROUP BY
If we now use the reverse of our initial transformation on this query so that we employ GROUP BY we get
SELECT colA,
colB, SUM(colC) AS total_colC, SUM(colD) AS total_colD
FROM tab1
WHERE colE = 'somelit' AND
colA = 'somelit' AND colB = 'somelit' GROUP BY colA, colB
which is the other query given.
This transformation should improve performance as restriction will be to rows and not groups so unnecessary groups won't be formed to only be discarded by the HAVING clause.
Hope this helps.
Regards,
jag
Received on Sat May 10 2003 - 00:24:04 CEST