UNION ALL and SELECT DISTINCT

From: David Cressey <dcressey_at_verizon.net>
Date: Wed, 05 Apr 2006 13:47:22 GMT
Message-ID: <KTPYf.2011$Vy3.1578_at_trndny02>



The nice folks who gave us the SQL language adopted two different conventions regarding duplicates in result tables.

In the SELECT operator, they chose to have SELECT mean the same as SELECT ALL. If you want SELECT DISTINCT, you have to say so.

Notice that UNION is just the reverse. If you want UNION ALL, you have to say so. If you just say UNION, you get only distinct rows in the result.

I think it would have been nicer if they had adopted the same convention in both cases. I like the way UNION does it. So I'd like it if SELECT, by default, eliminated duplicates. SELECT ALL could be used for what is now meant by SELECT.

Of course, it's too late to change SQL at this juncture.

I'd also like it if the optimizer could figure out when SELECT ALL and SELECT DISTINCT are equivalent, and make use of that in picking the optimal strategy. If the optimizer has to make a mistake, I'd prefer it to pick a strategy that's correct but not optimal, and not pick a fast but incorrect strategy.

(sarcasm alert) I would hope most people would agree. But sometimes, I'm not so sure they do. Received on Wed Apr 05 2006 - 15:47:22 CEST

Original text of this message