Re: UNION ALL and SELECT DISTINCT

From: Marshall Spight <marshall.spight_at_gmail.com>
Date: 5 Apr 2006 08:24:37 -0700
Message-ID: <1144250677.666808.22650_at_v46g2000cwv.googlegroups.com>


David Cressey wrote:

>

> I think it would have been nicer if they had adopted the same convention in
> both cases. I like the way UNION does it. So I'd like it if SELECT, by
> default, eliminated duplicates. SELECT ALL could be used for what is now
> meant by SELECT.

I agree, but I think I would want it taken farther. SQL's approach of we-use-bags-but-we-pretend-they're-sets is screwey. Why not just use sets? Bags strike me as being of limited utility compared to sets. And I'd rather have-sets-but-implement-bags-with-them for those ultra-rare occasions when I want bags.

> Of course, it's too late to change SQL at this juncture.

Yup.

> I'd also like it if the optimizer could figure out when SELECT ALL and
> SELECT DISTINCT are equivalent, and make use of that in picking the optimal
> strategy. If the optimizer has to make a mistake, I'd prefer it to pick a
> strategy that's correct but not optimal, and not pick a fast but incorrect
> strategy.

I have on occasion been quite surprised to discover a SELECT DISTINCT working faster than a SELECT that produced identical results. The mental
model of "sets are like bags but you have the added work of removing duplicates" is obviously not valid.

Marshall Received on Wed Apr 05 2006 - 17:24:37 CEST

Original text of this message