From: Laconic2 <laconic2_at_comcast.net>
Date: Sun, 25 Apr 2004 13:01:35 -0400
Message-ID: <NeGdnUf-jdgqcxbdRVn-vA_at_comcast.com>

I do want to talk about SELECT DISTINCT. But I guess some people felt I was hijacking someone else's thread. I don't want to do that. So I'll start my own thread.

I believe that newcomers to SQL should be taught how to do a SELECT DISTINCT before they learn how to do a plain old SELECT. I relaize that this is an unusual opinion, but please bear with me.

Consider this query:


        EMPLOYEES E,

The result of this query shows a relation between cities and department names. What if we leave off the DISTINCT? Well, in that case we get a result table that isn't, in general, in first normal form. I think it's worthwhile for students to get used to queries that produce a relation in the result, before they branch out into queries that don't. So SELECT DISTINCT is actually a more "fundamental" operation than "SELECT" is.

Here are the three most common objections to SELECT DISTINCT:

1.  But it's extra typing!
2.  But it's extra work for the optimizer!
3.  But I want those duplicates!

Let's talk about them one at a time:

  1. But it's extra typing! Yes, unfortunately it is. But that's the way SQL is defined. You'll learn when you can omit the DISTINCT, soon.
  2. But it's extra work for the optimizer! So what?
  3. But I want those duplicates! Aha! Why do you want to produce a result that is not a relation? Often there are legitimate answers to this question. But training the student to ask, automatically, whether the desired result is a relation or something else, is a good thing.

This ties rather well into the formula already put up in the other thread:

    RESULT = PROJECT (RESTRICT (JOIN (A, B))) When you sepcify DISTINCT, you are really projecting the result onto the space CITY, DEPARTMENT_NAME. Received on Sun Apr 25 2004 - 19:01:35 CEST

Original text of this message