Re: Examples of SQL anomalies?

From: Marshall <>
Date: Tue, 1 Jul 2008 11:38:06 -0700 (PDT)
Message-ID: <>

On Jul 1, 9:26 am, -CELKO- <> wrote:
> >> Or one can even appeal to an imperative programmer's inherently procedural bias, and ask him how he'd most simply program it.
> He'd say:
> count = 0;
> for each item
> count = count + 1
> end for
> And what does that evaluate to if there are no items? <<
> Notice how you created zero from nothing?
> Things can get really bad. Your choices for defining a summation
> indexed over an empty are to define it as zero to get rid of it or to
> define it as undefined to get rid of it. The important thing is to
> get rid of it.

Not at all. I'm not "getting rid" of anything. The goal is not to brush things under the rug; the goal is to preserve the algebraic identities that are so useful in so many ways. To do that, it is necessary to support aggregating binary operations over collections that have zero or one members, even though the binary operation requires *two* arguments.

And it's not like this is in any way *hard* or anything. I mean, how old was I when I learned what an identity element was? I'm thinking it was junior high.

> Another convention in traditional Sigma notation is that when the
> initial value of an index is greater than the final value, the
> summation is zero. That is not a set-oriented approach to
> summations. Again, the idea that ordering changes the results is
> strictly a convention.

It's not a convention, and it's not order-dependent. Indexes are all greater than or equal to the initial value, and less than or equal to the final value. If the final value is smaller than the initial value, then there are no qualifying indexes.

> What do you want to do about 0^0?

0^0 = 1.

> How would you define 0/0 without a convention?

One *doesn't* define 0/0. Not every function is total.

> Instead of a FOR EACH loop, we might define summation
> this way in pseudo-SQL.
> RETURNS <numeric data type>
> AS
> DECLARE total <numeric data type>;
> SET total = MIN(my_set); -- exists in the data!
> SET my_set = my_set {MIN(my_set)}; -- remove element from countable
> set
> SET total = total + MIN (my_set); -- add element that exists in data
> to total
> UNTIL my_set = {}; -- use all the elements
> RETURN total;
> END;
> This handles a set with one element and we don't have to create a zero
> out of thin air. The loop removes the minimal element (could be max,
> could be random) from the set and adds it to the running total until
> the set is empty. If the set starts empty, we get a NULL.

That sucks. That is *terrible.* It goes to extra trouble to define a semantics that's obviously worse than the simpler thing.

> Now try these statements:
> CREATE TABLE Empty no rows possible
> CHECK (1=0));
> SELECT SUM(i) FROM Empty; -- returns null
> SELECT SUM(i) FROM Empty GROUP BY i; -- returns an empty table
> SELECT SUM(i), COUNT(i), COUNT(*) FROM Empty GROUP BY i; --empty
> table, 3 columns
> So why is GROUP BY not returning a NULL or a zero? That will give you
> a headache! A table always has at least one grouping, so there is
> always at least one row. Think of {{}}, a set whose element is the
> empty set. Let's keep going with another table that only has a NULL:
> SELECT SUM(i) FROM JustNull; -- returns null
> SELECT SUM(i) FROM JustNull GROUP BY i; -- returns null
> SELECT SUM(i), COUNT(i), COUNT(*) FROM JustNull GROUP BY i; -- null,
> 0, 1

All of these are simpler without NULLs.

Hey, NULLs were an interesting experiment. I don't think, back in the day, we could reasonably expect people to know what a bad idea it was. And so it's in the standard, and backward compatibility is important, so it stays in the standard. But don't think that means they were a good idea to begin with. They weren't. Sorry.

Marshall Received on Tue Jul 01 2008 - 20:38:06 CEST

Original text of this message