Re: Can I do this using SQL?

From: Troels Arvin <troels_at_arvin.dk>
Date: Sun, 12 Feb 2006 09:06:33 +0100
Message-Id: <pan.2006.02.12.08.06.32.266000_at_arvin.dk>


On Fri, 10 Feb 2006 09:13:18 +0000, Roy Hann wrote:
> falling into the trap of assuming that the first N rows
> will give the N highest values.

Thanks for bringing this up. I've see this potentially serious error in code from many sources.

And there seems to be a terminology problem.

In my mind, there is

  1. "Simple limit" (potentially dangerous): When one wants no more than n rows returned. It's safe if used in connection with an ORDER BY on a column (or columns) with a uniquenessconstraint. But if there is a possibility for tie conditions, it's dangerous.
  2. "Top-n": Like 1, but may return more than n rows in case of tie conditions. This is very often what people really want, instead of 1.
  3. Limit+offset: Like 1, but skip the first x rows of the result set (and start counting after having skipped rows). Often used in pagination situations. Somewhat safe when the ORDER BY sort is performed on a column with unique values; else unsafe, because the sorting is non-deterministic. I wrote "somewhat", because it's conceivable that new rows are being added/changed/ deleted while someone is browsing through a paginated presentation of the data; in a stateless context (such as web pages), it may mean the the browsing user may miss rows, or experience other inconsistent situations.

Now, it seems:

  • Some use the word "quota query" for 2. No problem.
  • Some use the "top-n" terminology for 1. This gives potential for misunderstandings.
  • It would be nice to have a better/more clear name for 1.

I've written about the subjects at http://troels.arvin.dk/db/rdbms/ (section: "Limiting result sets").

-- 
Greetings from Troels Arvin
Received on Sun Feb 12 2006 - 09:06:33 CET

Original text of this message