On Fri, 10 Feb 2006 09:13:18 +0000, Roy Hann wrote:
> falling into the trap of assuming that the first N rows
> will give the N highest values.
Thanks for bringing this up. I've see this potentially serious error in
code from many sources.
And there seems to be a terminology problem.
In my mind, there is
- "Simple limit" (potentially dangerous): When one wants no more
than n rows returned. It's safe if used in connection with
an ORDER BY on a column (or columns) with a uniquenessconstraint.
But if there is a possibility for tie conditions,
it's dangerous.
- "Top-n": Like 1, but may return more than n rows in case
of tie conditions. This is very often what people really
want, instead of 1.
- Limit+offset: Like 1, but skip the first x rows of the
result set (and start counting after having skipped
rows). Often used in pagination situations.
Somewhat safe when the ORDER BY sort is performed on a
column with unique values; else unsafe, because the
sorting is non-deterministic. I wrote "somewhat", because
it's conceivable that new rows are being added/changed/
deleted while someone is browsing through a paginated
presentation of the data; in a stateless context (such as
web pages), it may mean the the browsing user may miss
rows, or experience other inconsistent situations.
Now, it seems:
- Some use the word "quota query" for 2. No problem.
- Some use the "top-n" terminology for 1. This gives
potential for misunderstandings.
- It would be nice to have a better/more clear name for
1.
I've written about the subjects at http://troels.arvin.dk/db/rdbms/
(section: "Limiting result sets").
--
Greetings from Troels Arvin
Received on Sun Feb 12 2006 - 02:06:33 CST