Re: Efficient date range search?

From: Kieran <kieran_at_dunelm.org.uk>
Date: Mon, 07 Oct 2002 20:19:30 +0100
Message-ID: <3DA1DE42.4020507_at_dunelm.org.uk>


mvh_at_ix.netcom.com wrote:
> Does anybode know a good (efficient) algorithm for the following?
>
> Imagine that I have a lot of entries of the form (sorry if the SQL is
> messed up):
>
> CREATE TABLE "pets" (
> name VARCHAR(20);
> "born" timestamp;
> "died" timestamp;
> );
>
> and I have a LOT of pets (let's say millions) and some don't live too
> long (mice, fruitflies, whatever), and some do (parrots, elephants).
>
> I would like to make a query to say
>
> on july 4 of last year, what pets were alive?
>
> and I would like to make this query right to the minute
>
> on july 4 of last year at 7:01 PM what pets were alive?
>
> I can't figure out how to index or query this in a manner that isn't
> going to devolve into a linear search, which would be too slow.
>

The obvious query will be:

select * from pet where my_date > born and (my_date < died or died is null)

If there is a B-Tree index on born, the RDBMS can find the index node that partitions the tree into those born before my_date and those born after it very quickly (O(log n)), and similarly for died.

The RDBMS can use this to obtain a set of row_ids (or similar) of pets born before my_date and a set of row_ids of pets that died after my_date. This can all be done without accessing the actual table, as can intersecting the two sets to give the set of row_ids of pets we are interested in. These can then be looked up in the actual table and returned.

For a large, evenly distributed table (with up-to-date statistics) this should be considerably quicker than scanning the entrire table.

It's possible that a very clever RDBMS may be able to optimise this further if you add the following constraints:

born <= died
(died - born) < 250

This would in theory allow the RDBMS to discard those born 250 years before my_date as well as those that died 250 years after my_date.

It would be interesting to know whether any commerical RDBMSs can do this.

Regards,
Kieran Elby Received on Mon Oct 07 2002 - 21:19:30 CEST

Original text of this message