Re: a database implementation (data structure) for particular queries?

From: <compdb_at_hotmail.com>
Date: Thu, 18 Oct 2012 23:41:28 -0700 (PDT)
Message-ID: <e21cfaab-772f-4f85-854b-7c14a4aac399_at_googlegroups.com>


On Wednesday, October 17, 2012 12:38:55 AM UTC-7, Ivan Shmakov wrote:
> I wonder if there's a database implementation (or data
> structure, or an index) well suited for queries like
> SELECT * FROM foo
> WHERE (a1 IN (a1a, ..., a1P)
> AND a2 IN (a2a, ..., a2Q)
> ...
> AND aN IN (aNa, ..., aNZ));

This is

     SELECT * FROM foo
     WHERE ((a1=a1a OR ... OR a1=a1P) AND ... AND (aN=aNa OR ... OR aN=aNZ))
or
     SELECT foo.*
     FROM foo
     JOIN (SELECT a1a AS a1 UNION ... UNION SELECT a1P as a1)
     ...
     JOIN (SELECT aNa AS aN UNION ... UNION SELECT aNZ as aN)

If your DBMS is not "well suited" to this then it's not clear what it could be "well suited" for. (Is your context even a DBMS?)

> Also, I'd be interested in a quick way to compute an estimate
> (an upper bound) of the number of rows to be returned

NATURAL JOINing on the same attribute set, the result is no larger than either operand. One operand is a cross product; its size is the product of constituent single-column sizes. So an upper bound is min(#rows(foo), P * ... * Z). (Assuming no duplicates in foo. Otherwise upper bound is #rows(foo).) Each result column can only have values from an operand column. So a stricter upper bound is min(#rows(SELECT a1 FROM foo), P) * ... * min(#rows(SELECT aN FROM foo), Z). But an upper bound is not likely an "estimate" of a typical case.

As Gene wrote, you need to tell us the assumptions behind your questions.

philip Received on Fri Oct 19 2012 - 08:41:28 CEST

Original text of this message