general design issue
Date: Wed, 22 Apr 2009 13:59:38 -0700 (PDT)
Message-ID: <be83370d-1df5-4697-b18c-8ae5b67eac2f_at_k38g2000yqh.googlegroups.com>
hi,
i don't know much about SQL - just basics: tables, indexes, simple
selects and
the schema is very straightforward:
CREATE TABLE words (key TEXT, entry TEXT)
CREATE INDEX wordsIdx ON words (key)
the database engine is quite slow (sqlite on embedded device) so
speed
the problem i have is as follows: for any key prefix (even not
existing in the database)
in other words: for prefix lets say 'inte' i'd like to know if there
are keys starting with
i cannot just run 26 queries for:
SELECT key FROM words WHERE key >= 'inteX' LIMIT 1
where X is letter [a..z] since each query takes ~15 ms
of course i don't need to iterate over all 26 letters: for example
when looking for 'intea'
i'm designing simple database storing key/value data (in fact its
read
only dictionary)
is by far the **most**
important since i need my data in short time (40-60 ms).
i'd like to know if there are keys starting with given prefix + 'a',
'b', 'c' etc
'intea', 'inteb', 'intec', 'inted' etc (just need true/false for each
letter not number of keys matching)
my query might return 'integral' (means that there is no "intea",
"inteb", "intec", "inted", "intee"
and "intef" prefixes since the first key found has prefix "integ") so
i could start with
queries...
my solution is to create second table storing prefix/32-bit-integer-
map where
each bit indicates one letter (for latin alphabet only 26 bits are
used)
CREATE TABLE prefixes (prefix TEXT, bitmap INTEGER) CREATE INDEX prefixesIdx ON prefixes (prefix)
and use it as follows:
SELECT bitmap FROM prefixes WHERE prefix = 'inte'
this is very fast since i need only one database access to find out the answer for my question.
however for longer prefixes it's quite expensive, e.g. for prefix
'internationalizatio' i have to store 19
prefixes just to find out there is one or two words starting with
that
prefix.
so my question is how to make it better keeping in mind that speed is
**key** factor but
database size is also important.
any ideas?
thanks
pskink
Received on Wed Apr 22 2009 - 22:59:38 CEST