Re: general design issue
Date: Fri, 24 Apr 2009 10:42:24 -0700 (PDT)
Message-ID: <6719178b-ac20-40aa-a962-2f00f31e9f19_at_c36g2000yqn.googlegroups.com>
On 24 Kwi, 18:48, paul c <toledobythe..._at_oohay.ac> wrote:
> skink wrote:
>
> ...
>
> > my only suspicion is that OS i'm using (google android) puts
> > additional layer (i use java iface) in order to access sqlite.
> > maybe its implementation is somehow done not right way...
>
> > thanks again
>
> Then the only suggestion I can think of is to use 'explain' and 'explain
> query plan' on some other table. Keep trying different queries until
> you get one that indicates something like "TABLE words WITH INDEX
> sqlite_autoindex_words_1" to show that an index was used. Then
> carefully compare that table's definition with the one that is giving
> you trouble, try to eliminate any differences even if they seem
> cosmetic, as you eliminate each difference, run explain query plan
> again. My guess is that you will locate one or more differences that
> are crucial.
>
> In other words, confirm that your platform will use an index sometimes.
> If it never does, it's a packaging problem. If it sometimes does,
> then you know whatever version of sqlite you've got should be capable of
> optimizing the query.
paul,
since i would need learn much about SQL to get some decent expertise i tried some experiments with different aproach and it seems that I found reasonable solution in terms of speed and storage.
when populating database i create in memory hash map with all prefixes - its ~105000 unique prefixes.
each value in that hash map contains number of words that match given prefix.
then i insert every prefix that has more that ten (10) words (its only 1615 such prefixes - i thought it will be much much more) to prefixes table.
my database is read only so i need to do it only once when creating database.
CREATE TABLE prefixes (prefix TEXT, bitmap INTEGER); CREATE TABLE words (key TEXT, value TEXT);
CREATE INDEX prefixesIdx ON prefixes (prefix); CREATE INDEX wordsIdx ON words (key);
this is how prefixes are distributed:
sqlite> select length(prefix), count(prefix) from prefixes group by length(prefix);
0|1 1|25 2|185 3|665 4|488 5|180 6|49 7|17 8|5
sqlite> select count(*) from prefixes;
1615
in my application for given prefix first i try to find prefix in prefixes table:
"select * from prefixes where prefix = 'inte';"
it takes ~15 ms
the bitmap is answer to my question (bit 0 indicates if there are words starting with "intea", bit 1 "inteb" etc)
if NOT found in prevoius select it means that i can do the following:
"select * from words where key >= 'inte' limit 10;"
it takes ~25ms
i could also use like/glob:
"select * from words where key like 'inte%';"
and i'm sure the length of result set will be < 10
iterating over that result set gives the anser to my question
what do you think of it?
thank you guys for your help, as i said in my original post i'm SQL novice and reading your replies i learnt something about LIKE/GLOB/susbtr and explain/explain query plan which i had no idea of
btw: is it possible in one query to get both prefixes(*) and words (*) ?
something like hypothetical:
"select * from prefixes where prefix = 'inte' + * from words where key
>= 'inte' limit 10"
Received on Fri Apr 24 2009 - 19:42:24 CEST