Re: general design issue
Date: Fri, 24 Apr 2009 15:49:21 GMT
Message-ID: <5clIl.23924$Db2.16398_at_edtnps83>
skink wrote:
> On 23 Kwi, 23:28, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>> skink wrote: >> >>> Bob Badour write: >>>> Other than being non-standard, what is your objection to using GLOB >>>> 'prefix*' ? >>> i'm sorry i overlooked GLOB in the previous post. ok gave it a try on >>> my sqlute and results are the same as LIKE: ~1200 ms... >> How many distinct items with the same prefix do you have? Presumably you >> have no more than about a couple dozen with the same prefix plus one >> character. >> >> Are the prefixes you use always the same length?
>
>
>
> ok, my data is typical english words dictionary. right now i have
> ~28000 words.
>
>
>
> when building it i counted that the number of distinct prefixes is
> ~105000
>
>
>
> prefix here means any substr(WORD, 1, N) where N is len(WORD), an WORD
> is any word in database
>
>
> for example for words 'internet' and 'intention' i would have
> prefixes:
>
> i
> in
> int
> inte
> inten
> intent
> intenti
> intentio
> intention
> inter
> intern
> interne
> internet
>
>
> basically idea of my app is that user wants to find some key, lets say
> its 'internet'.
>
>
>
> when it presses that key he/she enters partial key (prefix) like 'i',
> 'in', 'int', 'inte' etc.
>
>
>
> and during that operation i'd like to show only the buttons that lead
> to valid (existing in database word)
Okay, maybe this will help. I found a list of 58112 words called
'corncob' or somesuch and put it into an sqlite table (no idea what
version of sqlite it is, but it will be whatever the latest version of
Firefox ships with). When I defined the table, I didn't specify an
index, but I did specify that the text column called 'key' is 'primary'.
Just to be closer to your situation, I added a second text column
called 'value' and populated it with the text 'value'. (It seems there
are a couple of duplicates in the list or there is some problem with
sqlite indexing because I had to use an 'INSERT OR IGNORE...' statement
to populate, which resulted in a table with 58109 rows.)
Here are my results, using statements that might be closer to what you want:
select distinct substr(key,5,1) from words where key GLOB 'inte*' had
an ET of 9
returned the letters g,l,m,n,r,s
whereas
select distinct substr(key,5,1) from words where key like 'inte%' had
an ET of 314
So, about a 30:1 difference between LIKE and GLOB.
BTW, this list has 331 words that start with the sequence 'inte'.