Re: general design issue

From: paul c <toledobythesea_at_oohay.ac>
Date: Fri, 24 Apr 2009 15:49:21 GMT
Message-ID: <5clIl.23924$Db2.16398_at_edtnps83>


skink wrote:
> On 23 Kwi, 23:28, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:

>> skink wrote:
>>
>>> Bob Badour write:
>>>> Other than being non-standard, what is your objection to using GLOB
>>>> 'prefix*' ?
>>> i'm sorry i overlooked GLOB in the previous post. ok gave it a try on
>>> my sqlute and results are the same as LIKE: ~1200 ms...
>> How many distinct items with the same prefix do you have? Presumably you
>> have no more than about a couple dozen with the same prefix plus one
>> character.
>>
>> Are the prefixes you use always the same length?

>
>
>
> ok, my data is typical english words dictionary. right now i have
> ~28000 words.
>
>
>
> when building it i counted that the number of distinct prefixes is
> ~105000
>
>
>
> prefix here means any substr(WORD, 1, N) where N is len(WORD), an WORD
> is any word in database
>
>
> for example for words 'internet' and 'intention' i would have
> prefixes:
>
> i
> in
> int
> inte
> inten
> intent
> intenti
> intentio
> intention
> inter
> intern
> interne
> internet
>
>
> basically idea of my app is that user wants to find some key, lets say
> its 'internet'.
>
>
>
> when it presses that key he/she enters partial key (prefix) like 'i',
> 'in', 'int', 'inte' etc.
>
>
>
> and during that operation i'd like to show only the buttons that lead
> to valid (existing in database word)

Okay, maybe this will help. I found a list of 58112 words called 'corncob' or somesuch and put it into an sqlite table (no idea what version of sqlite it is, but it will be whatever the latest version of Firefox ships with). When I defined the table, I didn't specify an index, but I did specify that the text column called 'key' is 'primary'.   Just to be closer to your situation, I added a second text column called 'value' and populated it with the text 'value'. (It seems there are a couple of duplicates in the list or there is some problem with sqlite indexing because I had to use an 'INSERT OR IGNORE...' statement to populate, which resulted in a table with 58109 rows.)

Here are my results, using statements that might be closer to what you want:

select distinct substr(key,5,1) from words where key GLOB 'inte*' had an ET of 9

returned the letters g,l,m,n,r,s

whereas

select distinct substr(key,5,1) from words where key like 'inte%' had an ET of 314

So, about a 30:1 difference between LIKE and GLOB.

BTW, this list has 331 words that start with the sequence 'inte'. Received on Fri Apr 24 2009 - 17:49:21 CEST

Original text of this message