Re: general design issue
From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Fri, 24 Apr 2009 12:56:19 -0300
Message-ID: <49f1e125$0$5501$9a566e8b_at_news.aliant.net>
>
> Okay, maybe this will help. I found a list of 58112 words called
> 'corncob' or somesuch and put it into an sqlite table (no idea what
> version of sqlite it is, but it will be whatever the latest version of
> Firefox ships with). When I defined the table, I didn't specify an
> index, but I did specify that the text column called 'key' is 'primary'.
> Just to be closer to your situation, I added a second text column
> called 'value' and populated it with the text 'value'. (It seems there
> are a couple of duplicates in the list or there is some problem with
> sqlite indexing because I had to use an 'INSERT OR IGNORE...' statement
> to populate, which resulted in a table with 58109 rows.)
>
> Here are my results, using statements that might be closer to what you
> want:
>
> select distinct substr(key,5,1) from words where key GLOB 'inte*' had
> an ET of 9
>
> returned the letters g,l,m,n,r,s
>
> whereas
>
> select distinct substr(key,5,1) from words where key like 'inte%' had
> an ET of 314
>
> So, about a 30:1 difference between LIKE and GLOB.
>
> BTW, this list has 331 words that start with the sequence 'inte'.
Date: Fri, 24 Apr 2009 12:56:19 -0300
Message-ID: <49f1e125$0$5501$9a566e8b_at_news.aliant.net>
paul c wrote:
> skink wrote:
>
>> On 23 Kwi, 23:28, Bob Badour <bbad..._at_pei.sympatico.ca> wrote: >> >>> skink wrote: >>> >>>> Bob Badour write: >>>> >>>>> Other than being non-standard, what is your objection to using GLOB >>>>> 'prefix*' ? >>>> >>>> i'm sorry i overlooked GLOB in the previous post. ok gave it a try on >>>> my sqlute and results are the same as LIKE: ~1200 ms... >>> >>> How many distinct items with the same prefix do you have? Presumably you >>> have no more than about a couple dozen with the same prefix plus one >>> character. >>> >>> Are the prefixes you use always the same length? >> >> ok, my data is typical english words dictionary. right now i have >> ~28000 words. >> >> when building it i counted that the number of distinct prefixes is >> ~105000 >> >> prefix here means any substr(WORD, 1, N) where N is len(WORD), an WORD >> is any word in database >> >> >> for example for words 'internet' and 'intention' i would have >> prefixes: >> >> i >> in >> int >> inte >> inten >> intent >> intenti >> intentio >> intention >> inter >> intern >> interne >> internet >> >> basically idea of my app is that user wants to find some key, lets say >> its 'internet'. >> >> when it presses that key he/she enters partial key (prefix) like 'i', >> 'in', 'int', 'inte' etc. >> >> and during that operation i'd like to show only the buttons that lead >> to valid (existing in database word)
>
> Okay, maybe this will help. I found a list of 58112 words called
> 'corncob' or somesuch and put it into an sqlite table (no idea what
> version of sqlite it is, but it will be whatever the latest version of
> Firefox ships with). When I defined the table, I didn't specify an
> index, but I did specify that the text column called 'key' is 'primary'.
> Just to be closer to your situation, I added a second text column
> called 'value' and populated it with the text 'value'. (It seems there
> are a couple of duplicates in the list or there is some problem with
> sqlite indexing because I had to use an 'INSERT OR IGNORE...' statement
> to populate, which resulted in a table with 58109 rows.)
>
> Here are my results, using statements that might be closer to what you
> want:
>
> select distinct substr(key,5,1) from words where key GLOB 'inte*' had
> an ET of 9
>
> returned the letters g,l,m,n,r,s
>
> whereas
>
> select distinct substr(key,5,1) from words where key like 'inte%' had
> an ET of 314
>
> So, about a 30:1 difference between LIKE and GLOB.
>
> BTW, this list has 331 words that start with the sequence 'inte'.
In either case, did it take longer than a second to return the results? Received on Fri Apr 24 2009 - 17:56:19 CEST