Re: general design issue

From: Bob Badour <bbadour_at_pei.sympatico.ca>
Date: Fri, 24 Apr 2009 12:56:19 -0300
Message-ID: <49f1e125$0$5501$9a566e8b_at_news.aliant.net>

paul c wrote:

> skink wrote:
>

>> On 23 Kwi, 23:28, Bob Badour <bbad..._at_pei.sympatico.ca> wrote:
>>
>>> skink wrote:
>>>
>>>> Bob Badour write:
>>>>
>>>>> Other than being non-standard, what is your objection to using GLOB
>>>>> 'prefix*' ?
>>>>
>>>> i'm sorry i overlooked GLOB in the previous post. ok gave it a try on
>>>> my sqlute and results are the same as LIKE: ~1200 ms...
>>>
>>> How many distinct items with the same prefix do you have? Presumably you
>>> have no more than about a couple dozen with the same prefix plus one
>>> character.
>>>
>>> Are the prefixes you use always the same length?
>>
>> ok, my data is typical english words dictionary. right now i have
>> ~28000 words.
>>
>> when building it i counted that the number of distinct prefixes is
>> ~105000
>>
>> prefix here means any substr(WORD, 1, N) where N is len(WORD), an WORD
>> is any word in database
>>
>>
>> for example for words 'internet' and 'intention' i would have
>> prefixes:
>>
>> i
>> in
>> int
>> inte
>> inten
>> intent
>> intenti
>> intentio
>> intention
>> inter
>> intern
>> interne
>> internet
>>
>> basically idea of my app is that user wants to find some key, lets say
>> its 'internet'.
>>
>> when it presses that key he/she enters partial key (prefix) like 'i',
>> 'in', 'int', 'inte' etc.
>>
>> and during that operation i'd like to show only the buttons that lead
>> to valid (existing in database word)

>
> Okay, maybe this will help. I found a list of 58112 words called
> 'corncob' or somesuch and put it into an sqlite table (no idea what
> version of sqlite it is, but it will be whatever the latest version of
> Firefox ships with). When I defined the table, I didn't specify an
> index, but I did specify that the text column called 'key' is 'primary'.
> Just to be closer to your situation, I added a second text column
> called 'value' and populated it with the text 'value'. (It seems there
> are a couple of duplicates in the list or there is some problem with
> sqlite indexing because I had to use an 'INSERT OR IGNORE...' statement
> to populate, which resulted in a table with 58109 rows.)
>
> Here are my results, using statements that might be closer to what you
> want:
>
> select distinct substr(key,5,1) from words where key GLOB 'inte*' had
> an ET of 9
>
> returned the letters g,l,m,n,r,s
>
> whereas
>
> select distinct substr(key,5,1) from words where key like 'inte%' had
> an ET of 314
>
> So, about a 30:1 difference between LIKE and GLOB.
>
> BTW, this list has 331 words that start with the sequence 'inte'.

In either case, did it take longer than a second to return the results? Received on Fri Apr 24 2009 - 17:56:19 CEST

This message: [ Message body ]
Next message: paul c: "Re: storing survey answers of different data types"
In reply to paul c: "Re: general design issue"
Next in thread: paul c: "Re: general design issue"

Contemporary messages sorted: [ by date ] [ by thread ] [ by subject ] [ by author ]

Original text of this message