comp.databases.theory: Re: "Fuzzy" text search using n-grams (bigrams) -- speed?
crazyhorse wrote:
...
> I think the string distance is too expensive to compute in a database,
> and again, stemming is not really what I need.
>
> For this application, it's also not just misspelled words -- it's
> skipping a word in the movie title, or using an alternate form (e.g.
> "Stephen" or "Steven"), or specifying a longer version when we only
> have a shorter title in our database. I'm looking for a fuzzy,
> flexible search in general that can be implemented in a database.
>
> No real strong ideas for this, huh?
>
How about a multi-layer approach?
For example, map every name that has several common forms to a single form, so that "Stephen" and "Steven" compare equal. Do spelling correction with a US English dictionary. Delete articles ("the", "a" ...).
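
A minimal sketch of those layers in Python, applied the same way to stored titles and to incoming queries. Everything named here is an assumption made for illustration: the tiny NAME_VARIANTS, ARTICLES and DICTIONARY tables, the 0.8 similarity cutoff, and the use of difflib as a stand-in for a real spelling corrector. The normalized string could be kept in an extra column and compared exactly or fed to the n-gram match.

```python
import difflib
import re

# Hypothetical lookup tables; a real system would load far larger ones.
NAME_VARIANTS = {"steven": "stephen", "steve": "stephen", "katherine": "catherine"}
ARTICLES = {"the", "a", "an"}
DICTIONARY = sorted(
    {"stephen", "catherine", "king", "return", "lord", "rings", "of"} | ARTICLES
)


def correct_spelling(word, cutoff=0.8):
    """Return the closest dictionary word, or the word unchanged if none is close."""
    matches = difflib.get_close_matches(word, DICTIONARY, n=1, cutoff=cutoff)
    return matches[0] if matches else word


def normalize_title(title):
    """Apply the layers in order: name forms, spelling, article removal."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    normalized = []
    for word in words:
        word = NAME_VARIANTS.get(word, word)  # layer 1: one form per name
        word = correct_spelling(word)         # layer 2: spelling correction
        if word in ARTICLES:                  # layer 3: drop articles
            continue
        normalized.append(word)
    return " ".join(normalized)


if __name__ == "__main__":
    for raw in ("The Return of the King", "Retrun of the Kng", "Steven King"):
        print(repr(raw), "->", repr(normalize_title(raw)))
```

Running the name layer first means the spelling layer only has to know one form of each name; the order is a design choice, not something the approach requires.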
Patricia

Received on Thu Oct 25 2007 - 10:06:03 CDT