Re: Db design for a flexible search
Date: 19 Dec 2005 18:15:50 -0800
Message-ID: <1135044950.857643.82510_at_z14g2000cwz.googlegroups.com>
ironcito wrote:
> Hello,
>
> I don't know if this is the appropiate newsgroup for this post, but
> it's the most related one I could find. Please advise if it isn't.
> I'm building a database containing music data, basically artist and
> tracks. For example, a record might be "14369, The Beatles, Help",
> where the first number is an ID key. I might later add things like
> albums, but it doesn't make a difference in this issue. The problem is
> that these names are often unusual, and I want to be able to build a
> search engine that is capable of overcoming this and displaying
> relevant results. For example, there's a song named "A.D.I.D.A.S.", and
> many people are likely to search for just "adidas". Or names like
> Beyoncé, P!nk, and such. I can see ways to overcome accented
> characters rather easily, but other cases are tougher. What would be
> the correct approach to this? I was thinking of adding an extra
> field(s) that would contain keywords that people are likely to search.
> Or is there a more automated way to solve this things?
> Any comment or idea is greatly appreciated. Thanks!
How you might go about this is database-implementation dependent, but here is one approach that solves the issue of punctuation and letter cases. There are also old soundex algorithms (and likely newer better ones) that can be further applied to this approach.
A soundex algorithm would do further manipulations such as stripping out vowels, equating d's and t's, etc. If you go the soundex route, then first showing exact matches (exact of the smashed-samecased version) and then soundex matches can be better than mixing in the "sounds like" with the "matches" in the list you return for selection by the user.
I hope that made sense. Cheers! --dawn Received on Tue Dec 20 2005 - 03:15:50 CET