Re: Keyword searches
Date: Mon, 16 Jul 2001 18:44:29 +0200
Message-ID: <3B5319ED.8050102_at_phobos.fs.tum.de>
Rico wrote:
> The area you've just discovered is "information retrieval" and there's a
> whole host of techniques you can use. This particular technique is a
> simplified boolean model, but it has problems if the document is missing
> some of the keywords, even though it may be highly relevant to what the user
> wants. It also won't offer any ranking to tell the user how relevant the
> document is to their query (since a doc either matches or doesn't match).
Well, since this is going to be a Gnutella client, I'm not really talking about text documents... :-) Basically, the list of "keywords" associated with each file is the filename, with all special characters replaced by word delimiters, plus any title information (which is read out of the files).
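A minimal sketch of the keyword extraction described above, in Python. The function name `keywords` and its `title` parameter are made up for illustration, and lowercasing the words is an assumption taken from the libdb idea in this message; how the title is actually read out of a file is left out entirely.

```python
import re


def keywords(filename, title=""):
    """Return the set of lowercased keywords for one shared file.

    `title` stands for whatever title information was read out of the
    file; extracting it is outside the scope of this sketch.
    """
    text = f"{filename} {title}".lower()
    # Treat every run of non-alphanumeric characters (including "_")
    # as a single word delimiter, then drop empty fragments.
    return {word for word in re.split(r"[\W_]+", text) if word}
```

For example, `keywords("Some_Band-My.Song(live).mp3", "My Song")` yields the words `some`, `band`, `my`, `song`, `live` and `mp3`. Note that the file extension ends up as a keyword too, which may or may not be wanted.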
So what I really need is a search function that does exactly this simple boolean matching, and I'm just wondering what would be the best way to implement it. Another idea would be to use Sleepycat libdb3 (which I already use for other data) and feed it with lowercased words. The problem with this approach is that I need to AND the results together myself, which may be costly.
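A rough sketch of what ANDing the per-word results could look like, assuming a word -> file-ids mapping. The in-memory dict here only stands in for whatever store holds the mapping; it does not use the libdb API, and the class and method names are hypothetical. Intersecting the smallest result set first keeps the cost bounded by the rarest keyword, which is one way to keep the AND step cheap.

```python
from collections import defaultdict


class KeywordIndex:
    """Hypothetical in-memory inverted index: word -> set of file ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, file_id, words):
        # Store every keyword lowercased so lookups are case-insensitive.
        for word in words:
            self.postings[word.lower()].add(file_id)

    def search(self, query_words):
        """Return the ids of files that contain ALL query words."""
        # .get() avoids creating empty entries for unknown words.
        sets = [self.postings.get(w.lower(), set()) for w in query_words]
        if not sets:
            return set()
        # Start from the rarest word so the working set stays small.
        sets.sort(key=len)
        result = set(sets[0])
        for other in sets[1:]:
            result &= other
            if not result:
                break  # no file can match any more
        return result
```

With two files indexed as `add(1, ["some", "band", "song", "mp3"])` and `add(2, ["other", "band", "song", "ogg"])`, a search for `["band", "mp3"]` returns only file 1, and any query containing an unknown word returns the empty set.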
Simon
Received on Mon Jul 16 2001 - 18:44:29 CEST