Re: Search Engine

From: Bob Hairgrove <rhairgroveNoSpam_at_Pleasebigfoot.com>
Date: Sat, 02 Nov 2002 13:47:19 GMT
Message-ID: <3dc3d5b6.13994442_at_news.ch.kpnqwest.net>


On Fri, 1 Nov 2002 16:54:09 -0000, "stu" <smcgouga_at_nospam.co.uk> wrote:

>I am trying to design a search engine. Im building a prototype in access
>2000 to index urls. I have the following:
>
>Each URL has many keywords. A keyword could be related to more than one
>URL.
>
>I want to run a search that returns urls given a list of keywords to search
>on.
>
>* = pri key
>table Keyword(*ID, Keyword)
>table URL(*ID, URL)
>table URL_Keyword(*ID, fk_Keyword, fk_URL)
>
>URL 1----M URL_Keyword M-----1 Keyword
>
>search query:
>SELECT URL.URL, Count([keyword].[id]) AS Hits
>FROM URL INNER JOIN (Keyword INNER JOIN URL_Keyword ON Keyword.ID =
>URL_Keyword.fk_Keyword) ON URL.ID = URL_Keyword.fk_URL
>WHERE Keyword.Keyword= [Search1] or keyword = [Search2] or ..........etc
>GROUP BY url.url
>ORDER BY Count([keyword].[id]) DESC;
>
>Have i got the right idea? or is there a more efficient way to do this?
>
>I could add a description field to url then use KeyWordInContext (finding
>keywords delim by spaces) to an entry to Keyword (I should not use 'index'
>It could confuse).
>
>Thoughts?
>
>Cheers
>Stu
>
>

It depends on what you mean by a "hit". Are you counting multiple occurrences of the same keyword in the same URL? If so, I guess it's OK ... but what significance is "ID" in that case? Maybe a timestamp column or session ID or IP address of the visiting party would make more sense.

However, if you want to have just one record for each URL where a certain keyword might occur, then I would take ID out of the linking table and make the combination of the two foreign keys the primary key.

HTH Bob Hairgrove
rhairgroveNoSpam_at_Pleasebigfoot.com Received on Sat Nov 02 2002 - 14:47:19 CET

Original text of this message