Re: Help with our Search Engine

From: Robert Klemme <bob.news_at_gmx.net>
Date: 10 Feb 2003 08:54:21 -0600
Message-ID: <b27sf3$1868sq$1_at_ID-52924.news.dfncis.de>


"Javiermontebruno" <javiermontebruno_at_yahoo.com> wrote in message news:8d849073.0302080657.56b50d94_at_posting.google.com...
> We've built a search engine (platform = ASP.NET/SQL Server 2000) that
> consists of the following components:
>
> -A spider that extracts all text found on 50,000 websites.
>
> -A parser that extracts different pieces (tags) of information
> into a database.
>
> -We are using SQL server full-text to index the data and then to
> perform the searches.
>
> -After the search is performed, we rank the websites for display.
> The ranking is basically determined by where the string was
> found; for example, a match in the title scores higher than a
> match in the body of the page.
>
> The engine outputs results OK; the problem is that it is slow. (It
> takes from 5 to 60 seconds on a server with a 1.2 GHz CPU.)
>
> Does anyone know of a system that would work better than SQL
> Server full-text search and that would enable our engine to output
> results faster?

Well, you could replace the SQL Server full-text index with an index table of your own. That would give you additional control over which words are being indexed. You could have a table like this:

docid (int), word (varchar(50)), count (int), with an index on word and count.
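To make that a bit more concrete, here is a rough sketch of such a table and a lookup against it. It uses Python's built-in sqlite3 module purely so it runs anywhere; on SQL Server the DDL would look much the same. The names word_index, ix_word_count, make_index and lookup are made up for illustration, and treating "an index on word and count" as one composite index is my assumption.

```python
import sqlite3

def make_index(conn):
    # One row per (document, word) pair, with the number of occurrences.
    conn.execute(
        "CREATE TABLE word_index (docid INTEGER, word VARCHAR(50), count INTEGER)"
    )
    # Assumption: a single composite index on (word, count).
    conn.execute("CREATE INDEX ix_word_count ON word_index (word, count)")

def lookup(conn, word):
    # Exact-match lookup on the indexed column; most frequent documents first.
    rows = conn.execute(
        "SELECT docid, count FROM word_index "
        "WHERE word = ? ORDER BY count DESC, docid",
        (word.lower(),),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")
make_index(conn)
conn.executemany(
    "INSERT INTO word_index VALUES (?, ?, ?)",
    [(1, "search", 3), (2, "search", 7), (2, "engine", 1)],
)
hits = lookup(conn, "search")  # list of (docid, count) tuples
```

A search then becomes an equality match on an indexed column, which a database can usually answer quickly.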

The downside is that you will have to parse every document yourself and fill this index. But you could fine-tune further by having one index table per document region (header, body, etc.).
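Filling the index and using one table per region could look roughly like the sketch below (Python with sqlite3 again, only for illustration). The region names, the weights, the stop-word list and all function names are assumptions of mine, not anything from your system; the weights just mimic your rule that a title hit counts more than a body hit.

```python
import re
import sqlite3
from collections import Counter

# Hypothetical regions and weights: a title hit counts five times a body hit.
REGION_WEIGHTS = {"title": 5, "body": 1}
# Hypothetical stop-word list; this is where you control what gets indexed.
STOP_WORDS = {"a", "an", "the", "and", "of"}

def tokenize(text):
    # Lower-case, split on non-alphanumerics, drop stop words.
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]

def create_tables(conn):
    # Table names come from the fixed REGION_WEIGHTS keys, never from user input.
    for region in REGION_WEIGHTS:
        conn.execute(f"CREATE TABLE index_{region} (docid INTEGER, word VARCHAR(50), count INTEGER)")
        conn.execute(f"CREATE INDEX ix_{region} ON index_{region} (word, count)")

def index_document(conn, docid, regions):
    # regions maps a region name (a key of REGION_WEIGHTS) to its raw text.
    for region, text in regions.items():
        counts = Counter(tokenize(text))
        conn.executemany(
            f"INSERT INTO index_{region} (docid, word, count) VALUES (?, ?, ?)",
            [(docid, word, n) for word, n in counts.items()],
        )

def search(conn, word):
    # Score = sum over regions of (region weight * occurrences in that region).
    scores = Counter()
    for region, weight in REGION_WEIGHTS.items():
        query = f"SELECT docid, count FROM index_{region} WHERE word = ?"
        for docid, n in conn.execute(query, (word.lower(),)):
            scores[docid] += weight * n
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

conn = sqlite3.connect(":memory:")
create_tables(conn)
index_document(conn, 1, {"title": "Search engine tips", "body": "How to tune a search engine"})
index_document(conn, 2, {"title": "Cooking", "body": "search search search for recipes"})
ranked = search(conn, "search")  # list of (docid, score), best first
```

The point of the per-region tables is that the ranking falls out of the lookup itself instead of being computed in a separate pass after the search.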

> We are looking for any kind of advice. We are willing to pay also.

Not necessary, thanks!

Regards

    robert

Received on Mon Feb 10 2003 - 15:54:21 CET
