Re: High Speed Text Searching Algorithms...
Date: 2000/07/23
Message-ID: <Mrye5.4692$Y5.160060_at_news1.sshe1.sk.home.com>#1/1
In comp.lang.c++ Derrick Coetzee <dc_at_moonflare.com> wrote:
> I know nothing about the reality of the subject myself, but it seems to me
> the best way is probably to assign each page a number, then add the numbers
> of all pages containing a certain word to that word's "page list". In this
> way, you can build up a sorted database of words:
> chicken 5 1 2 3 6 7
> apple 4 11 5
> monster 15 4 9 2
> Then you can treat these as sets... if they specify two words, you find the
> intersection:
> +chicken +apple
> {4 11 5} intersection {5 1 2 3 6 7} = { 5 }
> You can also order each list by how relevant each page is to that
> particular word.
> However, this idea works very badly for quoted multiword searches, unless
> you put in entries for each set of multiple words, and that'd get quite
> excessive.
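
For what it's worth, the scheme quoted above (one page list per word, intersected for a "+a +b" query) can be sketched in a few lines of C++. This is only a rough illustration under my own assumptions: the names Index, PageList, add_page, pages_with_all and example_index are made up for the example, and it keeps each list sorted by page number, not by relevance, so that std::set_intersection can be used.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <string>
#include <vector>

// All names here are hypothetical; none come from the quoted post.
using PageList = std::set<int>;                 // page numbers, kept sorted
using Index = std::map<std::string, PageList>;  // word -> pages containing it

// Record that `word` occurs on page `page`.
void add_page(Index& index, const std::string& word, int page) {
    index[word].insert(page);
}

// Pages that contain every word in `words` (the "+a +b" style query).
// An empty query or an unknown word yields an empty result.
std::vector<int> pages_with_all(const Index& index,
                                const std::vector<std::string>& words) {
    if (words.empty()) return {};
    auto first = index.find(words.front());
    if (first == index.end()) return {};
    std::vector<int> result(first->second.begin(), first->second.end());
    for (std::size_t i = 1; i < words.size() && !result.empty(); ++i) {
        auto it = index.find(words[i]);
        if (it == index.end()) return {};
        std::vector<int> next;
        // Both ranges are sorted, as std::set_intersection requires.
        std::set_intersection(result.begin(), result.end(),
                              it->second.begin(), it->second.end(),
                              std::back_inserter(next));
        result.swap(next);
    }
    return result;
}

// Builds the three page lists given in the quoted post.
Index example_index() {
    Index index;
    for (int p : {5, 1, 2, 3, 6, 7}) add_page(index, "chicken", p);
    for (int p : {4, 11, 5})         add_page(index, "apple", p);
    for (int p : {15, 4, 9, 2})      add_page(index, "monster", p);
    return index;
}
```

Calling pages_with_all(example_index(), {"chicken", "apple"}) gives the single page 5, matching the intersection worked out in the quote. Ordering the lists by relevance would need a different representation, since the intersection step here relies on both lists being sorted by page number.
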
-Adam
Received on Sun Jul 23 2000 - 00:00:00 CEST