Re: full text indexing

From: Troels Arvin <troels_at_arvin.dk>
Date: Mon, 08 Aug 2005 09:47:58 +0200
Message-ID: <pan.2005.08.08.07.47.57.761539_at_arvin.dk>


On Thu, 28 Jul 2005 23:58:34 +0200, Stefan Rybacki wrote:

> Do you have any good sources (books, links, ...) related to full
> text indexing?

The following books/papers are nice, but lack coverage of the new (2003) discoveries about suffix array construction:

http://www.cs.helsinki.fi/juha.karkkainen/publications/amh_chapter7.pdf
http://www.aw-bc.com/catalog/academic/product/0,1144,0201398397,00.html
http://www.daimi.au.dk/~large/Paperpages/dshandbook02.htm
http://www.daimi.au.dk/~large/Paperpages/invitedalenex03.htm

One article covering the "new discoveries" about suffix array construction:
http://www.siam.org/meetings/alenex05/papers/08rdementiev.pdf

About "String B-trees":
http://scholar.google.com/url?sa=U&q=http://www.acm.org/jacm/papers/1083.ps
(String B-trees are probably better than suffix arrays for RDBMS uses, because String B-trees are inherently "dynamic", in contrast to suffix arrays, which are static.)
(I believe that MySQL's full text indexes are implemented with String B-trees.)
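
To illustrate why a suffix array counts as "static", here is a minimal Python sketch. It assumes a naive sort-based construction, which is far slower than the construction algorithms covered in the papers above, and it is not a String B-tree. The function names build_suffix_array and find_occurrences are made up for this illustration.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions by the suffix text.

    Illustrative only; the papers above describe much faster algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of pattern, using binary search on sa."""
    m = len(pattern)

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)                                 # [5, 3, 1, 0, 4, 2]
print(find_occurrences(text, sa, "ana"))  # [1, 3]

# Appending a single character changes the relative order of the
# existing suffixes, so this sketch simply rebuilds the whole array.
print(build_suffix_array(text + "s"))     # [1, 3, 5, 0, 2, 4, 6]
```

Note how suffixes 5, 3, 1 become 1, 3, 5 after the append: even a small update can reorder existing entries, which is awkward for tables that change often. A dynamic structure such as a String B-tree is designed to absorb inserts and deletes without a full rebuild.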

An "old", respected classic, but not very good on how to handle large texts (on disk):
http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=0521585198

-- 
Greetings from Troels Arvin, Denmark
Received on Mon Aug 08 2005 - 09:47:58 CEST
