From: Troels Arvin <troels@arvin.dk>
Newsgroups: comp.databases.theory
Subject: Re: full text indexing
Date: Mon, 08 Aug 2005 09:47:58 +0200
Organization: UNI-C
Lines: 32
Message-ID: <pan.2005.08.08.07.47.57.761539@arvin.dk>
References: <3kt2obF106t78U1@individual.net>

On Thu, 28 Jul 2005 23:58:34 +0200, Stefan Rybacki wrote:

> Do you have any good sources (books, links, ...) related to full
> text indexing?

The following books/papers are nice, but they do not cover the new (2003)
discoveries about the construction of suffix arrays:

http://www.cs.helsinki.fi/juha.karkkainen/publications/amh_chapter7.pdf
http://www.aw-bc.com/catalog/academic/product/0,1144,0201398397,00.html
http://www.daimi.au.dk/~large/Paperpages/dshandbook02.htm
http://www.daimi.au.dk/~large/Paperpages/invitedalenex03.htm

One article covering the "new discoveries" about suffix array
construction:
http://www.siam.org/meetings/alenex05/papers/08rdementiev.pdf

About "String B-trees":
http://scholar.google.com/url?sa=U&q=http://www.acm.org/jacm/papers/1083.ps
(String B-trees are probably better than suffix arrays for RDBMS uses,
because String B-trees are inherently "dynamic", in contrast to suffix
arrays, which are static.)
(I believe that MySQL's full text indexes are implemented with String
B-trees.)
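
To make the "static" remark about suffix arrays concrete, here is a small
Python sketch. It is only an illustration under my own assumptions: the
function names are made up, and the construction is the naive
sort-all-suffixes approach, not one of the 2003 algorithms and not anything
taken from the papers listed here.

```python
def build_suffix_array(text):
    # Naive construction: sort the suffix start positions by suffix text.
    # Fine for a demo, far too slow and memory-hungry for large texts.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    # All suffixes starting with `pattern` are adjacent in the sorted order,
    # so two binary searches give the boundaries of that range.
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Return the match positions in text order.
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)
print(find_occurrences(text, sa, "ana"))

# Appending a single character shifts the rank of every suffix here,
# so the whole array has to be rebuilt rather than patched.
print(build_suffix_array(text + "s"))
```

Lookups are cheap once the array exists, but even a one-character change
to the text can reorder the whole array, which is the sense in which the
structure is static.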

An "old", respected classic, but not very good on how to handle large
texts (on disk):
http://www.cambridge.org/uk/catalogue/catalogue.asp?isbn=0521585198

-- 
Greetings from Troels Arvin, Denmark

