Keyword searches
From: Simon Richter <geier_at_phobos.fs.tum.de>
Date: Sun, 15 Jul 2001 22:31:50 +0200
Message-ID: <Pine.LNX.4.31.0107152159000.2609-100000_at_phobos.fachschaften.tu-muenchen.de>
Is there a better way to do keyword searches?
What is the most effective way to search through this data? I think it
is an n-dimensional tree, with n being the number of bits, but I could
be wrong.
I'd like to keep my database in a mmap()ed file, so that I don't need
to regenerate the database each time the program is started. Do you
have any hints on how to arrange the data in this file (current plan
below)?
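For illustration, a minimal sketch of mapping such a database file read-only might look like the following. The function name map_database is hypothetical, the file is assumed to be already open, and error handling is reduced to returning NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Hypothetical helper: map the whole file behind fd read-only.
 * Returns the base address, or NULL on failure.
 * On success the file size is stored in *len. */
const void *map_database(int fd, size_t *len)
{
    struct stat st;
    void *base;

    assert(len != NULL);

    /* An empty file cannot be mapped, so treat it as a failure. */
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
        return NULL;

    base = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    if (base == MAP_FAILED)
        return NULL;

    *len = (size_t)st.st_size;
    return base;
}
```

Because the mapping address can differ between runs, anything stored inside the file has to be an offset relative to the start of the mapping rather than a raw pointer.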
Hi,
Simon
File layout:
Part 1: Search tree
    struct node {
        long offset_if_zero;    /* followed when the current hash bit is 0 */
        long offset_if_one;     /* followed when the current hash bit is 1 */
    };
If an offset is zero, there is no data that matches on that branch. If an offset is negative, (-offset)-1 points into part 2 of the file.
Part 2: Lists of nodes with equal hashes
-> Zero-terminated lists of offsets into part three.
Part 3: Data
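
A lookup over part 1 of this layout could be sketched roughly as follows. Several details are assumptions not stated above: positive offsets are treated as indices into the node array, with the root at index 0 (so 0 is free to mean "no match"), and the hash bits are consumed from the most significant of the nbits bits downward. The name find_list is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct node {
    long offset_if_zero;
    long offset_if_one;
};

/* Walk the bit tree for the low nbits bits of hash.
 * Returns the offset into part 2 of the matching list,
 * or -1 if no data matches. */
long find_list(const struct node *nodes, size_t node_count,
               unsigned long hash, unsigned nbits)
{
    size_t cur = 0;     /* assumption: the root is node 0 */
    unsigned i;

    assert(nodes != NULL && node_count > 0);
    assert(nbits <= sizeof hash * CHAR_BIT);

    for (i = 0; i < nbits; i++) {
        unsigned bit = (unsigned)((hash >> (nbits - 1 - i)) & 1UL);
        long next = bit ? nodes[cur].offset_if_one
                        : nodes[cur].offset_if_zero;

        if (next == 0)
            return -1;              /* zero: nothing matches */
        if (next < 0)
            return -(next + 1);     /* same value as (-next)-1 */
        if ((size_t)next >= node_count)
            return -1;              /* out of range: treat as no match */
        cur = (size_t)next;
    }
    return -1;  /* ran out of bits without reaching a list */
}
```

The returned value would then be used to read the zero-terminated list in part 2, whose entries are offsets into part 3.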
--
GPG public key available from http://phobos.fs.tum.de/pgp/Simon.Richter.asc
Fingerprint: DC26 EB8D 1F35 4F44 2934 7583 DBB6 F98D 9198 3292
Hi! I'm a .signature virus! Copy me into your ~/.signature to help me spread!
Received on Sun Jul 15 2001 - 22:31:50 CEST