Re: Search Engine
Date: 5 Nov 2002 10:02:09 -0600
Message-ID: <3dc7eaf9$0$76281$45beb828_at_newscene.com>
Stu:
For example, "apple", 3, 5, 8, 26, 97, 127, 456, ....
Then when you search it is easy to scan this list for the records that match your query.
A simple way to achieve this is to write out the word to a file followed by
the
record number of the record it is in. Do this for each and every word you
want to "index." Note: While you are doing this, you will see that they
are going
out in record number order. When you are finished you want to sort this
file
based on the word as the primary key and the record number as the secondary
key.
You will notice that once your sort is complete that your file is now
"inverted"
or in word order rather than record order. (That is why they call it an
"inverted" file.)
You can then further process this file by removing all but the first
occurance of each
word giving a list in the form of:
"apple", 3, 5, 8, 26, 97, 127, 456, ....
Now you are ready to begin searching.....
A traditional database has far too many inefficiencies in it to make it
appropriate for
doing searches such as you are describing.. At least that is the case if
your database
gets to be of any significant size.
-Art
-- Art Pollard http://www.lextek.com/ Suppliers of High Performance Text Retrieval Engines.Received on Tue Nov 05 2002 - 17:02:09 CET