Re: algorithms for selecting on non-key/non-indexed field
Date: 2000/07/02
Message-ID: <06596bf8.998cf2a8_at_usw-ex0104-087.remarq.com>#1/1
Marc Tardif <intmktg_at_Gloria.CAM.ORG> wrote:
>Aha, I think I finally understood what you mean. To search nonindexed
>fields using BDB, your best bet would be to use the recno
access method.
>You'll notice the format for this is like a flat text file
where each
>record is on its own line. Because of this simple structure,
you can then
>easily use a quick algorithm like some Boyer-Moore derivative
to search
>through all the records.
Ya this is what I'm talking about =) I was using Cursors in BDB
and running through the set of records using the recno format.
The data I was pulling in was cast to my C structure. At that
point I did my check to see if this was the right record. The
problem was that the cursor takes 18 seconds to run through a
million entries. This seems slow to me. Now to comment on your
suggestion: You would use a good string-matching algorithm and
run through the file looking for a match. Once you've found a
match, you need then to figure out what record number you are
at, given a worse case situation where you are dealing with
arbitrary sized records? It would seem that I wouldn't want to
use BDB Cursors, just open the file myself and rip through it?
>
>Of course, if you need one indexed field, you'll have to use a
btree or
Exactly. This is what I was thinking of doing when one
explicitly created an index on the table.
>hash which will point to a record in the recno table. That has
a problem
>of its own considering the key in a recno table might change
after
>deleting. The solution for this is to never delete a record in
the recno
>table and, instead, keep a fifo of the records which you want
removed.
>Then, when you add new records, check the fifo first to see if
some
>records are available for overwriting and, if not, create a new
record.
Got questions? Get answers over the phone at Keen.com.
Up to 100 minutes free!
http://www.keen.com
Received on Sun Jul 02 2000 - 00:00:00 CEST