Theoretically best search algorihtms

From: Thomas-Milius <Thomas-Milius_at_t-online.de>
Date: 2000/01/16
Message-ID: <5ff1708149%Thomas-Milius_at_Thomas-Milius.t-online.de>#1/1


Could anyone give me the solution or a hint where to find the solution of following problem.

If you have datasets inside a DB with datafields and you want to search your sets by the contents of one or more fields (eg SELECT ... WHERE ...). What are the best known algorithms to do such a search in a fast way? Solution is simple if you build an index for one field, even at more fields. But all this solutions have the disadvantage that you must know which fields will be searched in future. If there is only one field missing inside a multi field index, I am thinking this index can't be used at fast search. Of course you can generate an index for each cobinations of fields, but this would cause an exponential number of indexes!?!

Example table:

    Name Surname City Age

  1. Thomas Milius Stade 32
  2. Thomas Miller London 20
  3. Peter Miller Cambridge 25

Example Queries:
1. Who has the Name = Thomas (Simple: build Index on "Name") 2. Who has the Name = Thomas and Surname = Miller and lives

   in City = London (Simple: build multi field index on    "Name", "Surname" and "City")

  1. + 2. can be done in log(n) by binary search of the according index.
  2. Who has the Name = Thomas and lives in City = Stade

Index on "Name" is the same as using index on "Name", "Surname" and "City". Because "Surname" is undetermined binary search will reduce the possible results to datasets 1. and 2.. This datasets will have to be searched set by set on the value of "City". Of course there would be the possibility to create a third index. But at bigger tables this could increase the number of indexes in an inacceptable way.

What is the best theoretical solution to this problem? How are good database managing this problem?

Best Regards

Thomas Milius Received on Sun Jan 16 2000 - 00:00:00 CET

Original text of this message