Bit-Mapped Indexing.....(Beta Testers Wanted)
Date: 1996/12/06
Message-ID: <19961206023100.VAA12135_at_ladder01.news.aol.com>#1/1
Bit-Mapped Indexing.....(Beta Testers Wanted) -------------------------------------------------------------------- Bit-mapped indexing is still a relatively new concept in the DB field.It is with this in mind that I set out to produce a library of functions for creating, managing and manipulating bit-maps. Although the routines could be used for manipulating any form of bitmap, they were specifically designed for bit-mapped indexing.
There are a number of factors that make these routines superior to those currently used by Sysbase, Oracle, Redbrick and the other DB vendors that have implemented this BM (bitmap) technology:
- The ability store 100,000 bits in 14K. This is WORST case senario. At best, 100,000 bits can be stored in 34 bytes.
- This is THE major advantage - ALL THE ROUTINES ACT UPON THE COMPRESSED DATA, therefore:
- No de-compression time. ii) Extremely fast data manipulation because of the size of the data set being manipulated.
- This is another major advantage - THE ROUTINES OVERCOME THE PROBLEMS ASSOCIATED WITH HIGH CARDINALITY DATA. This is basically acheived by breaking the data into sections. If there is no data in a particular section, a BM for that section is not stored. E.g. A bit-map is divided into 10 sections, each representing 10,000 bit positions, i.e. a 100,000 bit BM. If position 1, 10,000, and 100,000 is set, only 3 - 6 bytes at most would be required to store the BM positions. (This excludes section header information, which is about 8 bytes for every section, i.e. 24 bytes in the example above)
- Because of the fact that the bitmaps are devided into sections, the problem of having to lock entire bitmaps, before they can be updated, is eliminated, as you will only need to lock the section that needs to be updated
The BM functions include functions for:
- ANDing
- ORing
- XORing
- NOTing
- ORXing (deleting)
- XNORing
- NANDing
- NORIng
- Finding
- Search Next
- Search Previous
- Adding
- Deleting
- Appending..... and more.
The routines are able to perform various logical operations (1 - 8), on 2
source
data sets of 1,000,000 bits each, and produce a result data set in 0.2
seconds for any single operation. Once again this is WORST case
senario. Best case... I am unable to time them, as the timing function
returns 0.0000. My test have been conducted on a Compaq 90Mz Pentium
with 32MB memory.
A beta version of these routines will be available by 1 January 1997. The routines will comprise of a 16-bit Windows library, a 32-bit Windows library, a DOS library, and 16 & 32 Bit Windows DLL's and will be shipped complete with on-line documentation and help.(The routines were written in 100% ANSI C compliant code, so they can be ported to any other platform with minimal effort.)
If anybody is interested in testing the routines, please send me an e-mail at:
bitmapindx_at_aol.com Received on Fri Dec 06 1996 - 00:00:00 CET