Bit-Mapped Indexing Routines----Beta Testers Wanted
Date: 1996/11/20
Message-ID: <3292cef9.67716501_at_news.thenet.net>#1/1
Bit-Mapped Indexing Routines----Beta Testers Wanted
Bit-mapped indexing is still a relatively new concept in the DB field. It is with this in mind that I set out to produce a library of functions for creating, managing and manipulating bit-maps. Although the routines could be used for manipulating any form of bitmap, they were specifically designed for bit-mapped indexing.
There are a number of factors that make these routines superior to those currently used by Sysbase, Oracle, Redbrick and the other DB vendors that have implemented this BM (bitmap) technology:
- The ability store 100,000 bits in 14K. This is WORST case senario. At best, 100,000 bits can be stored in 34 bytes.
- This is THE major advantage - ALL THE ROUTINES ACT UPON THE COMPRESSED DATA, therefore:
- No de-compression time. ii) Extremely fast data manipulation because of the size of the data set being manipulated.
- This is another major advantage - THE ROUTINES OVERCOME THE PROBLEMS ASSOCIATED WITH HIGH CARDINALITY DATA. This is basically acheived by breaking the data into sections. If there is no data in a particular section, a BM for that section is
not stored. E.g. A bit-map is divided into 10 sections, each representing 10,000 bit positions, i.e. a 100,000 bit BM. If position 1, 10,000, and 100,000 is set, only 3 - 6 bytes at most would be required to store the BM positions. (This excludes section header information, which is about 8 bytes for every section, i.e. 24 bytes in the example above)
The BM functions include functions for:
- ANDing
- ORing
- XORing
- NOTing
- ORXing (deleting)
- XNORing
- NANDing
- NORIng
- Finding
- Search Next
- Search Previous
- Adding
- Deleting
- Appending..... and more.
Various logical operations (1 - 8) can be performed on 2 source data sets of 1,000,000 bits each, to produce a result data set in 0.2 seconds for any single operation. This is WORST case senario. Best case... I am unable to time them, as the timing function returns 0.0000. My test have been conducted on a Compaq 90Mz Pentium with 32MB memory.
A beta version of these routines will be available by 1 January 1997. The routines will comprise of a 16-bit Windows library, a 32-bit Windows library, a DOS library, and 16 & 32 Bit Windows DLL's and will be shipped complete with on-line documentation and help.
If anybody is interested in testing the routines, please send me an e-mail at:
bitmapindx_at_aol.com
The above e-mail address will be activated on 11/18/96 at 6:00PM EST. Received on Wed Nov 20 1996 - 00:00:00 CET