Oracle FAQ Your Portal to the Oracle Knowledge Grid
HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US
 

Home -> Community -> Usenet -> c.d.o.server -> Re: Bit Mapped Indexes

Re: Bit Mapped Indexes

From: Campbell White <Preveal2_at_vni.net>
Date: 1996/11/21
Message-ID: <329553F1.B29@vni.net>

Chuck Hamilton wrote:
>
> vikas_at_insight.att.com (Vikas Agnihotri) wrote:
>
> >CREATE BITMAP INDEX ...... The rest is the same. Note that you
> >have to put the
> > COMPATIBLE=7.3.2
> >parameter in your init.ora for this to work since this is a new feature
> >introduced since v7.3.2
>
> Is this documented anywhere?
>
> I've got 7.3.2.1 and it's not in the online documentation, nor the
> sqlplus help command.
>
> I'm not even sure what a bitmap index is.
> --
> Chuck Hamilton
> chuckh_at_dvol.com
>
> This message delivered by electronic sled dogs. WOOF!

Chuck,

Here's some info on bitmapped indexes from "RevealNet: Oracle Administration" http://www.revealnet.com Hope it helps.

Cam White
RevealNet, Inc.
(202)234-8557

Overview of Bitmapped Indexes

Bitmapped indexing is an indexing method added with Oracle V7.3 that can provide both performance benefits and storage savings. Bitmapped indexes are particularly useful for data warehousing environments because data is usually updated less frequently and ad hoc queries are more common.

Oracle provides four indexing methods: B-tree indexes (currently the most common), B-tree cluster indexes, hash cluster indexes, and bitmapped indexes. While B-tree indexes are the most effective method for “high-cardinality” data (data where there are many possible values, such as customer_name or phone_number), bit-mapped indexes are best for “low-cardinality” data (such as a column to indicate a person’s gender, which contains only two possible values: MALE and FEMALE).

Building a bitmapped index

A portion of a sample company’s customer data is shown below:

customer#     marital_status     region     gender     income_level
101           single             east       male       bracket_1
102           married            central    female     bracket_4
103           married            west       female     bracket_2
104           divorced           west       male       bracket_4
105           single             central    female     bracket_2
106           married            central    female     bracket_3

Since MARITAL_STATUS, REGION, GENDER, and INCOME_LEVEL are all “low-cardinality” columns (there are only three possible marital status’s, three possible regions, two possible genders, and four possible income levels), it’s appropriate to create bitmapped indexes on these columns. A bitmapped index should not be created on CUSTOMER#, because this is a “high-cardinality” column. Instead, an ordinary, unique B-tree index should be created on this column in order to provide the most efficient representation and retrieval.

The bitmapped index for the REGION column consists of three separate bit-maps, one for each region. These bit-maps are shown below:

region='east' region='central' region='west'

      1               0                  0
      0               1                  0
      0               0                  1
      0               0                  1
      0               1                  0
      0               1                  0

Each entry (or “bit”) in the bitmap corresponds to a single row of the CUSTOMER table. The value of each bit depends upon the values of the corresponding row in the table. For example, the bitmap REGION=‘east’ contains a one as its first bit; this is due to the fact that the REGION is ‘east’ in the first row of the CUSTOMER table. The bitmap REGION=‘east’ has a zero for its other bits because none of the other rows of the CUSTOMER table contain ‘east’ as their value for REGION.

Using a bitmapped index

Suppose you want to look at the demographic trends of the company’s customers. A sample question might be: “How many of our married customers live in the central or west regions?” which could be coded in an SQL query as:

select count(*) from customer
 where marital_status = 'married'
   and region in ('central','west');

Bitmapped indexes can be used to process this query. One bitmap is used to evaluate each predicate, and the bitmaps are logically combined to evaluate the AND’s and the OR’s. For example:

By counting the number of 1’s in the resulting bitmap above, the result of this query can be easily computed. If you further wanted to know the specific customers that satisfied the criteria, instead of simply the number of customers, the resulting bitmap would be used to access the table.

When to use bitmapped indexes

There are three things to consider when choosing an index method:

The advantage for using bitmapped indexes are their performance impact for certain queries and their relatively small storage requirements. Note however that bitmapped indexes are not applicable to every query and bitmapped indexes, like B-tree indexes, can impact the performance of insert, update, and delete statements.

Performance Considerations

Bitmapped indexes can provide very impressive performance improvements. Execution times of certain queries may improve by several orders of magnitude. The queries that benefit the most from bitmapped indexes have the following characteristics:

An advantage of bitmapped indexes is that multiple bitmapped indexes can be used to evaluate the conditions on a single table. Thus, bitmapped indexes are very useful for complex ad hoc queries that contain lengthy WHERE-clauses.

Storage considerations

Bitmapped indexes incur a small storage cost and have a significant storage savings over B-tree indexes. A bitmapped index can require 100 times less space than a B-tree index for a low-cardinality column.

Important Note:

A strict comparison of the relative sizes of B-tree and bitmapped indexes is not an accurate measure for selecting bitmapped over B-tree indexes. Because of the performance characteristics of bitmapped indexes and B-tree indexes, you should continue to maintain B-tree indexes on your high-cardinality data. Bitmapped indexes should be considered primarily for your low-cardinality data.

The storage savings are so large because bitmapped indexes replace multiple-column B-tree indexes. When using only B-tree indexes, you must anticipate the columns that will commonly be accessed together in a single query and then create a multi-column B-tree index on those columns. Not only does this B-tree index require a large amount of space, but it will also be ordered; that is, a B-tree index on (MARITAL_STATUS, REGION, GENDER) is useless for a query that only accesses REGION and GENDER. To completely index the database, you are forced to create indexes on the other permutations of these columns. In addition to an index on (MARITAL_STATUS, REGION, GENDER), there is a need for indexes on (REGION, GENDER, MARITAL_STATUS), (GENDER, MARITAL_STATUS, REGION), etc. For the simple case of three low-cardinality columns, there are six possible concatenated B-tree indexes. What this means is that you are forced to decide between disk space and performance when determining which multiple-column B-tree indexes to create.

With bitmapped indexes, the problems associated with multiple-column B-tree indexes is solved because bitmapped indexes can be efficiently combined during query execution. Three small single-column bitmapped indexes are a sufficient functional replacement for six three-column B-tree indexes. Note that while the bitmapped indexes may not be quite as efficient during execution as the appropriate concatenated B-tree indexes, the space savings provided by bitmapped indexes can often more than justify their utilization.

The net storage savings will depend upon a database’s current usage of B-tree indexes:

Bitmapped indexes are best for read-only or light OLTP environments. Because there is no effective method for locking a single bit, row-level locking is not available for bitmapped indexes. Instead, locking for bitmapped indexes is effectively at the block level which can impact heavy OLTP environments. Note also that like other types of indexes, updating bitmapped indexes is a costly operation.

Though bitmapped indexes are not appropriate for databases with a heavy load of insert, update, and delete operations, their effectiveness in a data warehousing environment is not diminished. In such environments, data is usually maintained via bulk inserts and updates. For these bulk operations, rebuilding or refreshing the bitmapped indexes is an efficient operation. The storage savings and performance gains provided by bitmapped indexes can provide tremendous benefits to data warehouse users.

Performance/storage characteristics

In preliminary testing of bitmapped indexes, certain queries ran up to 100 times faster. The bitmapped indexes on low cardinality columns were also about ten times smaller than B-tree indexes on the same columns. In these tests, the queries containing multiple predicates on low-cardinality data experienced the most significant speedups. Queries that did not conform to this characteristic were not assisted by bitmapped indexes.

Example scenarios

The following sample queries on the CUSTOMERS table demonstrate the variety of query-processing techniques that are necessary for optimal performance.

   Example #1: Single predicate on a low-cardinality attribute.

select * from customers
 where gender = 'male';

Best approach: parallel table scan

This query will return approximately 50% of the data. Since we will be accessing such a large number of rows, it is more efficient to scan the entire table rather than use either bitmapped indexes or B-tree indexes. To minimize elapsed time, the Server should execute this scan in parallel.

   Example #2: Single predicate on a high-cardinality attribute.

select * from customers
 where customer# = 101;

Best approach: conventional unique index

This query will retrieve at most one record from the employee table. A B-tree index or hash cluster index is always appropriate for retrieving a small number of records based upon criteria on the indexed columns.

   Example #3: Multiple predicates on low-cardinality attributes

select * from customers
 where gender = 'male'
   and region in ('central','west')
   and marital_status in ('married', 'divorced');

Best approach: bit-mapped index

Though each individual predicate specifies a large number of rows, the combination of all three predicates will return a relatively small number of rows. In this scenario, bitmapped indexes provide substantial performance benefits.

   Example #4: Multiple predicates on both high-cardinality and low-cardinality attributes.

select * from customers
 where gender = 'male'
   and customer# < 100;

Best approach: B-tree index on CUSTOMER#

This query returns a small number of rows because of the highly selective predicate on CUSTOMER#. It is more efficient to use a B-tree index on CUSTOMER# than to use a bitmapped index on GENDER.

In each of the previous examples, the Oracle cost-based optimizer transparently determines the most efficient query-processing technique. Received on Thu Nov 21 1996 - 00:00:00 CST

Original text of this message

HOME | ASK QUESTION | ADD INFO | SEARCH | E-MAIL US