**Answer:** You
are correct that the maximum cardinality for bitmap indexes is a function of
the number of distinct values and the size of the key, but far and away it
is the number of distinct values that is the largest contributing factor to
the maximum number of distinct values you can have in a bitmap index.

As the number if distinct values increases, the size of
the bitmap increases exponentially, such that an index with 100 values may
perform thousands of times faster than a bitmap index on 1,000 distinct
column values. The answer, as with all Oracle tuning questions is "It
depends".

Also, remember that bitmap indexes are only suitable
for static tables and materialized views which are updated at nigh and
rebuilt after batch row loading. If your tables are not read-only
during query time, DO NOT consider using bitmap indexes!

So, which is the right maximum
values for your bitmap index? Who knows? You will need to run
performance benchmarks on your unique database to see!

Benchmarks have shown these trends in SQL performance and the number of
distinct bitmap index values (as a percentage of total table rows), and the
clustering factor for the index column. All else being equal, cardinality is
one of the most important factors in deciding to use a bitmap index, but
there are rare cases where a bitmap on a column with 10,000 key values might
be appropriate.

So, what is the maximum threshold for unique values in a bitmap index?
The answer (as with all Oracle tuning questions) is "It depends". To
understand how a bitmap works, remember that a bitmap is a two dimensional
matrix with two axis. The number of rows in the table is one axis, and
the number of distinct key values is the other axis.

*Conceptualize a bitmap index as a two-dimensional array*

Hence, a bitmap index on a million row table with ten distinct values in
the index column will have ten million cells in the bitmap.
Conversely, a bitmap on a million rows table with an index column with 1,000
distinct values will be much larger, with a billion cells.

At
this point, let’s note that Oracle performs bitmap index compression, and
that is why the clustering_factor is important. If a bitmap index
column has a “good” clustering factor (e.g. close to the number of blocks in
the table) then the index values are adjacent, and Oracle will be able to
compress the index far better than a bitmap on an un-clustered column.

# Bitmap compression, row clustering, and column
cardinality

It should be clear that it’s not exclusively the number of distinct
values in the index that governs our choice of using a bitmap, it’s also a
function of the percentage the distinct values as a percentage of rows in
the table and the clustering of the column values (as per the
clustering_factor column in the dba_indexes view). Hence, a decision
for a bitmap is a function of these criteria:

**The ratio of rows to distinct values: ** A more
accurate estimate of the suitability of a bitmap index is the ratio of
distinct rows to the number of total rows in a table.

**Clustering factor: ** The clustering of the
index keys to the physical data blocks has an influence on bitmap
indexes.

In a table where the bitmap columns appear in the same physical order as
the rows, related column values will be adjacent on each data block.
The clustering_factor will be close to the value of the blocks column in *
dba_data_files*, and Oracle will be able to compress the bitmap index.

When a bitmap index sees repeating column values, the matching rows can
be omitted from the bitmap. As a bitmap is being created or updated,
Oracle looks to the previous index entry. If it is the same, Oracle
replaces the bitmap entry with a “same as prior” flag, and continues until a
new key value is detected.

At an extreme case, consider a
million rows table where the column has 100,000 distinct values and
clustering_factor ~= blocks.

Because all adjacent columns values are
grouped together, the bitmap axis goes from 1,000,000 rows down to 100,000,
making for a highly compressed bitmap.

In the compressed example
below, all “East:” rows appear adjacent in rows 1-5, all “North” values
resides on rows 6-13, all “South values” are in rows 14-15, and all “West”
rows reside on rows 16-18:

Again, in most cases, there will not be a lot of adjacent index values,
so it quite rare to see extensive compression.