Oracle index rebuilding:
Indexes to rebuild script
Oracle Tips by Burleson
Executive Summary on index rebuilding
While it may be rare to rebuild an Oracle index
for performance reasons, there are some databases that will get
a measurable performance boost from rebuilding indexes.
These workloads have these characteristics:
fragmentation: The SQL workload has lots of
table DML causing lots of deleted leaf blocks.
High index scan
access plans: The SQL workload is rich with
index scans (index fast-full scans and index range scans)
The rules for identification of candidates for index coalescing/rebuilding
depend on your specific index state. See MOSC notes
989186.1, 122008.1, 989093.1 for Oracle's suggestions on when to
coalesce/rebuild indexes. Also see my
updated notes on index coalesce/rebuilding and note that this DBA has demonstrated the
characteristics indexes that benefits from
scheduled oracle index rebuilding.
3661285, the Oracle 11g segment advisor now detects sparse
indexes and automatically flags them for rebuilding.
How rare are "bad" indexes?
You cannot generalize to say that index
rebuilding for performance is rare, or even medium rare, it
depends on many factors, most importantly the characteristics of
In scientific applications (clinical, laboratory) where
large datasets are added and removed, the need to rebuild
indexes is "common".
Conversely, in system that never update or delete rows,
index rebuilding rarely improves performance.
In systems that do batch DML jobs, index rebuilding "often"
improves SQL performance.
Oracle MOSC note
122008.1 for the officially authorized script to detect
indexes that benefit from rebuilding. This script detects
indexes for rebuilding using these rules:
Rebuild the index when
these conditions are true:
- deleted entries represent 20% or more of the current entries.
- the index depth is more then 4 levels.
I also have more
sophisticated index rebuilding scripts in my book "Oracle
Tuning: The Definitive Reference", with a code depot of downloadable
Script to rebuild indexes
It is very difficult to write a script that
will identify indexes that will benefit from rebuilding because
it depends on how the indexes are used. For example,
indexes that are always accessed vis an index unique scan" will
never need rebuilding, because the "dead space" does not
interfere with the index access.
Only indexes that have a high number of
deleted leaf blocks and are accessed in these ways will benefit
index fast full scan
index full scan
index range scan
Getting statistically valid: proof from a volatile production system would be a
phenomenal challenge. In a large production system, it would be a massive
effort to trace LIO from specific queries to specific indexes before and after
When rebuilding fragmented (indexes with
deleted leaf nodes) we will expect the space to be reduced:
col segment_name format a40
segment_type = 'INDEX'
owner = 'xxx';
Normally, these indexes attempt to manage
themselves internally to ensure fast access to the data rows.
However, excessive activity within a table can cause Oracle
indexes to dynamically reconfigure themselves. This
reconfiguration involves three activities:
Index spawning At some point,
the Oracle indexes will reach the maximum capacity for the
level and the Oracle index will spawn, creating a deeper
Index node deletion As you may know, Oracle
index nodes are not physically deleted when table rows are
deleted, nor are the entries removed from the index. Rather,
Oracle 'logically' deletes the index entry and leaves dead
nodes in the index tree.
Indexes require rebuilding when deleted
leaf nodes appear or when the index has spawned into too many
levels of depth. While it is tempting to write a script that
rebuilds every index in the schema, bear in mind that Oracle
contains many thousands of indexes, and a complete rebuild can
be very time consuming. Hence, we need to develop a method to
identify those indexes that will get improved performance with a
rebuild. Let's look at a method for accomplishing this task.
One vexing issue with Oracle indexes is
that the information for an index rebuild must be gathered from
ANALYZE INDEX index_name COMPUTE STATISTICS
ANALYZE INDEX index_name VALIDATE STRUCTURE
Once we gather information from these sources, we
can generate a report with everything we need to know about the
index internal structure:
# rep dist.
leaf rows Height per
----- -------- ------
From this report, we see several important
The number of deleted leaf nodes The
term 'deleted leaf node' refers to the number of index nodes
that have been logically deleted as a result of row
delete operations. Remember that Oracle leaves dead
index nodes in the index when rows are deleted. This is done
to speed up SQL deletes, since Oracle does not have to
rebalance the index tree when rows are deleted.
Index height The height of the
index refers to the number of levels that are spawned by the
index as a result in row inserts. When a large number of
rows are added to a table, Oracle may spawn additional
levels of an index to accommodate the new rows. Hence, an
Oracle index may have four levels, but only in those areas
of the index tree where the massive inserts have occurred.
Oracle indexes can support many millions of entries in three
levels, and any Oracle index that has four or more levels
would benefit from rebuilding.
Gets per index access The number
of gets per access refers to the amount of logical I/O
that is required to fetch a row with the index. As you may
know, a logical get is not necessarily a physical I/O since
much of the index may reside in the Oracle buffer cache.
The script to make the report is complex
because both the validate structure and compute
statistics commands must be executed. The following code
snippets are used to generate the report.
This script for experts only.
It is extremely resource intensive. DO NOT run it in an
active production environment. Only run this script during
published this Korn shell script to identify highly fragmented
indexes, targeting indexes that are 4 levels or more, or indexes
with a high number of deleted leaf blocks:
$ORACLE_HOME/sqlplus -s User1/pwd << EOF
set echo off
set termout off
set trimspool on
set heading off
set lines 300
set pages 0
set serverout on
DBMS_STATS.UNLOCK_TABLE_STATS ('''|| user ||''','''|| table_name
||''');' from user_tables order by table_name asc;
in ( select index_name from user_indexes where index_type =
dbms_output.put_line('ANALYZE INDEX ' || x.index_name || '
dbms_output.put_line('ANALYZE INDEX ' || x.index_name || '
dbms_output.put_line('select name, height, lf_rows, del_lf_rows,
round((del_lf_rows/lf_rows)*100,2) as ratio from index_stats
where (lf_rows > 100 and del_lf_rows
and (height > 3 or ((del_lf_rows/lf_rows)*100)
select 'exec DBMS_STATS.LOCK_TABLE_STATS
('''|| user ||''','''|| table_name ||''');' from user_tables
order by table_name asc;
column name format a40
PROMPT NAME | HEIGHT | LF_ROWS |
DEL_LF_ROWS | RATIO (del_lf_rows/lf_rows) %
There are many
compelling reasons to manage indexes within Oracle. In an OLTP
system, index space is often greater than the space allocated for
tables, and fast row data access is critical for sub-second
response time. As we may know, Oracle offers a wealth of index
is the standard tree index that Oracle has been using since the earliest
Bitmap indexes are used where an index column has a relatively small number of
distinct values (low cardinality). These are super-fast for read-only
databases, but are not suitable for systems with frequent updates
This is an index structure whereby data columns from other tables appear in a
multi-column index of a junction table. This is the only create index syntax
to employ a SQL-like from clause and where clause.
parts.part_type, supplier.state )
addition to these index structures we also see interesting use of indexes at
runtime. Here is a sample of index-based access plans:
This row access method scans an index to collect a series of ROWID?s.
fast-full-scans: This is a: multi-block read access where the index blocks are accessed via
a: db file scattered read to load index blocks into the buffers. Please note
that this method does not read the index nodes.
joins: The star
index has changed in structure several times, originally being a
single-concatenated index and then changing to a bitmap index implementation.
STAR indexes are super-fast when joining large read-only data warehouse
This is an example of the use of the index_combine hint. This
execution plan combines bitmap indexes to quickly resolve a low-cardinality
select /*+ index_combine(emp, dept_bit, job_bit) */
job = 'SALESMAN'
deptno = 30
Here is the execution plan that shows the index
OPTIONS OBJECT_NAME POSITION
------------------------------ ---------------------------- ----------
BY INDEX ROWID EMP 1
TO ROWIDS 1
SINGLE VALUE DEPT_BIT 1
SINGLE VALUE JOB_BIT 2
the internals of Oracle indexing are very complex and open to debate, there are
some things that you can do to explore the internal structures of your indexes.
Let?s take a closer look at the method that I use to reveal index structures.
Oracle index rebuilding scripts
Ken Adkins, a respected Oracle author, notes that it is
often difficult to pinpoint the exact reason that indexes
benefit from a rebuild:
"The DBAs were pulling out their hair
until they noticed that the size of the indexes were too
large for the amount of data in the tables, and remembered
this old: myth?, and decided to try rebuilding the indexes
on these tables.
The DELETE with the multiple NOT EXISTS went from running for 2 hours to delete 30,000 records, to deleting over 100,000 records in minutes. Simply by rebuilding the indexes?."
An Oracle ACE notes this script to rebuild his
indexes. "I eventually wrote a simple query that generates
a list of candidates for index rebuilds, and the commands
necessary to rebuild the indexes once the tables reached a point
where it was necessary. The query reads a table we built called TABLE_MODIFICATIONS that we loaded each night from
DBA_TAB_MODIFICATIONS before we ran statistics. Monitoring must
be turned on to use the DBA_TAB_MODIFICATIONS table."
from analyzedb.table_modifications t,
where t.timestamp >= to_date('&from_date','dd-mon-yyyy') and
t.table_owner = a.owner and t.table_owner not in ('SYS','SYSTEM')
group by t.table_owner, t.table_name, a.num_rows
order by num_rows desc, t.table_owner, t.table_name;
The Debate Continues
battle is raging between the: academics? who do not believe that indexes should
be rebuilt without expensive studies, and the: pragmatists? who rebuild indexes
on a schedule because their end-users report faster response times.
none of the world's Oracle experts has determined a reliable rule for index
rebuilding, and no expert has proven that index rebuilds "rarely" help.
approach - Many
Oracle experts claim that indexes rarely benefit from rebuilding, yet none has
ever provided empirical evidence that this is the case, or what logical
I/O conditions arise in those: rare? cases where indexes benefit from
Pragmatic approach: Many IT managers force their Oracle DBAs to periodically rebuild indexes
because the end-user community reports faster response times following the rebuild. The pragmatists are not interested in: proving? anything, they are
just happy that the end-users are happy. Even if index rebuilding were to be
proven as a useless activity, the Placebo Effect on the end-users is enough to
justify the task.
clear that all 70 of the index metrics interact together in a predictable way.
Some scientist should be able to take this data and reverse-engineer the
internal rules for index rebuilding, if any actually exist. For now, the most
any Oracle professional can do is to explore their indexes and learn how the
software manages b-tree structures.
When can we "prove" a benefit from an index
rebuild? Here, Robin Schumacher
proves that an index that is rebuilt in a larger tablespace will
contain more index entries be block, and have a flatter structure:
"As you can see, the amount of logical reads has
been reduced in half simply by using the new 16K tablespace and accompanying
16K data cache."
In an OracleWorld 2003 presentation titled: Oracle Database 10g: The Self-Managing
Database by Sushil Kumar of Oracle Corporation, Kumar
states that the new Automatic Maintenance Tasks (AMT) Oracle10g
feature will "automatically detect and re-build sub-optimal
indexes. However, the 10g segment advisor only recommends index
rebuilding from a space reclamation perspective, not for performance.
Kim Floss article shows the Oracle 10g segment advisor recommending a
rebuild of an index:
'The page lists all the segments
(table, index, and so on) that constitute the object under review. The
default view ("View Segments Recommended to Shrink") lists any segments that
have free space you can reclaim.'
Oracle index rebuild advisor (Source: Oracle Corporation)
Inside Oracle b-tree
are many myths and legends surrounding the use of Oracle indexes, especially the
ongoing passionate debate about rebuilding of indexes for improving
performance. Some experts claim that periodic rebuilding of Oracle b-tree
indexes greatly improves space usage and access speed, while other experts
maintain that Oracle indexes should: rarely be rebuilt. Interestingly, Oracle
reports that the new Oracle10g Automatic Maintenance Tasks (AMT) will
automatically detect indexes that are in need of re-building. Here are the
and cons of this highly emotional issue:
Arguments for Index Rebuilding: Many Oracle shops schedule periodic index rebuilding, and report measurable
speed improvements after they rebuild their Oracle b-tree indexes. In an
OracleWorld 2003 presentation titled
Oracle Database 10 g: The Self-Managing Database by Sushil Kumar of Oracle
Corporation, Kumar states that the Automatic Maintenance Tasks (AMT) Oracle10g
feature will automatically detect and rebuild sub-optimal indexes. : AWR
provides the Oracle Database 10g a very good 'knowledge' of how it is being
used. By analyzing the information stored in AWR, the database can identify
the need of performing routine maintenance tasks, such as optimizer statistics
refresh, rebuilding indexes, etc. The Automated Maintenance Tasks
infrastructure enables the Oracle Database to automatically perform those
Arguments against Index Rebuilding: Some Oracle in-house experts maintain that Oracle indexes are
super-efficient at space re-use and access speed and that a b-tree index
rarely needs rebuilding. They claim that a reduction in Logical I/O (LIO)
should be measurable, and if there were any benefit to index rebuilding,
someone would have come up with: provable? rules.
is right? I suspect that they both are correct. There is a huge body of
evidence that index rebuilding makes the end-users report faster response time,
and I have to wonder if this is only a Placebo Effect, with no scientific
basis. Some experts suspect a Placebo Effect may be at work here, and the
end-users, knowing that they have new index trees, report a performance gain
when none exists.
extreme flexibility, Oracle b-tree indexes are quite complex, and to really
gather scientific evidence we must examine all of the index metrics. Getting a
meaningful measure of the benefit of an index rebuild in a production
environment is very difficult because the system is under heavy load and usage
patterns change constantly. Plus, many IT managers require periodic index
re-building because it is a low-risk job and it curries favor from the end-user
Where are the index
Oracle professionals are aware of the dba_indexes view, which is populated
with index statistics when indexes are analyzed. The dba_indexes view
contains a great deal of important information for the SQL optimizer, but there
is still more to see. Oracle provides an analyze index xxx validate
structure command that provides additional statistics into a temporary
tables called index_stats, which, sadly, is overlaid after each command.
the full picture, we must devise a table structure that will collect data from
both sources. Here is a method that will do the job:
temporary table to hold data from dba_indexes and index_stats
quality of dbms_stats analysis
temporary table from dba_indexes
indexes and send output into temp table
the index_stats information is very time consuming and expensive and will
introduce serious locking issues on production databases. It is strongly
recommended that you perform these checks during scheduled downtime, or on a
representative test database.
10g note: Tested against Oracle10g
10.2.0.1.0 on Windows XP SP2, and found
Need to be defined in the table index_details, and this crashes the subsequent
table population and procedure.
start by creating a table to hold our index data. I call this table
drop table index_details;
Create table index_details
-- ********* The following is from dba_indexes ******************
-- ********* The following is from index_stats ******************
storage (initial 5k next 5k maxextents unlimited);
(Note: the index_stats table has a column named PCT_USED even though Oracle
indexes do not allow changes to this value.)
we have a table that will hold all of the index details, the next step is to
populate the table with data from freshly-analyzed indexes. Remember, you
should always run dbms_stats to get current index statistics. Here is
insert into index_details
select * from dba_indexes
where owner not like 'SYS%'
we have gathered the index details from dba_indexes, we must loop through
iterations of the analyze index xxx validate structure command to populate our
table with other statistics. Here is the script that I use to get all index
/* INDEX.STATS contains 1 row from last execution */
/* of ANALYZE INDEX ... VALIDATE STRUCTURE */
/* We need to loop through validates for each */
/* index and populate the table. */
cursor idx_cursor is
select owner_name, index_name from index_details;
for c_row in idx_cursor loop
v_dynam := 'analyze index '||c_row.owner_name||'.'||c_row.index_name||
' validate structure';
execute immediate v_dynam;
update index_details set
(HEIGHT, BLOCKS, NAMEx, PARTITION_NAME, LF_ROWS, LF_BLKS,
LF_ROWS_LEN, LF_BLK_LEN, BR_ROWS, BR_BLKS, BR_ROWS_LEN,
BR_BLK_LEN, DEL_LF_ROWS, DEL_LF_ROWS_LEN, DISTINCT_KEYSx,
MOST_REPEATED_KEY, BTREE_SPACE, USED_SPACE, PCT_USED,
ROWS_PER_KEY, BLKS_GETS_PER_ACCESS, PRE_ROWS, PRE_ROWS_LEN,
= (select * from index_stats)
where index_details.owner_name = c_row.owner_name
and index_details.index_name = c_row.index_name;
if mod(idx_cursor%rowcount,50)=0 then
a.owner_name = b.table_owner
a.index_name = b.index_name
running the script from listing 3, we should now have complete index details for
any index that we desire. However with 70 different metrics for each index, it
can be quite confusing about which columns are the most important. To make
queries easy, I create a view that only displays the columns that I find the
most interesting. Here is my view.
drop view indx_stats;
Create view idx_stats
most of these column descriptions are self-evident, there are some that are
CLUSTERING_FACTOR: This is one of the most important index statistics because it indicates how
well sequenced the index columns are to the table rows. If
clustering_factor is low (about the same as the number of
dba_segments.blocks in the table segment) then the index key is in the
same order as the table rows and index range scans will be very efficient, with
minimal disk I/O. As clustering_factor increases (up to
dba_tables.num_rows), the index key is increasingly out of sequence with
the table rows. Oracle?s cost-based SQL optimizer relies heavily upon
clustering_factor to decide whether to use the index to access the
- As an
index accepts new rows, the index blocks split. Once the index nodes have
split to a predetermined maximum level the index will: spawn? into a new
BLOCKS: This is the
number of blocks consumed by the index. This is dependent on the
db_block_size. In Oracle9i and beyond, many DBAs create b-tree indexes in
very large blocksizes (db_32k_cache_size) because the index will spawn less.
Robin Schumacher has noted in his book
Oracle Performance Troubleshooting notes: As
you can see, the amount of logical reads has been reduced in half simply by
using the new 16K tablespace and accompanying 16K data cache. Clearly, the
benefits of properly using the new data caches and multi-block tablespace
feature of Oracle9i and above are worth your investigation and trials in your
metric is very misleading because it looks identical to the dba_tables pct_used column, but has a different meaning. Normally, the
threshold is the freelist unlink threshold, while in index_stats
pct_used is the percentage of space allocated in the b-tree that is being
Is there a criterion
for index rebuilding?
believe the anecdotal reports that index rebuilding improved end-user reported
performance, how can we analyze this data and see what the criteria (if any)
might be for an index rebuild?
For example, here are the criteria used by
a fellow Oracle DBA who swears that rebuilding indexes with these criteria has a
positive effect on his system performance:
Only consider when space used is more than 1 block ***
btree_space > 8192
-- *** The number of index levels is > 3 ***
(height > 3
-- *** The % being used is < 75%
or pct_used < 75
-- *** Deleted > 20% of total
or (del_lf_rows/(decode(lf_rows,0,1,lf_rows)) *100) > 20)
reality I suspect that the rules are far more complicated than this simple
formula. To see the commonality between indexes of similar nature you can use
the data from your new index_details table to write summary queries
(Listing 5). Here we see the average number of index blocks, leaf rows and leaf
blocks for indexes of different heights.
gives us a high-level idea of Oracle threshold for spawning an index onto new
levels. We can take this same approach and attempt to answer the following
1 - At what point does an index spawn to another level
(height)? It should be a function of blocksize, key length and the number of
2 - The number of deleted leaf nodes may not be enough to
trigger an index rebuild. This is because if clustering_factor is low (dba_indexes.clustering_factor
~= dba_segments.blocks), then the rows are added in order, and the index
is likely to reuse the deleted leaf nodes. On the other hand, if
clustering_factor is high (dba_indexes.clustering_factor ~=
dba_tables.num_rows), and the majority of queries use the index with
fast-full scans or index range scans, then a rebuild of the underlying table (to
rows) may be beneficial.
To illustrate, assume I have an index on the last_name column of a
1,000,000 row table and the clustering_factor is close to the number of
blocks, indicating that the rows are in the same sequence as the index.
In this case, a bulk delete of all people whose
last_name begins with the letter: K? would leave a dense cluster of deleted
leaf nodes on adjacent data blocks within the index tablespace. This large
section of space is more likely to be reused than many tiny chunks.
also use the data from our detail table to compute our own metrics. In the
example query below, we create a meta-rule for indexes:
Full Block Space
- This is the index key space (number of table rows * index key length) as a
function of the blocksize and free index space.
of Free Blocks -
This is the estimated number of free blocks within the index.
these metrics, we can analyze the system and produce some very interesting
reports of index internals:
col c1 heading 'Average|Height' format 99
col c2 heading 'Average|Blocks' format 999,999
col c3 heading 'Average|Leaf|Rows' format 9,999,999,999
col c4 heading 'Average|Leaf Row|Length' format 999,999,999
col c5 heading 'Average|Leaf Blocks' format 9,999,999
col c6 heading 'Average|Leaf Block|Length' format 9,999,999
Average Average Average
Average Average Leaf Leaf Row Average Leaf Block
Height Blocks Rows Length Leaf Blocks Length
------- -------- -------------- ------------ ----------- ----------
1 236 12 234 1 7,996
2 317 33,804 691,504 106 7,915
3 8,207 1,706,685 41,498,749 7,901 7,583
4 114,613 12,506,040 538,468,239 113,628 7,988
see, we can compute and spin this data in an almost infinite variety of ways.
Reader Comments on index rebuilding:
It has been my experience that the percentage of deletes
required to affect performance actually goes down as the table gets larger,
perhaps because the I/O effect is magnified with more data.
At my previous job, our observation was that on 20 million
row tables and upward as little as 5% deletes, regardless of other factors,
would cause sufficient performance degradation to justify an index rebuild.
Admittedly this was a site with fairly tight budget constraints, so we couldn't
afford absolute, top of the line hardware. We also didn't have the luxury of
making sure every tablespace was ideally placed for performance, but we did a
pretty good job with what we had. I actually wrote a script that would
calculate the % deletes and generate the index rebuild command.
Also, there is the rebuild online option for indices, which does work but it
will cause noticeable performance degradation if you try and do it during busy
times. I believe it was available in 8i, but it would generate ora-600 errors
when used on busy tables in 8i.
See my related notes on index rebuilding:
10g usage note:
Tested against Oracle10g 10.2.0.1.0 on
Windows XP SP2, and found
not defined in the table index_details, and this crashes the subsequent table
population and procedure.
If you like Oracle tuning, see the book "Oracle
Tuning: The Definitive Reference", with 950 pages of tuning tips and
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts.