| |
 |
|
The latest consensus on index rebuilding
Oracle Tips by Burleson Consulting
November 2, 2007 Updated January 2009
|
Over the years, there have been heated discussions about
the benefits of scheduled rebuilding of Oracle indexes. In this great
discussion on OTN we see some excellent observations about Oracle indexes
and it highlights the common misconceptions from Oracle experts who have never
managed any real-world databases.
Also see Oracle ACE Andy Kerber's script for detecting when
indexes will benefit from rebuilding.
A script to detect when
indexes require rebuilding.
How rare are "bad" indexes?
You cannot generalize to say that index
rebuilding for performance is common or rare, it depends on many
factors, most importantly the characteristics of the
application.
-
In scientific applications (clinical, laboratory) where large datasets are
added and removed, the need to rebuild indexes is "common".
-
Conversely, in system that never update or delete rows, index rebuilding
rarely improves performance.
-
In systems that do batch DML jobs, index rebuilding "often" improves SQL
performance.
It's well established that the "logical delete" features of
Oracle indexes creates index fragmentation when the base table experiences
updates and deletes, and regularly scheduled index rebuilding can have dramatic
improvements in SQL performance. From the O'Reilly book Oracle SQL Tuning
Pocket Reference, Page 66, we see some good advice from Mark Gurry:
“If index columns are frequently updated, you should also
re-build the indexes, because an update forces a logical delete in the index
followed by an insert of the new, updated entry.
Some sites go as far as rebuilding indexes nightly when they
have a lot of logical delete activity.”
Per bug 3661285, the Oracle 11g segment advisor now detects
sparse indexes and automatically flags them for rebuilding.
Ceteris Parabus, SQL response time for full-scan and
logically-sequential range scans is a function of the density of the index
block. Index blocks that are made "sparse" as a result of delete activity
will cause additional I/O.
Index rebuilding is risk
free (when done properly, moving the indexes between two tablespaces), and on
OTN, DBA Chen Shapira
reports that an index
rebuild significantly reduced index range scan I/O:
"We noticed no change of the
execution plan after rebuild. The plan was exactly the same, but an index
scan that required lots of block reads and was very costly, now takes
significantly less."
Here are some generally-accepted observations about Oracle indexes.
(Note: By their very nature, all observations cannot be validated without
exposing proprietary details. Hence, make sure to check the credentials
and qualifications of all claims.)
- Index rebuilds can improve SQL performance - On indexes
with heavy delete activity, rebuilding has been proven to improve SQL
performance for range queries. Andrew Kerber notes "our observation
was that on 20 million row tables and upward, as little as 5% deletes,
regardless of other factors, would cause sufficient performance degradation
to justify an index rebuild."
- Index rebuilds can be safely automated - For shops that are concerned about space reclamation, batch jobs can be
implemented to reliably rebuild all indexes automatically. I've personally
used crontab jobs for index rebuilding when my client's insisted on scheduled
index rebuilding, and there is a very low risk of causing a production problem,
provided that the rebuild are performed during regularly scheduled maintenance
windows using
Oracle Best Practices.
-
Index rebuilds can release free space - It's well documented that index rebuilding/coalesce will
release lost space (see the Oracle Segment Advisor) and in "rare cases", an
index rebuild may cause a performance boost, but only under certain conditions.
-
Blocksize matters - It's generally accepted that a larger blocksize results in
faster full scans, as noted in this Hewlett
Packard whitepaper titled "Backup
and recovery best practices for an ultra-large Oracle database".
It
concluded that:
"Each of the four EVA virtual disks contained a
VxFS file system. One file system was used to back up each of the VGs. Using
a large block size yielded best performance; however, it is least efficient
for space utilization"
Authoritative Oracle recommendations on index rebuilding
See
Oracle MetaLink note
122008.1 for the officially authorized script to detect
indexes that benefit from rebuilding. This script detects
indexes for rebuilding using these rules and recommends a index
rebuild when:
- deleted entries represent 20% or more of the current entries.
- the index depth is more then 4 levels.
Oracle's
index rebuilding guidelines appear in
Metalink note 77574.1 (dated April 2007) recommends that
indexes be periodically examined to see if they are candidates
for an index rebuild. The note suggests that when indexes
have skewed values, certain parts of the index are accessed more
frequently than others, creating disk contention and poor query
performance.
The note also says that the
index rebuilding criteria has changed since the advent of Oracle9i, where a
blevel > 4 was a good threshold. It also notes that the pct_deleted
column in index_stats from “alter index xxx validate structure” provides
a guideline for index rebuilding candidates:
“Prior to 9i, if the BLEVEL is more than 4, it is
recommended to rebuild the index. . .
The PCT_DELETED column shows what percent of leaf
entries (index entries) have been deleted and remain unfilled.
The more deleted entries exist on an index, the more
unbalanced the index becomes. If the PCT_DELETED is 20% or higher, the index
is candidate for rebuilding.”
Metalink Note:46757.1 titled "Notes
on Choosing an Optimal DB BLOCK SIZE" says that there are some tangible
benefits to using a larger blocksize:
- Using bigger blocks means more data transfer per I/O
call; this is an advantage since the cost of I/O setup dominates the cost of
an I/O. . .
- Using bigger blocks means more space for key storage
in the branch nodes of B*-tree indexes, which reduces index height, which
improves the performance of indexed queries.
- Using a block size that is k times bigger than your current one will save
you (k-1)f/(kb-f) bytes of space for large segments, where f is the size of
a block's fixed block header (61 bytes for tables, 57+4n for n-table
clusters, 113 for indexes).
- When using large block there are less probability of chained and migrated
rows, which in turn reduced the number of reads required to get the
information.
Other evidence on index rebuilding
Ken Adkins, a respected Oracle author, notes that it is often difficult to
pinpoint the exact reason that indexes benefit from a rebuild:
“The DBAs were pulling out their hair until they
noticed that the size of the indexes were too large for the amount of data
in the tables, and remembered this old “myth”, and decided to try rebuilding
the indexes on these tables.
The DELETE with the multiple NOT EXISTS went from
running for 2 hours to delete 30,000 records, to deleting over 100,000
records in minutes. Simply by rebuilding the indexes….”
Robin Schumacher proved that indexes which are rebuilt in a
large blocksize have a "flatter structure" and contain more index keys per
block, facilitating index fast-full scans and index range scan performance.
The benefits of large blocksizes are demonstrated on this
OTN thread where we see a demo showing 3x faster performance using a larger
block size.
According to an
article by Christopher Foot, author of the
OCP Instructors Guide for Oracle DBA Certification, larger block sizes
can help in certain situations:
"A bigger block size means more space for
key storage in the branch nodes of B-tree indexes, which reduces index
height and improves the performance of indexed queries."
In any case, there appears to be evidence
that block size affects the tree structure, which supports the argument that
the database blocksize effects the structure of the tree.
DDL changes an index internal structure
It's also proven that indexes which are subjected to heavy
batch DML will change from their pristine state, especially when the batch jobs
alter the index keys.
Note:
I use the term "non-pristine"
because there is a discussion of the meaning of a "fragmented" or
"unbalanced" index. While Oracle indexes always remain "logically
balanced", the physical sequence of the index block may not match the
physical structure.
Also, "non-pristine" refers to indexes with "white
space" within the index that arises when an index block is not densely
packed because of delete activity.
However, a "non-pristine" index will only facilitate slower
performance when the workload contains queries that perform large range scans or
index fast full scans. Indexes that do single row fetches will never see
degraded response time.
The Oracle 10g segment advisor recommends indexes for
rebuilding from a space reclamation perspective, but the debate rages on about
how to identify those indexes that will benefit from a rebuild.
It should be noted that most OLTP systems will not have the
right conditions to see a performance benefit from index rebuilding, but systems
with frequent batch DML will see the conditions far more frequently.
Hence, DBA's who never experienced these types of systems will have a very
different frame of reference than those DBA's who must be vigilant for
increasing response times.
What types of applications have index rebuilding
opportunities?
Some people claim that system that get a performance
benefit from index rebuilding are "rare", and the term “rarity” is in the eyes
of the beholder.
In some super-dynamic applications (i.e. Pharmaceuticals,
scientific research applications), it’s not rare at all to get a benefit from
index rebuilding. These types of system were one reason that 10g introduced
dynamic sampling. A table can be huge in the AM and tiny on the afternoon, and
skewed values change with each batch load.
While I can only speak from my own experiences, but I've
noted that systems that are characterized by large batch loads, updates and
purges are the most likely to have indexes. These include many ERP
products such as SAP, especially M&I systems (Oracle INV and MFG modules), but
only where raw materials and shipped goods are updated via batch jobs (i.e. EDI
feeds).
Also problematic are scientific applications, especially
LIMS systems (Laboratory Information Management Systems), which constantly load,
modify and purge large sets of experimental data.
For example, consider the popular Clintrial software, a
LIMS system that uses Oracle. In Clintrial, experimental results are initially
stored as small rows. Later, as more data is collected, the VARCHAR columns
expand, causing massive row chaining, and fragmenting the daylights out of
secondary indexes, as key values are changed. In the Clintrial databases I’ve
worked with, most data updates are done via large batch loads from SAS, and as a
typical DSS, the queries tend to do lots of large range scans, aggregations and
index full-scans.
To plan for this expected growth, we set PCTFREE to allow only a few rows per
block, thereby avoiding the chained row issue. The problem, of course, is having
a-priori knowledge of how the application will process it’s rows. In my world,
the scientists start complaining about slow response times, and while I’ve
alleviated the row chaining issue, the indexes still needed to be rebuilt.
In practice, many of these shops drop the indexes before
the batch jobs and rebuild them again afterwards, but some choose to perform
periodic index maintenance.
Finding index candidates
The Oracle segment advisor is one way to identify
candidates for an index rebuild.
The 10g Automatic Maintenance Tasks (AMT) feature will "automatically detect and re-build sub-optimal
indexes.“
This
Kim Floss article shows the Oracle 10g segment advisor
recommending a rebuild of an index:
“The page lists all
the segments (table, index, and so on) that constitute the
object under review.
The default view ("View Segments
Recommended to Shrink") lists any segments that have free space
you can reclaim.”

Oracle index rebuild advisor (Source: Oracle Corporation)
Using Oracle's model, the DBA compares allocated space
to used space to locate "sparse" indexes, and many DBA's have postulated
equations and
formulae for identifying "sub-optimal" indexes.
But when rebuilding for performance, the DBA must also
inspect the workload to find reproducible evidence of "unnecessary I/O" (e.g.
too many blocks read for an index range scan or an index fast-full-scan).
Oracle’s segment advisor has built primitive predictive
models to “suggest” reorganization opportunities, but like any DSS, the rules
are not fully quantifiable. The DSS only supplies the well-structured
quantitative component, but it’s still a semi-structured problem, requiring a
real human expert to make the right decision.
The question remains, can an
automated tool inspect the workload and the indexes and make a rebuild
recommendation? I say “yes”, but it can never be fully-automated without human
input. Let's take a closer look.
An empirical approach to verifying the benefits of an index
rebuild
No two systems are alike, and the method used by the DBA
are as varied as the applications themselves.
I suggest that well-structured rules exist for index
rebuilding, but they are somewhat complex and require empirical evidence about:
1 - Specific conditions within the real-world workload
(too-long IRS and IFFS)
2 – Indexes which are “sparsely populated” as a result
of DML
3 – Knowledge of future index expansion
Some DBA's will use an empirical approach and rebuild
suspect indexes during scheduled downtime, measuring later changes or replying
upon first-hand end-users for feedback. Others are "ordered" to rebuild
regularly, based on superstition and bad advice.
While it's easy to measure the space savings of an index
rebuild, it's more challenging to correlate range-scan and full-scan SQL
before-and-after an index rebuild.
Some DBA's who have a high-DML system may use scripts such
as my
plan9i.sql script to identify range scan and full scan SQL statements.
If significant scans are identified, a representative DML and query-only
workload can be captured and used in a simple test.
-
In test, rebuild the target index
-
Run the query-only workload, measuring total logical
I/O (buffer touches)
-
Re-play the DML load/update/purge/jobs
-
Run the query-only workload, measuring total logical
I/O (buffer touches)
Again, every situation is unique, and this is only one of
dozens of ways that a DBA might approach this problem.
Reader Comments on index rebuilding:
It has been my experience that the percentage of deletes
required to affect performance actually goes down as the table gets larger,
perhaps because the I/O effect is magnified with more data.
At my previous job, our observation was that on 20 million
row tables and upward, as little as 5% deletes, regardless of other factors,
would cause sufficient performance degradation to justify an index rebuild.
Admittedly this was a site with fairly tight budget constraints, so we couldn't
afford absolute top of the line hardware and neither did we have the luxury of
making sure every tablespace was ideally placed for performance, but we did a
pretty good job with what we had. I actually wrote a script that would
calculate the % deletes and generate the index rebuild command.
Also, there is the rebuild online option for indices, which does work but it
will cause noticeable performance degradation if you try and do it during busy
times. I believe it was available in 8i, but it would generate ora-600 errors
when used on busy tables in 8i.
Andrew W. Kerber
Related index rebuilding notes
See my related notes on Oracle index rebuilding:
 |
If you like Oracle tuning, you
might enjoy my book "Oracle
Tuning: The Definitive Reference", with 950 pages of tuning tips and
scripts.
You can buy it direct from the publisher for 30%-off and get instant
access to the code depot of Oracle tuning scripts. |
|