Important 2007 update: The rules for identifying candidates for index rebuilding are changing. Please see my updated notes on index rebuilding.
Reliably predicting when to reorganize your tables and indexes is an important decision, and one that you can be the first to automate. This has become the "Holy Grail" for many Oracle professionals, with numerous articles and debates about predicting the benefit of a reorg. Some purists say that it is better to design tables and indexes so that they never fragment in the first place, but the reality for the working Oracle DBA is that high staff attrition, vendor packages, and the inability to control the developers make this a very important real-world problem.
It is clear that Oracle will soon leverage its time-series performance repository, the AWR, to create predictive models:
In an OracleWorld 2003 presentation titled "Oracle Database 10g: The Self-Managing Database," Sushil Kumar of Oracle Corporation states that the new Automatic Maintenance Tasks (AMT) Oracle10g feature will "automatically detect and re-build sub-optimal" indexes.
In a paper titled "Metric Baselines: detecting and explaining performance events in EM 10gR2" (presented at the Training Days), John Beresniewicz of Oracle Corporation notes that the use of "baselines to capture and adapt thresholds to expected time-dependent workload variations" is a core feature of the next release of Oracle.
The goal is to create a reliable predictive model that will suggest tables and indexes that will measurably benefit from reorganization, predict the reduction in I/O (logical I/O, measured as consistent gets, and physical I/O, measured as physical reads) after the reorganization, and suggest changes that will prevent a recurrence of the fragmentation (e.g. a new pctfree, a new blocksize). Several observations make this feasible:
Oracle tables and indexes can be reorganized, sometimes resulting in faster SQL execution speeds.
AWR (or STATSPACK in earlier releases) has a history of how the tables were accessed by historical SQL (dba_hist_sql_plan, stats$sql_plan, etc.), including the I/O and CPU costs associated with each step of SQL execution (a query along these lines is sketched after this list).
We can see the current internal structure of every table and index (e.g. chain_cnt in dba_tables).
Most databases have repeating patterns of behavior, and historical SQL is usually representative of future activity. The model should be able to see whether a table or index has a repeating usage history simply by examining the historical access patterns.
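As a rough illustration, a query along the following lines pulls those historical access paths out of the AWR plan history. This is only a sketch: it assumes the standard dba_hist_sql_plan view (use stats$sql_plan on a STATSPACK-only site), and the operations and options in the filter are illustrative choices, not a fixed rule.

-- Sketch: how often each object appears in historical execution plans,
-- with the average optimizer I/O and CPU cost for those plan steps.
-- Assumes AWR (dba_hist_sql_plan); swap in stats$sql_plan for STATSPACK.
select
   p.object_owner,
   p.object_name,
   p.operation,
   p.options,
   count(distinct p.sql_id)   sql_count,
   round(avg(p.io_cost))      avg_io_cost,
   round(avg(p.cpu_cost))     avg_cpu_cost
from
   dba_hist_sql_plan p
where
   p.operation in ('TABLE ACCESS', 'INDEX')
and
   p.options in ('FULL', 'FAST FULL SCAN', 'RANGE SCAN')
group by
   p.object_owner, p.object_name, p.operation, p.options
order by
   sql_count desc;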
The steps to solving this problem
are simple, but the process logic is complex:
1 - Gather table/index structure information - There are a number of routines that can run an analyze index xxx validate structure command and store the results in a table for later analysis (a sketch follows this list).
2 - Analyze historical patterns of usage - We can alter the script to access the stats$sql_plan table and see exactly how tables and indexes are accessed by our SQL, and how often.
3 - Add the decision rules - Design an expert-system algorithm that will interrogate your data from steps 1 and 2 and create the list of tables/indexes, predicted I/O reductions, and suggested structural changes (an illustrative rule is also sketched after this list).
4 - Test and refine your rules - Iterate through step 3, successively refining your rules based on actual results.
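For step 1, a minimal sketch might look like the following. The index_struct_hist history table is a made-up name for illustration, and the loop simply validates every normal index in the current schema; a production routine would be more selective and scheduled during a quiet period, since validate structure takes locks while it runs.

-- Sketch: capture index internal structure over time.
-- The history table name (index_struct_hist) is illustrative only.
create table index_struct_hist (
   snap_date    date,
   index_name   varchar2(128),
   height       number,
   lf_rows      number,
   del_lf_rows  number,
   used_space   number,
   btree_space  number
);

begin
   for r in (select index_name
             from   user_indexes
             where  index_type = 'NORMAL')
   loop
      -- validate structure locks the underlying table while it runs
      execute immediate
         'analyze index ' || r.index_name || ' validate structure';

      -- index_stats holds only the most recent validation, so copy it out
      insert into index_struct_hist
         select sysdate, name, height, lf_rows, del_lf_rows,
                used_space, btree_space
         from   index_stats;
   end loop;
   commit;
end;
/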
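For step 3, an illustrative decision rule could be as simple as flagging indexes whose most recent validation shows a high percentage of deleted leaf entries. The 20 percent threshold and the index_struct_hist table from the previous sketch are assumptions for illustration, not proven rules; your own thresholds should come out of step 4.

-- Sketch of one possible rule: flag indexes with many deleted leaf entries.
select
   h.index_name,
   h.lf_rows,
   h.del_lf_rows,
   round(100 * h.del_lf_rows / nullif(h.lf_rows, 0), 1)  pct_deleted
from
   index_struct_hist h
where
   h.snap_date = (select max(snap_date)
                  from   index_struct_hist
                  where  index_name = h.index_name)
and
   h.del_lf_rows / nullif(h.lf_rows, 0) > 0.20
order by
   pct_deleted desc;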
But how can we find these rules? Do the rules have to make sense? No, they don't! Scientists commonly create models (as the Oracle data mining tools do) that scan historical data and identify statistically significant correlations (within 2 standard deviations of the mean value). For example, the MMPI is a set of 500 true/false questions that assesses personality with remarkable validity, and its results are accepted in all U.S. courts. Its test base consists of hundreds of thousands of subjects with pre-diagnosed mental disorders. By comparing their responses to seemingly innocuous questions (e.g. "I read the editorials in the newspaper every day"), a proven predictive model was created (Federal courts have affirmed the MMPI as a scientifically valid and accepted procedure for personality assessment).
For example, a subject's preference for showers vs. baths is an extremely reliable measure of self-esteem. Do we know why? No. Do we care? Not really. All that is proven is that this correlation is a statistically reliable predictor of feelings of self-worth. We see the exact same scientific principle applied in Oracle data mining (ODM) tools. For example, we might find out that people with red hair buy a disproportionate amount of skin care products. Knowing "why" is not important. What's important is knowing that the data supports the assertion. Also useful is the book "Unobtrusive Measures", which shows creative techniques for finding "hidden" correlations.
In sum, rules don't have to be proven true to be statistically reliable, and exceptions do not make the rule invalid. For example, if two out of every 1,000 red-haired people don't buy skin care products, we still have a model with better than 99% predictive accuracy.
Some Possible Rules for Your Model:
To get you started, here is a list of possible conditions that may contribute to table/index fragmentation. Remember, you don't need to pre-justify your rule set. The only thing that counts is that your rules make accurate predictions.
Index fast full scans will run faster after index reorganization whenever the "density" of the index entries becomes greater. In other words, it takes less time to read 100,000 entries from a 100-block index than to read the same entries from a 500-block index.
Multi-block index range scans will run faster when the table data blocks are arranged in index-key order and when the data blocks have a high number of row entries (as evidenced by clustering_factor in dba_indexes; see the query sketched after this list).
Large-table full-table scans may run faster after table reorganization, especially when the table has excessive chained or relocated rows, or low block density after massive DML activity (updates and deletes).
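As a starting point for the multi-block range scan rule above, a sketch like the following compares each index's clustering_factor with its table's block and row counts. The interpretation (a clustering_factor near the table's block count means the rows are already close to index-key order; a value near the row count suggests a resequencing reorg might help) is the usual rule of thumb, not a guarantee.

-- Sketch: rank indexes by how far clustering_factor is from the table's
-- block count, relative to the table size.
select
   i.index_name,
   i.table_name,
   t.blocks              table_blocks,
   t.num_rows            table_rows,
   i.clustering_factor
from
   user_indexes i,
   user_tables  t
where
   i.table_name = t.table_name
and
   i.clustering_factor is not null
and
   t.num_rows > 0
order by
   i.clustering_factor / greatest(nvl(t.blocks, 0), 1) desc;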
Proving the Benefit:
Full-table scans - The predictive model will establish an access rate, expressed as a function of the number of rows and the amount of I/O (both logical and physical) required to access the rows via "db file scattered read" multi-block reads (usually a full-table scan or an index fast full scan).
DATE       TABLE      ROWS       BLOCKS PER 1,000 ROWS
6/20/05    CUSTOMER   100,000    56
-- TABLE REORGED ON 6/20/05
6/30/05    CUSTOMER    99,000    20
7/30/05    CUSTOMER    99,000    35
...        CUSTOMER    99,000    49
...        CUSTOMER    99,000    55
This report output should show the forecasted reduction in I/O after the table is reorganized (a query to produce this kind of trend is sketched after these items).
Multi-block index range scans - The predictive model should be able to see the size of each index range scan (the number of rows returned), the number of blocks that were required to fetch the rows, and the amount of logical and physical I/O.
Index rebuild - The predictive model should capture the "speed" of index fast full scans and multi-block index range scans, and show their degradation over time.
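Here is a sketch of how the full-table scan trend from the CUSTOMER example above might be pulled from the AWR. It assumes the standard dba_hist_sqlstat, dba_hist_snapshot and dba_hist_sql_plan views, and it approximates "blocks per 1,000 rows" with buffer gets per 1,000 rows processed for statements whose plans full-scan the table; statement-level deltas include work done by other plan steps, so treat the numbers as a trend, not an exact measure.

-- Sketch: logical I/O per 1,000 rows, by snapshot day, for SQL that
-- full-scans the CUSTOMER table (table name taken from the example above).
select
   trunc(sn.begin_interval_time)                    snap_day,
   p.object_name                                    table_name,
   sum(st.rows_processed_delta)                     rows_read,
   round(1000 * sum(st.buffer_gets_delta) /
         nullif(sum(st.rows_processed_delta), 0))   gets_per_1000_rows
from
   dba_hist_sqlstat   st,
   dba_hist_snapshot  sn,
   dba_hist_sql_plan  p
where
   st.snap_id          = sn.snap_id
and st.dbid            = sn.dbid
and st.instance_number = sn.instance_number
and st.sql_id          = p.sql_id
and st.plan_hash_value = p.plan_hash_value
and p.operation        = 'TABLE ACCESS'
and p.options          = 'FULL'
and p.object_name      = 'CUSTOMER'
group by
   trunc(sn.begin_interval_time), p.object_name
order by
   snap_day;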
Finding the Data:
The Oracle 10g AWR (and STATSPACK in previous releases) provides a time-series database. These time-series performance details can be used to create accurate predictions of the real benefit of reorganizing tables and indexes. Specifically, the AWR can measure, between any two snapshots:
1 - The change in the speed of SQL execution steps over time. In the prototype below, we see the change in the speed of full-table