Oracle Tuning:
throughput vs. response time
Many people assume that the defaults for the Oracle
parameters are just-right for their systems, not realizing the incompatible
goals of optimizing for throughput vs. optimizing for fast response time.
These metrics are quite separate. For example,
when
evaluating storage devices we are concerned with the average response
time as the total device throughput.
The default in Oracle 11g for the
optimizer_mode is
all_rows, an optimizer mode that
has the goal of maximizing total throughout by choosing SQL access plans
that minimize server resources.
While minimizing computing resources is a worthy goal, optimizing for
maximum system throughput is not the same as optimizing for the fastest
response time, a goal set with
optimizer_mode=first_rows. These goals of response time vs.
throughput are quite different and often at odds with each other:
·
Fast Response
Time - This is a "selfish" goal; minimizing
response time for individual queries, regardless of the expense to other
tasks on the system. Fast
response time is often associated with index access. Response time is
measured as average elapsed seconds.
·
High
Throughput
- This is a holistic system-wide optimizer goal that is concerned with
optimizing your entire workload to maximize total throughput as a whole.
High throughput is often associated
with parallelized large-table full-table scans. Throughput is measured
as transactions per second.
To get the fastest response time we often need to use
the first_rows optimizer_mode, a
setting that directs the SQL optimizer to choose plans that start returning
rows back to the calling application as quickly as possible.
Optimizing for the fastest response time often involves using
indexes, an access technique that returns rows quickly but incurs more I/O
operations.
Here are some common Oracle
metrics for response time
and throughput:
Throughput Metrics |
Response Time Metrics |
- Block gets per second
- Block changes per transaction
- Enqueue deadlocks per second
|
- Transactions per second
- SQL rows per second
- Average SQL elapsed time
|
The
optimizer_mode is a "silver bullet" parameter, a single setting where
changes will have a profound impact on the entire database.
See here how
optimizer_mode
impacts overall workload performance.
For more on holistic tuning with silver bullets, see my book
Oracle
Silver Bullets.
Are optimizing
for fast response time and optimal throughput totally incompatible?
The answer lies with the utilization of your server resources.
In a perfect world with unlimited server resources, we could incur
the overhead of index processing, but that would not help "batch" jobs who
have a goal of manipulating zillions of rows, often using parallel query.
Optimizing an
Oracle instance for multiple workloads
In the real world, shops frequently have bi-modal
workloads, processing online transactions during the day and running batch
jobs at night. A "day mode"
would have parameter setting that enables fast response time, such as
optimizer_mode=first_rows.
A "night mode" would be optimized for throughput, adjusting the
system parameters to maximize parallelism and full scan activity, an
approach favored by
optimizer_mode=all_rows.
For these systems you can analyze past historical
workload trends and predict when a workload changes and what type of
configuration it requires. I
discuss a complete method for this in my book "Oracle
Tuning: The Definitive Reference", but in a nutshell, it involves
running historical queries against performance data from AWR or STATSPACK,
or by using trend analysis tools such as
Ion for Oracle.
Predicting
workloads with Ion
Those who forget the past are condemned to repeat it
Here are the steps to analyze workloads for a
self-tuning architecture:
1 - The first step in understand your Oracle workloads is
to analyze their characteristics, using the historical database provided in
STATSPACK or AWR.
2 - Once you have identified your workload and their goals
(response time or high throughout), you can use "alter system" commands to
adjust the configuration of your instance to accommodate the changes in
workload.
Because this approach is proactive, the system never
suffers a slow-down, and unlike Oracle automation tools like Automatic
Memory Management (AMM), which are reactive, and wait until a problem occurs
before changing the SGA pool sizes.