Tuning Oracle to Minimize Crash Recovery Time
Sun Microsystems has published an online
whitepaper titled “Tuning ORACLE to Minimize Recovery Time: For
Solaris Operating System on SPARC” by Jim Mauro, a Senior Staff
Engineer in Sun's Performance and Availability Engineering group.
His paper focuses on tuning Oracle for fast recovery.
The paper covers these main areas for
speeding up Oracle crash recovery:
- ORACLE Cache Recovery Tuning
- Recovery and Performance Measurements
Under best practices, the paper draws these conclusions about
tuning Oracle crash recovery time:
Our findings are summarized as follows:
- The impact of tuning for recovery is
substantially minimized in later releases of ORACLE,
specifically ORACLE 9.2.
- In the ORACLE 8 tests, we used hard
("brute-force") checkpointing, and were able to reduce recovery
significantly, from about 27 minutes to about 4 minutes, but
that came at the cost of a 28% application performance regression.
- Some quick tests done with incremental
checkpointing in ORACLE 8.1.7 yielded better recovery times,
down to 72 seconds, but caused a 26% application performance
regression.
- We purposely used a system that was
I/O-constrained, which predictably resulted in a negative
performance impact.
- Initial tests with ORACLE 9 on the
smaller system (System A) demonstrated excellent recovery times
(under 2 minutes) with a minimal performance impact (less than
5%). We addressed the I/O configuration that constrained the
ORACLE 8 tests through the use of more disks.
- The larger system (System B) with
ORACLE 9 demonstrated effective use of fast_start_mttr_target
with parallel_execution_message_size and recovery_parallelism.
Without tuning fast start, going from serial recovery (no
parallelism) to 24 recovery processes and increasing the message
buffer size from the default of 2 kilobytes to 4 kilobytes
yielded a 40% improvement in recovery time (from about 44
minutes to 26 minutes). Further gains were measured by
increasing the message buffer size to 16 kilobytes, where
recovery dropped to 22 minutes.
Using fast_start_mttr_target with message buffer tuning and
multiple recovery processes resulted in achieving recovery times
within the MTTR setting.
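
The recovery-time figures quoted above can be cross-checked with a little arithmetic. The following is a minimal sketch, not something taken from the paper: the helper name `percent_reduction` and the row labels are illustrative, the times are the approximate ("about") values from the summary, and the last row assumes the 16-kilobyte run kept the same 24 recovery processes.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Recovery times in minutes, as quoted (approximately) in the summary above.
# Labels are illustrative; the last row assumes 24 recovery processes throughout.
results = [
    ("ORACLE 8, hard checkpointing", 27.0, 4.0),
    ("System B, serial to 24 processes with 4 KB buffer", 44.0, 26.0),
    ("System B, 4 KB to 16 KB buffer", 26.0, 22.0),
    ("System B, serial baseline to 16 KB buffer", 44.0, 22.0),
]

for label, before, after in results:
    reduction = percent_reduction(before, after)
    print(f"{label}: {before:g} -> {after:g} min ({reduction:.1f}% shorter)")
```

On these rounded inputs, the 44-to-26-minute step works out to roughly 41%, which agrees with the 40% improvement reported, and the move to a 16-kilobyte buffer halves the serial baseline. The ORACLE 8 result (27 to 4 minutes) is the largest relative gain, but it is also the one that came with a 28% application performance regression.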