Every DBA knows that the size of their redo logs
is very important. Too small and the frequent log switches tie-up
the LGWR, ARCH and DBWR background processes. Too large and you
risk losing data during an instance crash.
Oracle 10g introduced a new advisory
utility that allows you to specify your optimal mean time to
recovery (MTTR) recovery interval and uses this to suggest the
optimal redo log size. In Oracle 10g the fast_start_mttr_target
parameter is used.
Oracle recommends using the
fast_start_mttr_target
initialization parameter to
control the duration of startup after instance failure. With 10g,
the Oracle database can now self-tune check-pointing to achieve good
recovery times with low impact on normal throughput. You no longer
have to set any checkpoint-related parameters.
This method reduces the time required
for crash recovery and makes the recovery bounded and predictable by
limiting the number of dirty buffers and the number of redo records
generated between the most recent redo record and the last
checkpoint. Administrators specify a target (bounded) time to
complete the cache recovery phase of recovery with the
fast_start_mttr_target
initialization parameter, and
Oracle automatically varies the incremental checkpoint writes to
meet that target.
The target_mttr
field of v$instance_recovery contains the MTTR target in effect. The estimated_mttr field of v$instance_recovery contains the estimated MTTR
should a crash happen right away.
For example,
SELECT
TARGET_MTTR,
ESTIMATED_MTTR,
CKPT_BLOCK_WRITES
FROM
V$INSTANCE_RECOVERY;
TARGET_MTTR
ESTIMATED_MTTR CKPT_BLOCK_WRITES
-----------
-------------- -----------------
37 22 209187
Whenever you set
fast_start_mttr_target to a nonzero value, and while MTTR
advisory is ON, Oracle Corporation recommends that you disable (set
to 0) the following parameters:
If your database is relatively
stable, then you can use this suggested size and rebuild your online
redo log files to match the value. We would expect that a future
version of Oracle will automate this and allow for dynamic re-sizing
of online redo log files, but this is an issues because many Oracle
systems expect the archived redo log files to always be the same
size.
The sizing redo log files can
influence performance because DBWR, LGWR and ARCH are all
working during high DML periods.
A too small online redo log file size can cause
slowdowns from excessive DBWR and checkpointing behavior.
A high checkpointing frequency and the "log file switch
(checkpoint incomplete) can cause slowdowns.
Also see my notes on
fixing frequent log
switches