Using multiple log writer (LGWR) processes
Oracle allows busy DML databases to spawn multiple log writer
processes, and the Oracle documentation describes several ways to
implement them. The Oracle8 docs note that: “On multiple-CPU computers,
multiple redo copy latches allow multiple processes to copy entries
to the redo log buffer concurrently. The default value of
LOG_SIMULTANEOUS_COPIES is the number of CPUs available to your
Oracle instance.”
“Prior to
Oracle8i you could configure multiple log writers using the
LGWR_IO_SLAVES parameter.”
In Oracle10g
lgwr_io_slaves becomes a hidden parameter (_lgwr_io_slaves).
Metalink note
109582.1 says:
“Starting with Oracle8, I/O slaves are provided. These slaves
can perform asynchronous I/O even if the underlying OS does not
support Asynchronous I/O. These slaves can be deployed by DBWR,
LGWR, ARCH and the process doing Backup. . .
In Oracle8i, the DBWR_IO_SLAVES parameter determines the number
of IO slaves for LGWR and ARCH. . .
As there may not be substantial log writing taking place, only
one LGWR IO slave has been started initially. This may change
when the activity increases.”
According to Metalink Note 109582.1, you will see multiple log writer slaves as
ix(nn) processes. The note is for Oracle 8, but it’s a good description,
confirming that multiple log writer processes exist.
"The representation of I/O Slaves is as follows: ora_ixnn_SID where:
i= slave,
x= adaptor number and nn is the slave number within the adaptor.
For example:
ora_i105_mul is a background I/O slave process.
Here:
- "i" stands for a slave
- "1" stands for the adaptor number for this slave
- "05" is the slave number within the adaptor.
An adaptor is a pool of memory allocated to a thread and the pool is allocated
a handle to be identified. The adaptor number is the handle returned by the
internal Oracle code.
Thus a typical Unix ps -ef listing after instance startup may look like:
oracle 27582 1 0.0 11:16:40 ?? 0:44.03 ora_pmon_mul
oracle 27584 1 0.0 11:16:40 ?? 0:28.44 ora_ckpt_mul
oracle 27586 1 0.0 11:16:40 ?? 0:58.65 ora_dbw0_mul
oracle 27588 1 0.0 11:16:40 ?? 0:05.12 ora_lgwr_mul
oracle 27593 1 0.0 11:16:41 ?? 0:32.35 ora_smon_mul
oracle 27594 1 0.0 11:16:41 ?? 0:11.59 ora_arc0_mul
oracle 27596 1 0.0 11:16:41 ?? 0:00.35 ora_reco_mul
oracle 27598 1 0.0 11:16:48 ?? 0:17.42 ora_i201_mul
oracle 28408 1 0.0 11:35:25 ?? 0:06.47 ora_i102_mul
oracle 28411 1 0.0 11:35:25 ?? 0:06.59 ora_i101_mul
oracle 28412 1 0.0 11:35:25 ?? 0:06.10 ora_i103_mul
oracle 28414 1 0.0 11:35:25 ?? 0:06.03 ora_i104_mul
oracle 28416 1 0.0 11:35:25 ?? 0:06.14 ora_i105_mul
oracle 28420 1 0.0 11:35:25 ?? 0:06.01 ora_i106_mul
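The ora_ixnn_SID naming convention quoted above can be decoded mechanically. The following is a minimal Python sketch, not an Oracle-supplied tool: the helper names (parse_io_slave, slaves_by_adaptor) are hypothetical, and it assumes a single-digit adaptor number followed by a two-digit slave number, as in the note's ora_i105_mul example.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional

# Assumed pattern for the ora_ixnn_SID convention from the quoted note:
# "i" marks an I/O slave, x is the adaptor number (assumed one digit),
# nn is the slave number within the adaptor (assumed two digits).
IO_SLAVE_RE = re.compile(r"^ora_i(\d)(\d{2})_(\w+)$")


class IoSlave(NamedTuple):
    adaptor: int
    slave: int
    sid: str


def parse_io_slave(process_name: str) -> Optional[IoSlave]:
    """Return the decoded slave, or None if the name is not an I/O slave."""
    match = IO_SLAVE_RE.match(process_name)
    if match is None:
        return None
    return IoSlave(int(match.group(1)), int(match.group(2)), match.group(3))


def slaves_by_adaptor(process_names: Iterable[str]) -> Dict[int, List[int]]:
    """Group slave numbers by adaptor number, ignoring other processes."""
    groups: Dict[int, List[int]] = {}
    for name in process_names:
        slave = parse_io_slave(name)
        if slave is not None:
            groups.setdefault(slave.adaptor, []).append(slave.slave)
    return {adaptor: sorted(nums) for adaptor, nums in groups.items()}


if __name__ == "__main__":
    # Process names taken from the last column of the ps -ef listing.
    names = ["ora_pmon_mul", "ora_lgwr_mul", "ora_i201_mul",
             "ora_i102_mul", "ora_i101_mul", "ora_i103_mul"]
    print(slaves_by_adaptor(names))
```

Grouping by adaptor number only shows which slaves share an adaptor. Which background process (DBWR, LGWR, ARCH) owns a given adaptor is not stated in the note, so the sketch does not try to infer it.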
Multiple log writers and parallelism
Metalink note 147471.1, “Tuning the Redo log Buffer Cache and
Resolving Redo Latch Contention”, notes that multiple redo allocation
latches become possible by setting the parameter _log_parallelism,
and that the log buffer is split into LOG_PARALLELISM areas, each the
size of the init.ora LOG_BUFFER setting. It also shows the
relationship to the number of CPUs:
“The number
of redo allocation latches is determined by init.ora
LOG_PARALLELISM. The redo allocation latch allocates space in
the log buffer cache for each transaction entry. If transactions
are small, or if there is only one CPU on the server, then the
redo allocation latch also copies the transaction data into the
log buffer cache.”
We also see that log file parallel writes are related to the number
of CPUs. Metalink note 34583.1, “WAITEVENT: "log file parallel write"
Reference Note”, shows that the log_buffer size is related to
parallel writes (i.e. the number of CPUs), and discusses how LGWR
must wait until all parallel writes are complete. It notes that
solutions to high “log file parallel write” waits are directly
related to I/O speed, recommending that redo log members be placed on
high-speed disk and that redo logs be segregated:
“on disks
with little/no IO activity from other sources.
(including low activity from other sources against the same disk
controller)”.
This is a strong
argument for using super-fast solid-state disk (SSD), if you have
already optimized your redo logs and disk I/O sub-system.
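Because LGWR must wait until all parallel writes are complete, the wait is governed by the slowest redo log member rather than the average. The following Python sketch illustrates that reasoning only. The function name and the timings are hypothetical, and it assumes the member writes are issued at the same moment and that the wait ends when the last one finishes.

```python
from typing import Iterable


def parallel_write_wait_ms(member_write_ms: Iterable[float]) -> float:
    """Simplified model: the wait lasts as long as the slowest member write.

    Assumes all member writes start together (an illustrative
    simplification, not a statement about Oracle internals).
    """
    times = list(member_write_ms)
    if not times:
        raise ValueError("at least one redo log member is required")
    return max(times)


if __name__ == "__main__":
    # Hypothetical timings: two members on fast disk, one on a busy disk.
    print(parallel_write_wait_ms([2.0, 2.5, 9.5]))
```

Under this model, moving only some members to faster storage does not shorten the wait. Every member has to be on fast, lightly loaded disk.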
Mike Ault has
these notes on multiple log writer processes:
On an async_io capable system Oracle will use
async (i.e. multiple simultaneous IOs) to write all writes (dbwr,
lgwr, etc.). However, on non-async capable systems setting
dbwr_io_slaves to any value greater than 0 will result in up to 4
lgwr IO slaves.
There is another parameter, one of the parallel ones, that will also
set lgwr_io_slaves to 4, but I can't find the exact reference right
now.
That Oracle sets the parameter log_buffer based on the number of
CPUs seems to be a smoking gun but I cannot find documentation,
probably without going to the source code, at how the buffers are
written when multiple lgwr_io_slaves or async IO is used. Are they
used to write to mirrors? Or, are they used to split the write up so
the IO can be spread? I vote for the latter, but at this time have
no documentation to prove it one way or the other.
In 8.1.7, according to the Oracle documentation (and also on
Metalink), the maximum default value of the LOG_BUFFER parameter is
512k or 128 KB * CPU_COUNT, whichever is greater.
However in 10g I believe this has been altered since the 10g instance
on my single cpu system defaults to 256k and on my 9i instance it is
512k. To get multiple lgwr slaves, the dbwr_io_slaves parameter must
be set to greater than 0, it will not be done automatically. The
maximum size is not bounded.
In the 9i/10g docs they got it right.
I cannot find any correlation between the two, but it is odd that
Oracle would tie the value of log_buffer to cpu_count if some multiple
process thing was not being used.
Normally on an async capable system, the writes will be done
asynchronously. However, what is written either asynchronously or by
the multiple logwr io slaves is not specified.
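The 8.1.7 default quoted in the notes above (512k or 128 KB * CPU_COUNT, whichever is greater) is simple arithmetic, sketched below in Python. The function name is hypothetical, and the formula is taken only from the note. As the note itself observes, other releases appear to use different defaults, so this is not a general rule.

```python
KB = 1024


def default_log_buffer_bytes(cpu_count: int) -> int:
    """Default LOG_BUFFER per the 8.1.7 rule quoted in the notes:
    the greater of 512 KB and 128 KB * CPU_COUNT.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return max(512 * KB, 128 * KB * cpu_count)


if __name__ == "__main__":
    for cpus in (1, 4, 8):
        print(cpus, default_log_buffer_bytes(cpus) // KB, "KB")
```

Under this rule the CPU count only affects the default once there are more than four CPUs, because 128 KB * 4 equals the 512 KB floor.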