Call now: 252-767-6166  
Oracle Training Oracle Support Development Oracle Apps

Free Oracle Tips

HTML Text

 Home
 E-mail Us
 Oracle Articles



 Oracle Training
 Oracle News

 Oracle Forum
 Class Catalog


 Our Staff
 Our Prices
 Help Wanted!

 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 UNIX
 Oracle UNIX
 Linux
 Oracle Linux
 Monitoring
 Remote help

 Remote plans
 Remote
services
 Oracle C++
 Oracle Java
 Apache
 JDeveloper
 App Server

 Applications
 Oracle Forms
 Oracle Portal
 11i Upgrades
 SQL Server
 Oracle Concepts
 HTML-DB Tips
 Software Help

 Remote Help  
 Development  

 Implementation


 Financials Training
 Oracle 11i
 Oracle Apps 11i
 Oracle Workflow
 Oracle AR 11i Class
 Oracle AP 11i class
 Oracle GL 11i class
 Oracle HR 11i class
 Oracle FA 11i class
 11i Project Mgt
 11i procurement
 11i collections


 Oracle Posters
 Oracle Books

 Oracle Tuning Book
 Oracle RAC Book
 Oracle Security
 Easy Oracle Books
 Oracle Scripts
 SQL Server DBA
 SQL Design Patterns
 WISE
 Excel-DB   


 BC Oracle News


 Rednecks!
 Dress code
 Arabian Stallion

 Burleson Arabians
 Guide Horses
 Don Burleson Blog
 Golf & Travel


 Privacy Policy
 

 

 

 

 

Oracle Streams Benchmark Testing

Oracle Tips by Burleson Consulting

 

Benchmarking Oracle Streams

Oracle Streams is a popular free alternative to Oracle RAC for failover and disaster recovery, but it is important to note that Oracle Streams is not a panacea, and it’s not appropriate for all databases.  Oracle Streams has no licensing costs (RAC costs extra) and it is less complex to configure than a RAC database.  This disadvantage over RAC is that there is no continuous availability, update collisions may occur, replication is not instant, and application failover is usually done manually.

When using Oracle Streams for n-way replication it is critical to implode each instance with DML to determine the point where Oracle will experience contention.

The paper below describes a DML stress test on Oracle 10g Streams, using the Quest Benchmark factory software 

We also need to understand the limits of cross-replicating Oracle systems with Oracle Streams.  For more information on Streams vs. RAC for failover, see my notes here:

 

Oracle Streams Stress test Executive Summary

The purpose of this DML stress test was to implode an Oracle instance with insert transactions and observe the breaking-point as high-volumes of data replicate to the to a remote server using n-way Oracle Streams. 

This Oracle Streams stress test was a tremendous success and shows these exact areas where the database will experience contentions at high loads.   The test also revealed maximum rates (per server) for I/O, including disk reads, disk writes and maximum updates per second.

  • 400 writes per second
  • 7,000 disk reads per second
  • 12,000 updates per second
  • 19,000 logical buffer reads per second
  • UGA memory access at 18 million per second

We now know the areas where we will have future problems as system update activity increases:

  • We need to monitor the log switch interval and re-size the online redo logs if switches occur more frequently than every 15 minutes.
  • We need to monitor Streams spills to disk and increase the streams pool to ensure that we never have disk enqueues.
  • Hot tables may eventually have to be moved from ASSM to standard, multiple freelists.

 

Oracle Streams Stress Test Overview

The subject of this test is an Oracle server that is using Oracle10g 2-way Streams replication across a 90-mile high-speed interconnect.  Figure 1 shows a model of the Oracle server architecture:

Figure 1 – The Oracle server architecture

This test imploded the Oracle instance with insert statements in order to locate the system bottleneck, focusing on the ability of Oracle to replicate the data across the Oracle Streams interface to the remote server.  The default block size is 8k with a special 32k tablespace for the indexes.

This test used the Quest Benchmark Factory to generate the insert statements, not using bind variables.

 

Predictions

As DML transactions increase we would expect the stress to increase on those instance components associated with DML:

  • Redo log contention – We expect to see very-frequent log switches and waits on redo
  • Streams replication enqueues – We expect to see the Oracle Streams back-up and enqueue (to disk).
  • Object contention – As inert rates increase we expect to note segment contention on the bitmap freelists (this test has tablespaces with ASSM).
  • RAM exhaustion – We expect to see contention within the shared_pool and db_cache_size.

Streams Stress Experiment Results

This test was a tremendous success and shows these exact areas where the database will experience contentions at high loads and also reveals maximum rates for I/O, including disk reads, disk writes and maximum updates per second.

The AWR report from the stress test is in Appendix A.  We noted excellent rates for data block access:

  • 400 writes per second
  • 7,000 disk reads per second
  • 12,000 updates per second
  • 19,000 logical buffer reads per second
  • UGA memory access at 18 million per second

 

Instance Activity Stats                  DB/Inst: CATGT/catgt  Snaps: 911-912
 
Statistic                                     Total     per Second     per Trans
-------------------------------- ------------------ -------------- -------------
db block changes                          1,888,093       11,989.6          26.0
db block gets                             2,697,885       17,131.8          37.2
physical read bytes                       1,122,304        7,126.7          15.5
physical read total IO requests               3,451           21.9           0.1
physical read total bytes               610,102,272    3,874,206.4       8,410.8
physical write IO requests                   57,900          367.7           0.8
physical write bytes                    528,932,864    3,358,773.1       7,291.8
physical write total IO requests             73,400          466.1           1.0
physical write total bytes            1,864,753,152   11,841,356.6      25,707.3
session uga memory                   12,884,963,888   81,820,723.5     177,630.5
 

 

During the heavy load on the system, we observed that the redo and Steams became system bottlenecks:

 
Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
log file sync                        73,876       1,977     27   44.3     Commit
enq: HW - contention                    554         874   1578   19.6 Configurat
enq: SQ - contention                  3,169         680    215   15.3 Configurat
CPU time                                            235           5.3
latch: library cache                  8,022         172     21    3.9 Concurrenc
          -------------------------------------------------------------

Redo Log contention

For this load level the size of the redo logs is insufficient, at the very start we were doing 197 switches per hour at the end 205.

 
Statistic                                     Total  per Hour
-------------------------------- ------------------ ---------
log switches (derived)                            9    205.74

For recoverability we want to keep the redo size at the default value (512k) so that we can get a few log switches per day.  If we make them too large we risk loosing multiple days work.

Object Contention

 

Our biggest stress point for blocks appears to be row lock waits on the indexes. This was probably because we were doing a high volume of updates along with the inserts, in normal operation we probably won't see this mix at this stress level. We used ASSM and noted issues with the initrans value.  We may want to switch to manual for some tables if we expect this level of stress.

At max stress we were also seeing sequence and object high watermark enqueues. The sequence enqueues probably indicate we need to increase the caching level for the sequences (I'll wager they are at the default of 20, we may wish to bump that up to 100 or so). The HWM enqueues are because we started expanding the table quickly...we may want to look at the tablespace settings for extent size.

Enqueue Type (Request Reason)
------------------------------------------------------------------------------
    Requests    Succ Gets Failed Gets       Waits  Wt Time (s) Av Wt Time(ms)
------------ ------------ ----------- ----------- ------------ --------------
SQ-Sequence Cache
       6,671        6,660           0       3,050          631         207.00
TX-Transaction (index contention)
       1,589        1,588           0       1,565            8           5.35

We also note that the bitmap freelists (as implemented via Automatic Segment Storage Management – ASSM) become overloaded, resulting in buffer busy waits:

                                                                  Buffer
           Tablespace                      Subobject  Obj.          Busy    % of
Owner         Name    Object Name            Name     Type         Waits Capture
---------- ---------- -------------------- ---------- ----- ------------ -------
CAT_SCHEMA CAT_SCHEMA XPKIM_DONALDS                   INDEX        2,768   73.85

 

Streams Contention

As predicted, we noted enqueue contention within Oracle Streams as transactions backed-up to be replicated to the remote server:

                                                                   Avg
                                             %Time  Total Wait    wait     Waits
Event                                 Waits  -outs    Time (s)    (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
enq: HW - contention                    554   47.8         874    1578       0.0
Streams capture: waiting for             85   50.6         112    1321       0.0
Streams AQ: waiting for mess             91   96.7         330    3627       0.0
virtual circuit status                    6  100.0         176   29294       0.0
Streams AQ: waiting for time              7   42.9         166   23752       0.0
Streams AQ: qmn coordinator              78   37.2         161    2067       0.0

 

At this level of activity streams is spilling, this is causing it to not keep up as well as it should (spilling means going to disk) we need a larger streams pool. At maximum (according to the report) we will need about a 400 megabyte streams pool. We may also want to look at multiple streams processes.

 
Streams Pool Advisory                          DB/Inst: CATGT/catgt  Snap: 912
 
  Size for      Size   Est Spill   Est Spill Est Unspill Est Unspill
  Est (MB)    Factor       Count    Time (s)       Count    Time (s)
---------- --------- ----------- ----------- ----------- -----------
        48       0.2   2,133,680         658     394,800         108
 
        80       0.4   1,215,434         382     169,076          46
       112       0.5     581,667         192      49,401          13
       144       0.7     146,351          45      26,205           7
       176       0.8      50,491          15      15,787           4
       208       1.0      27,981           8      10,845           3
       240       1.2      14,558           4       7,167           2
       272       1.3       5,839           1       3,506           1
       304       1.5       1,390           0       1,260           0
       336       1.6         126           0         701           0
          -------------------------------------------------------------

We also see high CPU activity associated with Streams during the high update period 

Streams CPU/IO Usage                     DB/Inst: CATGT/catgt  Snaps: 911-912
-> Streams processes ordered by CPU usage
-> CPU and I/O Time in micro seconds
 
Session Type                    CPU Time  User I/O Time   Sys I/O Time
------------------------- -------------- -------------- --------------
STREAMS Apply Reader           6,099,450         20,501         13,413
Logminer Reader                  970,096              0      8,876,876

 

Lessons Learned
 

  • Streams pool size and multiple streams processes (see Mike for advice)
  • Increase sequence caching to 100
  • Review extent sizes for the most active tables (fbi_donald, fbi_establishment)

We now know what will happen as the load expands and we can prepare for future high-stress conditions:

  • We need to monitor the log switch interval and re-size the online redo logs if switches occur more frequently than every 15 minutes.
  • We need to monitor Streams spills to disk and increase the streams pool to ensure that we never have disk enqueues.
  • Hot tables (fbi_donald) may eventually have to be moved from ASSM to standard, multiple freelists.

Appendix A - AWR Report

 

WORKLOAD REPOSITORY report for
 
DB Name         DB Id    Instance     Inst Num Release     RAC Host
------------ ----------- ------------ -------- ----------- --- ------------
CATGT         2438152563 catgt               1 10.2.0.1.0  NO  catgmis-gt.g
 
              Snap Id      Snap Time      Sessions Curs/Sess
            --------- ------------------- -------- ---------
Begin Snap:       911 16-Mar-06 22:30:32        81       7.7
  End Snap:       912 16-Mar-06 22:33:10        80       7.7
   Elapsed:                2.62 (mins)
   DB Time:               74.32 (mins)
 
Cache Sizes
~~~~~~~~~~~                       Begin        End
                             ---------- ----------
               Buffer Cache:     3,376M     3,376M  Std Block Size:         8K
           Shared Pool Size:       624M       624M      Log Buffer:    14,400K
 
Load Profile
~~~~~~~~~~~~                            Per Second       Per Transaction
                                   ---------------       ---------------
                  Redo size:          2,292,249.69              4,976.41
              Logical reads:             19,080.45                 41.42
              Block changes:             11,989.57                 26.03
             Physical reads:                  0.70                  0.00
            Physical writes:                409.84                  0.89
                 User calls:              1,252.97                  2.72
                     Parses:                739.47                  1.61
                Hard parses:                  5.81                  0.01
                      Sorts:                  2.37                  0.01
                     Logons:                  0.04                  0.00
                   Executes:                943.47                  2.05
               Transactions:                460.62
 
  % Blocks changed per Read:   62.84    Recursive Call %:    64.98
 Rollback per transaction %:    0.03       Rows per Sort:    36.06
 
Instance Efficiency Percentages (Target 100%)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
            Buffer Nowait %:   99.86       Redo NoWait %:   99.96
            Buffer  Hit   %:  100.00    In-memory Sort %:  100.00
            Library Hit   %:   99.80        Soft Parse %:   99.21
         Execute to Parse %:   21.62         Latch Hit %:   99.83
Parse CPU to Parse Elapsd %:   21.32     % Non-Parse CPU:   91.63
 
 Shared Pool Statistics        Begin    End
                              ------  ------
             Memory Usage %:   72.94   81.20
    % SQL with executions>1:   97.06   82.84
  % Memory for SQL w/exec>1:   94.55   79.61
 
Top 5 Timed Events                                         Avg %Total
~~~~~~~~~~~~~~~~~~                                        wait   Call
Event                                 Waits    Time (s)   (ms)   Time Wait Class
------------------------------ ------------ ----------- ------ ------ ----------
log file sync                        73,876       1,977     27   44.3     Commit
enq: HW - contention                    554         874   1578   19.6 Configurat
enq: SQ - contention                  3,169         680    215   15.3 Configurat
CPU time                                            235           5.3
latch: library cache                  8,022         172     21    3.9 Concurrenc
          -------------------------------------------------------------
 
Time Model Statistics                    DB/Inst: CATGT/catgt  Snaps: 911-912
-> Total time in database user-calls (DB Time): 4459.1s
-> Statistics including the word "background" measure background process
   time, and so do not contribute to the DB time statistic
-> Ordered by % or DB time desc, Statistic name
 
Statistic Name                                       Time (s) % of DB Time
------------------------------------------ ------------------ ------------
sql execute elapsed time                              2,152.0         48.3
sequence load elapsed time                              748.3         16.8
DB CPU                                                  234.6          5.3
PL/SQL execution elapsed time                           177.9          4.0
parse time elapsed                                       91.7          2.1
hard parse elapsed time                                   7.1           .2
repeated bind elapsed time                                0.1           .0
PL/SQL compilation elapsed time                           0.1           .0
failed parse elapsed time                                 0.0           .0
hard parse (sharing criteria) elapsed time                0.0           .0
connection management call elapsed time                   0.0           .0
hard parse (bind mismatch) elapsed time                   0.0           .0
DB time                                               4,459.1          N/A
background elapsed time                                 235.9          N/A
background cpu time                                      11.3          N/A
          -------------------------------------------------------------
 
Wait Class                                DB/Inst: CATGT/catgt  Snaps: 911-912
-> s  - second
-> cs - centisecond -     100th of a second
-> ms - millisecond -    1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc
 
                                                                  Avg
                                       %Time       Total Wait    wait     Waits
Wait Class                      Waits  -outs         Time (s)    (ms)      /txn
-------------------- ---------------- ------ ---------------- ------- ---------
Commit                         73,876     .1            1,977      27       1.0
Configuration                   4,220   11.9            1,773     420       0.1
Concurrency                    16,589     .8              370      22       0.2
Network                       117,711     .0              147       1       1.6
System I/O                     14,444     .0              124       9       0.2
Other                           5,945    2.5               77      13       0.1
Application                     2,478     .0                4       1       0.0
User I/O                          567     .0                1       2       0.0
          -------------------------------------------------------------
 
Wait Events                              DB/Inst: CATGT/catgt  Snaps: 911-912
-> s  - second
-> cs - centisecond -     100th of a second
-> ms - millisecond -    1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
 
                                                                   Avg
                                             %Time  Total Wait    wait     Waits
Event                                 Waits  -outs    Time (s)    (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
log file sync                        73,876     .1       1,977      27       1.0
enq: HW - contention                    554   47.8         874    1578       0.0
enq: SQ - contention                  3,169    3.7         680     215       0.0
latch: library cache                  8,022     .0         172      21       0.1
buffer busy waits                     4,316    2.9         169      39       0.1
SQL*Net more data to dblink          18,425     .0         146       8       0.3
Streams capture: waiting for             85   50.6         112    1321       0.0
log file parallel write              10,451     .0          99       9       0.1
log file switch (checkpoint             181   39.2          97     535       0.0
rdbms ipc reply                         339    5.9          48     143       0.0
control file parallel write             359     .0          13      35       0.0
latch free                            1,463     .0          10       7       0.0
log file sequential read                521     .0          10      19       0.0
latch: enqueue hash chains            1,312     .0           9       7       0.0
latch: library cache pin                958     .0           8       9       0.0
log file switch completion              187    2.7           8      43       0.0
enq: TX - index contention            1,776     .0           8       4       0.0
latch: cache buffers chains           1,214     .0           7       6       0.0
buffer deadlock                         104  100.0           5      46       0.0
latch: shared pool                      207     .0           4      21       0.0
SQL*Net break/reset to clien          2,478     .0           4       1       0.0
Log archive I/O                         431     .0           2       6       0.0
LGWR wait for redo copy               2,078    1.3           2       1       0.0
latch: session allocation               264     .0           1       6       0.0
Data file init write                    257     .0           1       4       0.0
latch: redo copy                         38     .0           1      28       0.0
latch: library cache lock                71     .0           1      11       0.0
SQL*Net message to client            99,091     .0           1       0       1.4
latch: cache buffers lru cha             66     .0           1       9       0.0
enq: US - contention                     58     .0           0       8       0.0
row cache lock                            5     .0           0      83       0.0
os thread startup                         3     .0           0     109       0.0
latch: redo allocation                  166     .0           0       2       0.0
enq: CF - contention                      7     .0           0      36       0.0
latch: object queue header o             36     .0           0       6       0.0
cursor: mutex S                           9     .0           0      20       0.0
db file single write                     31     .0           0       6       0.0
db file sequential read                  75     .0           0       1       0.0
latch: undo global data                  43     .0           0       2       0.0
control file sequential read          2,646     .0           0       0       0.0
latch: row cache objects                  8     .0           0       5       0.0
SQL*Net more data to client             109     .0           0       0       0.0
wait list latch free                      2     .0           0      11       0.0
log file single write                    36     .0           0       1       0.0
enq: TX - contention                      2     .0           0       7       0.0
undo segment extension                    1  100.0           0      13       0.0
enq: TX - allocate ITL entry              2     .0           0       6       0.0
enq: FB - contention                      5     .0           0       2       0.0
latch: redo writing                       3     .0           0       0       0.0
direct path read                         99     .0           0       0       0.0
direct path write                       103     .0           0       0       0.0
SQL*Net message to dblink                72     .0           0       0       0.0
db file scattered read                    2     .0           0       0       0.0
SQL*Net more data from clien             14     .0           0       0       0.0
SQL*Net message from client          99,093     .0       4,468      45       1.4
Streams AQ: waiting for mess             91   96.7         330    3627       0.0
virtual circuit status                    6  100.0         176   29294       0.0
Streams AQ: waiting for time              7   42.9         166   23752       0.0
Streams AQ: qmn coordinator              78   37.2         161    2067       0.0
LogMiner: wakeup event for p            877   20.0         161     183       0.0

 
Wait Events                              DB/Inst: CATGT/catgt  Snaps: 911-912
-> s  - second
-> cs - centisecond -     100th of a second
-> ms - millisecond -    1000th of a second
-> us - microsecond - 1000000th of a second
-> ordered by wait time desc, waits desc (idle events last)
 
                                                                   Avg
                                             %Time  Total Wait    wait     Waits
Event                                 Waits  -outs    Time (s)    (ms)      /txn
---------------------------- -------------- ------ ----------- ------- ---------
Streams AQ: qmn slave idle w             48     .0         159    3318       0.0
LogMiner: wakeup event for b         22,391     .4         157       7       0.3
wait for unread message on b         52,923     .2         157       3       0.7
Streams AQ: delete acknowled             42   50.0         156    3725       0.0
LogMiner: wakeup event for r          1,312    8.8         151     115       0.0
jobq slave wait                          47  100.0         138    2931       0.0
SQL*Net message from dblink              72     .0          15     207       0.0
LogMiner: client waiting for              1     .0           0       0       0.0
          -------------------------------------------------------------
 
Background Wait Events                   DB/Inst: CATGT/catgt  Snaps: 911-912