GC block lost wait event tips

RAC tuning tips

October 4, 2015

GC Block Lost Wait Event

No network is perfect. Data transmitted from point A to point B may occasionally get lost, and the same is true for global cache block transfers along the Cluster Interconnect. If a requested block is not received by the instance within 0.5 seconds, the block is considered lost. Because most block transfers complete in milliseconds, too many lost global cache block transfers can hamper application performance: each lost block must be re-sent, and the session waits for that second transfer to complete.

Lost global cache block transfers can be seen in two different areas. The wait events gc cr block lost and gc current block lost are raised when a consistent read block transfer or a current block transfer, respectively, is lost and the session must wait for the block to be re-sent. The other area is the Oracle statistic gc blocks lost, which can be seen at the system or session level. Examples of these two metrics are seen below.

< gc_blocks_lost.sql

select
   inst_id,
   event,
   total_waits,
   time_waited
from
   gv$system_event
where
   event in ('gc current block lost',
             'gc cr block lost')
order by
   event,
   inst_id;

   INST_ID EVENT                          TOTAL_WAITS TIME_WAITED
---------- ------------------------------ ----------- -----------
         1 gc cr block lost                        50        3029
         2 gc cr block lost                        75        4516
         1 gc current block lost                   26        1467
         2 gc current block lost                   36        2060

select
   sn.inst_id,
   sn.name,
   ss.value
from
   gv$statname sn,
   gv$sysstat ss
where
   sn.inst_id = ss.inst_id
and
   sn.statistic# = ss.statistic#
and
   sn.name = 'gc blocks lost'
order by
   sn.inst_id;

   INST_ID NAME                      VALUE
---------- -------------------- ----------
         1 gc blocks lost               90
         2 gc blocks lost              164

The output above shows the metrics on a per-instance basis. One can certainly summarize the values across all instances if desired.
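As a rough illustration of summarizing across instances, the sketch below adds up the per-instance wait event rows shown above and derives an average wait per event. It is written in Python purely for illustration; the function name summarize is hypothetical, and the conversion assumes TIME_WAITED is reported in centiseconds.

```python
from collections import defaultdict

# Rows copied from the gv$system_event output above:
# (inst_id, event, total_waits, time_waited)
rows = [
    (1, "gc cr block lost", 50, 3029),
    (2, "gc cr block lost", 75, 4516),
    (1, "gc current block lost", 26, 1467),
    (2, "gc current block lost", 36, 2060),
]


def summarize(rows):
    """Sum waits and time per event across instances; add the average wait in ms."""
    totals = defaultdict(lambda: [0, 0])
    for _inst_id, event, waits, time_cs in rows:
        totals[event][0] += waits
        totals[event][1] += time_cs
    # Assumption: TIME_WAITED is in centiseconds, so multiply by 10 for milliseconds.
    return {
        event: {
            "waits": w,
            "time_cs": t,
            "avg_ms": round(t * 10 / w, 1) if w else 0.0,
        }
        for event, (w, t) in totals.items()
    }


cluster_wide = summarize(rows)
for event, s in cluster_wide.items():
    print(f"{event:<24}{s['waits']:>6}{s['time_cs']:>8}{s['avg_ms']:>8}")
```

Under that unit assumption, both events average somewhat more than 500 ms per wait, which is in line with the 0.5 second threshold after which a block is considered lost.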

The presence of lost blocks in wait events or a system statistic is not, by itself, cause for great concern. Just like any network, there may be an occasional hiccup that leads to lost block transfers and appears in the gv$sysstat view. As with any wait event, the metric by itself is essentially meaningless because the output above provides no context. Is the wait event a 'top 5' wait event? Were the wait events generated over a 1-hour time period or 1 month? Since we do not know the answers to these questions, we cannot determine whether the metrics indicate a problem. More information is needed. An AWR report from a 1-hour snapshot of time can be more indicative that a real problem exists.
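One way to give the raw counter some context is to sample it twice and look at the difference, since the values in gv$sysstat accumulate from instance startup. The following is a minimal Python sketch of that idea; the function name lost_per_hour and the sample values are hypothetical and not taken from a real system.

```python
def lost_per_hour(first_value, second_value, seconds_between):
    """Rate of lost blocks per hour between two samples of a cumulative counter."""
    if seconds_between <= 0:
        raise ValueError("samples must be taken at different times")
    if second_value < first_value:
        # The counter went backwards, so the instance was probably restarted
        # between the two samples and the delta is not meaningful.
        return None
    return (second_value - first_value) * 3600 / seconds_between


# Hypothetical samples of 'gc blocks lost' for one instance, 15 minutes apart.
rate = lost_per_hour(first_value=90, second_value=96, seconds_between=900)
print(rate)
```

A rate computed this way can be compared between quiet and busy periods, which a single cumulative value cannot show.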

Top 5 Timed Foreground Events
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                                       Avg
                                                      wait   % DB
Event                             Waits     Time(s)   (ms)   time Wait Class
-------------------------- ------------ ----------- ------ ------ ----------
DB CPU                                        6,975          32.1
db file sequential read       3,831,277       5,809      2   26.8 User I/O
gc current block lost             3,819         942    247    4.3 Cluster
db file parallel read           145,588         854      6    3.9 User I/O
gc cr multi block request       535,685         498      1    2.3 Cluster

Above, the gc current block lost wait event is in the Top 5 list, which now gives the wait event some context. Among the wait events listed (DB CPU is not a wait), it contributes the second-longest total wait time for the instance during the one-hour period. However, even if the wait event were eliminated entirely, only 4.3% of the total database time would be recovered. From a performance tuning perspective, where the end goal is often to reduce processing time, it would be better to focus on the db file sequential read wait event, which contributes 26.8% of the total database time, or to determine whether CPU utilization can be decreased, as DB CPU accounts for 32.1% of the total. That being said, it is never a good sign when lost global cache blocks appear among the top wait events.
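The arithmetic behind that reading of the report can be checked with a few lines of Python. This is only a sketch that uses the figures printed in the listing above; the variable names are illustrative.

```python
# Figures copied from the "Top 5 Timed Foreground Events" listing above.
waits = 3819         # gc current block lost: number of waits
time_s = 942         # gc current block lost: total wait time in seconds
pct_db_time = 4.3    # gc current block lost: share of DB time in percent

# Average wait per event, in milliseconds.
avg_wait_ms = time_s / waits * 1000

# Total DB time implied by the event's time and its percentage share.
implied_db_time_s = time_s / (pct_db_time / 100)

print(f"average wait: {avg_wait_ms:.0f} ms")
print(f"implied DB time: {implied_db_time_s:.0f} s")
print(f"upper bound on time recovered: {time_s} s ({pct_db_time}% of DB time)")
```

The average works out to roughly 247 ms per wait, against 1 to 6 ms for the other wait events in the list. That gap is why only 3,819 waits are enough to place the event in the Top 5, even though removing it would recover at most 942 seconds.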

The most common reason for lost global cache blocks is a faulty private network, i.e. one that is dropping packets. If global cache lost blocks are seen as a problem, then work with the network administrator to ensure the switch is valid, cables are secure and seated properly, firmware levels are up to date, and that other network configuration issues are not a problem. The network administrator should be able to use network tools like netstat and anything else in their arsenal to check for dropped packets on the private network.

[root@host01 ~]# netstat -su
IcmpMsg:
    InType0: 91
    InType3: 723
    InType8: 23
    OutType0: 23
    OutType3: 928
    OutType8: 103
Udp:
    664034038 packets received
    983 packets to unknown port received.
    20150 packet receive errors
    654621700 packets sent
UdpLite:
IpExt:
    InMcastPkts: 18041
    OutMcastPkts: 8745
    InBcastPkts: 102377
    OutBcastPkts: 119
    InOctets: 4678332299675
    OutOctets: 2652878623355
    InMcastOctets: 1401313
    OutMcastOctets: 636504
    InBcastOctets: 19312376
    OutBcastOctets: 49090
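The UDP counters in a listing like the one above can also be pulled out programmatically, which helps when the same check is repeated on every node or at intervals. The Python sketch below parses the Udp section of text in this format. The function name parse_udp_counters is hypothetical, and the parser assumes the layout shown above, which may differ between netstat versions.

```python
import re


def parse_udp_counters(netstat_su_output):
    """Return {description: count} for the 'Udp:' section of `netstat -su` output."""
    counters = {}
    section = None
    for raw in netstat_su_output.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A section header is a single word followed by a colon, e.g. "Udp:".
        if re.fullmatch(r"\w+:", line):
            section = line[:-1]
            continue
        if section == "Udp":
            # Counter lines look like "<number> <description>", optionally
            # ending in a period.
            m = re.fullmatch(r"(\d+)\s+(.+?)\.?", line)
            if m:
                counters[m.group(2)] = int(m.group(1))
    return counters


# Udp section copied from the listing above.
sample = """\
Udp:
    664034038 packets received
    983 packets to unknown port received.
    20150 packet receive errors
    654621700 packets sent
UdpLite:
"""

udp = parse_udp_counters(sample)
errors = udp["packet receive errors"]
received = udp["packets received"]
print(f"{errors} receive errors, {errors / received:.4%} of packets received")
```

These counters accumulate from system boot and cover all UDP traffic on the host, not only the private interconnect. Comparing two samples taken some minutes apart shows whether the error count is still growing.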

The netstat utility is reporting 20150 UDP packet receive errors, a likely sign of lost global cache block transfers for this node of the cluster. In addition to verifying the hardware is correct, the network administrator should investigate the following:

Private network is truly private

Oversaturated bandwidth due to too much traffic on the network

Quality of Service (QoS) settings that may be downgrading performance

Incorrect Jumbo Frames configuration

Multiple hops between the nodes and the private network switch

Mismatched MTU settings between devices

Mismatch in duplex mode settings between devices

Incorrect bonding/teaming configuration

If everything on the network side checks out, then look to sizing the UDP settings to have larger socket sizes as discussed in the previous section of this chapter. Global cache lost blocks are not always a network issue. After the network has been verified and UDP socket sizes are correct, look to see if CPU resources are in short supply.
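As a starting point for reviewing the UDP socket sizes, the Python sketch below reads the current Linux socket buffer limits and flags any that fall below a set of minimums. The minimums in ASSUMED_MINIMUMS are placeholders chosen for illustration and should be replaced with the values recommended in the previous section or in the installation guide for the release in use. The function names are hypothetical, and the /proc/sys/net/core path applies to Linux only.

```python
from pathlib import Path

# ASSUMPTION: illustrative minimums only (bytes). Substitute the values
# recommended for your release before relying on this check.
ASSUMED_MINIMUMS = {
    "rmem_default": 262144,
    "rmem_max": 4194304,
    "wmem_default": 262144,
    "wmem_max": 1048576,
}


def read_socket_settings(base="/proc/sys/net/core"):
    """Read the current socket buffer limits (in bytes) on a Linux host."""
    settings = {}
    for name in ASSUMED_MINIMUMS:
        path = Path(base) / name
        if path.exists():
            settings[name] = int(path.read_text().strip())
    return settings


def undersized(settings, minimums=ASSUMED_MINIMUMS):
    """Return {name: (current, minimum)} for each setting below its minimum."""
    return {
        name: (settings[name], minimum)
        for name, minimum in minimums.items()
        if name in settings and settings[name] < minimum
    }


if __name__ == "__main__":
    for name, (current, minimum) in undersized(read_socket_settings()).items():
        print(f"net.core.{name} = {current}, below assumed minimum {minimum}")
```

The comparison is kept separate from the file reading so the same check can be applied to values collected from every node in the cluster.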

This is an excerpt from the landmark book Oracle RAC Performance tuning, a book that provides real world advice for resolving the most difficult RAC performance and tuning issues.


Copyright © 1996 - 2020

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
