Home
E-mail Us
Oracle Articles
New Oracle Articles

Oracle Training
Oracle Tips
Oracle Forum
Class Catalog

Remote DBA
Oracle Tuning
Emergency 911
RAC Support
Apps Support
Analysis
Design
Implementation
Oracle Support

SQL Tuning
Security
Oracle UNIX
Oracle Linux
Monitoring
Remote support
Remote plans
Remote services
Application Server
Applications
Oracle Forms
Oracle Portal
App Upgrades
SQL Server
Oracle Concepts
Software Support
Remote Support
Development
Implementation

Consulting Staff
Consulting Prices
Help Wanted!

Oracle Posters
Oracle Books
Oracle Scripts
Ion
Excel-DB

Don Burleson Blog

The mysteries of Oracle buffer block management

by Donald Burleson

The internal machinations of Oracle buffer cache management are not published and many Oracle professionals infer their behavior from experimentation and tidbits from the Oracle documentation. For a complete description of the v$ event views, get the "free 11g poster" by UNISYS.

The v$bh view of x$bh has the all-important "status" column that indicates the lock mode for each data block in the data buffer. You can use v$bh to display the data buffer contents, and here is a great script for this. For non-OPS and non-RAC databases, you will see these common lock modes:

xcur - This is a RAM block that has been acquired in exclusive current mode. According the Oracle documentation, if a buffer state is exclusive current (XCUR), an instance owns the resource in exclusive mode.
cr - This mode indicates a "cloned" RAM block (a "stale" block), that was once in xcur mode. The instance has shared access to the block and can only perform reads. The cr state means the owning instance can perform a consistent read of the block, if the instance holds an older version of the data.
free - This is an "available" RAM block. It might contain data, but it is not currently in-use by Oracle.
read - The buffer is reserved for a block that is currently being read from disk.

During database recovery we might also see these values for v$bh status column:

mrec - Indicates a block in media recovery mode
irec - This is a block in instance (crash) recovery mode

For RAC database we also see this v$bh.status values:

scur - a current mode block, shared with other instances

A normal database will have filled all the buffer blocks in just a few minutes, and the database buffer will normally remain full until shutdown time. These "free" RAM blocks still contain data. They are just marked as free because they are eligible to be overlaid by a new incoming data block.

Multi-versioning of blocks in the buffer

The Oracle row-level locking mechanism is well understood by Oracle experts, but here is some question about how row-level locking translated into data buffer block status changes.

In a MOSC forum discussion titled "Multiple buffer versions flooding the buffer cache", we see a DBA on Oracle 8.1.6.2 who claims to have 50 versions of a single data block in his data buffer. This query was used to identify multiple versions of a data block within the buffer cache:

SELECT
   dbarfil,
   dbablk,
   class,
   count(*)
FROM
   x$bh
GROUP BY
   dbarfil,
   dbablk,
   class
HAVING
   count(*) > 40;

And also a query to identify the specific segments that are experiencing multiple data buffer versions:

SELECT
   segment_name,
   segment_type,
   owner,
   tablespace_name
FROM
   sys.dba_extents
WHERE
   file_id = <DBAFIL>
AND
   <DBABLK> BETWEEN block_id and block_id + blocks -1;

"If this is a table or index block it is advisable to check the application to see why the block is being frequently changed as this may represent a point of contention."

"In general, you would see a number of CR blocks in the buffer cache proportional to the number of updates on that block."

In this case we see an argument for re-setting the pctfree object parameter when we have a table (or index) with many "hot update" rows. Re-setting pctfree to a high-value will spread the data rows across more data blocks, thereby relieving the possibility of having dozens of versions of the data block in the data buffer.

For example, if you have avg_row_len = 80 and 2k data blocks, setting pctfree=90 would remove the data block from the freelist (marking it as logically full) after only three rows are inserted. This approach wastes disk space but reduced the multiple versions of "cr" RAM blocks. This high pctfree technique is also used to reduce row fragmentation in tables with VARCHAR data types where a row is initially stored small with huge row expansion from subsequent UPDATE statements.

Mark Bobak, a member of the respected Oaktable group, professes some knowledge of buffer management internals, and published this excellent speculation on the behavior of buffer block status:

I'm fairly certain, but not 100% sure, that Oracle creates a CR block from a XCUR (or SCUR? I'm not a RAC expert), by cloning the XCUR block, and then referring to rollback, rolls back the block till it's sufficiently old, to be read consistent to the point in time required.

So, if a query's snapshot SCN is N, but a particular DBA (data block address) is only available in current mode, and the SCN is N+100, then Oracle will clone it, and mark it as CR, and start rolling back, using the information it the block's ITL slot(s) to backtrack where the various before images are.

Eventually, one of two things will happen, either the block will be sufficiently old (with an SCN of N or less) or, Oracle will not be able to find the required information in the rollback segment, which would result in ORA-1555.

And yes, I believe the XCUR block is kept in the buffer cache after a transaction completes. (Again, there may be some gotchas in the RAC case, I'm not sure about that case.)

Consider also, that a block in XCUR mode can be flushed to disk, even if there is a transaction open on the block. Oracle doesn't care. If you think about how the locking model works, and how read consistency works, it really doesn't matter. If a block with one or more row level locks is flushed to disk, and then the transaction commits, what happens?

Well, Oracle certainly won't re-read the block just to update the row lock and ITL slot. So, what does it do? It marks the transaction as committed in the rollback segment slot header, and moves on. If another process comes along and reads or updates that block, delayed block cleanout will kick in to clean up the mess that was left behind.

There is also a question about buffer multi-versioning when using Oracle 10g Automatic Storage Memory Management (ASMM). Here is an actual example from an ASMM 10g database showing only one free block in the data buffer:

STATUS NUMBER_BUFFERS

------- --------------

cr 616

free 1

xcur 14790

Top 5 Timed Events

~~~~~~~~~~~~~~~~~~ % Total

Event Waits Time (s) DB Time Wait Class

------------------------------ ------------ ----------- --------- --------------

log file parallel write 9,670 291 55.67 System I/O

log file sync 9,293 278 53.12 Commit

CPU time 225 43.12

db file parallel write 4,922 201 38.53 System I/O

control file parallel write 1,282 65 12.42 System I/O

Here is a sample AWR report from the system as BC first found it.

Note that all xcur and cr blocks in the RAM data buffers can be de-allocated (with the alter system flush buffer_pool command) to make their status "free" and they can then be reassigned to other SGA regions. Here is a simple script to display counts of each v$bh status:

column c1 heading 'Status' format a10

column c2 heading 'Number|of Data|Buffers' format 999,999,999

select

status c1,

count(1) c2

from

v$bh

group by

status

order by

count(1) desc;

Here is the output from a high-DML database.

Number

of Data

Status Buffers

---------- ------------

xcur 311,967

free 270,731

cr 17,302

The number of blocks in v$bh where status="free" is critical to high-update databases because there must be enough buffers to hold all data blocks that are being updated (in xcur status). The Oracle 10g Automatic Storage Memory Management (ASMM) feature (the default on Oracle10g) should anticipate high updates and allocate additional data buffers during high update periods. For example, here is an actual output from an Oracle10g database where it appears that ASMM is not allocating enough free blocks to accommodate concurrent updates:

STATUS NUMBER_BUFFERS

------- --------------

cr 616

free 1

xcur 14790

For more on ASMM, click here.

An Online Demonstration

This example below shows how Oracle marks un-locked block buffers as "free" and how non-free buffer blocks (where status <> 0) and we see the cur and xcur blocks being flushed from the buffer cache when we issue the alter system flush buffer_cache command. In this example there are 50,000 rows in the POLICYREC table.

SQL> update POLICYREC set sum_assured = sum_assured + 15;

50000 rows updated.

SQL> commit;

Commit complete.

SQL> select o.OBJECT_TYPE, substr(o.OBJECT_NAME,1,10)
objname , b.objd , b.status, count(b.objd) from v$bh b,
dba_objects o where b.objd = o.data_object_id and
o.owner = 'NYUSER' group by o.object_type,
o.object_name,b.objd, b.status ;

OBJECT_TYPE OBJNAME OBJD STATUS COUNT(B.OBJD)

----------- ------------- ------ ------ -------

TABLE TEST1 43058 free 6

TABLE POLICYREC 43061 cr 47

TABLE POLICYREC 43061 free 238

TABLE POLICYREC 43061 xcur 376

SQL> alter system flush buffer_cache;

System altered.

SQL> select o.OBJECT_TYPE, substr(o.OBJECT_NAME,1,10)

objname , b.objd , b.status, count(b.objd) from v$bh b,

dba_objects o where b.objd = o.data_object_id and

o.owner = 'NYUSER' group by o.object_type,