Oracle Database Tips by Burleson Consulting

February 4,  2015

Question:  I have an Oracle task that is hanging.  What causes Oracle to hang?  How can I detect the cause of an Oracle hang?  How do I fix an Oracle hang problem?

Answer:  Oracle hangs only when it is waiting for a resource.  The resource might be a latch or lock, or a server resource that is in short supply (CPU, RAM).

For more complete details on diagnosing a hung database, see the book "Advanced Oracle Utilities: The Definitive Reference".

Other possibilities for Oracle hanging include:

  • Pathing hang - Make sure that your Oracle environment variables are properly set ($ORACLE_HOME).

  • External hang - The network being down, Kerberos security issues, SSO or a firewall issue can cause an Oracle connection to hang.  One way to test this is to set sqlnet.authentication_services=(none) in your sqlnet.ora file and retry connecting.

  • Listener down/hanging - Start by checking the listener (run lsnrctl stat).  Also, see my notes on diagnosing Oracle network connectivity issues.

  • RAM shortage hang - Over-allocation of server resources, usually RAM, so that there is not enough RAM to spawn another connection to Oracle.

  • Data hanging - It is not uncommon for an end-user session to "hang" when they are trying to grab a shared data resource that is held by another end-user.

  • CPU exhausted - When the vmstat runqueue (r) column value exceeds cpu_count, you are CPU constrained.

  • Exceeding the "processes" parameter - When the number of processes reaches the value of the "processes" initialization parameter, the instance will appear to hang and will not accept new connections.  Issue the SQL*Plus command "show parameter process" to see the current setting.
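
As a rough illustration of the "processes" check in the last bullet, the sketch below compares a current process count against the parameter value.  The function name processes_headroom and the 90% default threshold are assumptions made for this example, not part of any Oracle utility.

```shell
# Sketch: report how close an instance is to its "processes" limit.
# processes_headroom is a hypothetical helper, not an Oracle utility.
# $1 = current process count, $2 = value of the "processes" parameter,
# $3 = optional warning threshold in percent (default 90, an arbitrary choice).
processes_headroom() {
    current=$1
    limit=$2
    threshold=${3:-90}
    if [ "$limit" -le 0 ]; then
        echo "ERROR: limit must be a positive integer" >&2
        return 1
    fi
    # Integer percentage of the limit that is already in use.
    pct=$(( current * 100 / limit ))
    if [ "$pct" -ge "$threshold" ]; then
        echo "WARNING: $current of $limit processes in use (${pct}%)"
    else
        echo "OK: $current of $limit processes in use (${pct}%)"
    fi
}
```

The two numbers have to come from the instance itself, for example from the current_utilization and limit_value columns of v$resource_limit for the 'processes' resource, or from a count of v$process together with "show parameter process".  A call such as processes_headroom 145 150 would then print a warning line.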

Bouncing the database is not the answer; you need to know "why" the instance is locking up!

You may be locked up because you are hitting a system limit.  Also, make sure to check your SQL*Net log and the alert log for the database.
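
One possible way to start on the alert log is to count the ORA- error codes it contains.  The helper below is only a sketch: count_ora_errors is a made-up name, the log path is passed in because its location varies by installation, and grep -o is assumed to be available (it is on GNU and BSD grep).

```shell
# Sketch: summarise ORA- errors found in a log file.
# count_ora_errors is a hypothetical helper; $1 is the path to the alert log
# (the location varies by installation, so it is passed in explicitly).
count_ora_errors() {
    # Pull out each ORA-nnnnn code, count occurrences per code,
    # and list the most frequent code first as "code count".
    grep -o 'ORA-[0-9]\{5\}' "$1" | sort | uniq -c | sort -rn |
        awk '{ print $2, $1 }'
}
```

A high count for ORA-00020 (maximum number of processes exceeded), for instance, would point back at the "processes" limit described above.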

Check for CPU shortages

Check vmstat and see whether the runqueue (r) column exceeds the cpu_count.
The only metric that identifies a CPU bottleneck is a run queue (r value) that exceeds the number of CPUs on the server.  See high_cpu
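
The run queue test can be sketched as a small filter over vmstat output.  The name runqueue_check is hypothetical, and the sketch assumes that the run queue is the column headed "r", as it is on common vmstat implementations.

```shell
# Sketch: flag vmstat samples whose run queue exceeds the CPU count.
# runqueue_check is a hypothetical helper. It reads vmstat output on stdin
# and takes the number of CPUs (for example the cpu_count value) as $1.
runqueue_check() {
    awk -v cpus="$1" '
        # Header rows: find which column is labelled "r".
        col == 0 {
            for (i = 1; i <= NF; i++) if ($i == "r") col = i
            next
        }
        # Data rows: a numeric value in the run-queue column.
        $col ~ /^[0-9]+$/ {
            samples++
            if (($col + 0) > (cpus + 0)) over++
        }
        END {
            printf "%d of %d samples exceed %d CPUs\n", over, samples, cpus
        }
    '
}
```

Typical use would be something like vmstat 5 12 | runqueue_check 8 on a server with eight CPUs.  Note that on some platforms the first vmstat sample is an average since boot, so a sustained pattern over several samples is more telling than a single line.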

Check for RAM shortages

Use the vmstat output and look at the scan rate (sr) and page-in (pi) columns.  To aid in measuring real page-ins, the vmstat utility on UNIX platforms that report it provides the scan rate (sr) column, which shows the memory page scan rate (Linux vmstat shows swap-in (si) and swap-out (so) columns instead).  A steadily rising scan rate indicates that the page-stealing daemon has crossed its first threshold and that entire RAM regions of a program are being paged out to the swap disk.  This behavior can then be correlated with the vmstat page-in (pi) metric.  See tuning server RAM
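
A minimal sketch of that correlation follows.  The name paging_check is hypothetical, and the filter assumes a vmstat that prints columns headed "sr" and "pi" (as on Solaris, AIX and HP-UX); it reports an error on output without them, such as Linux vmstat.

```shell
# Sketch: summarise the scan rate (sr) and page-in (pi) columns of vmstat.
# paging_check is a hypothetical helper that reads vmstat output on stdin.
paging_check() {
    awk '
        # Header rows: locate the sr and pi columns by name.
        sr == 0 || pi == 0 {
            for (i = 1; i <= NF; i++) {
                if ($i == "sr") sr = i
                if ($i == "pi") pi = i
            }
            next
        }
        # Data rows: note whether sr rose since the last sample
        # and whether any page-ins were recorded.
        $sr ~ /^[0-9]+$/ {
            n++
            if (n > 1 && $sr + 0 > prev) rises++
            if ($pi + 0 > 0) pageins++
            prev = $sr + 0
        }
        END {
            if (sr == 0 || pi == 0) { print "no sr/pi columns found"; exit 1 }
            printf "sr rose in %d of %d intervals; pi non-zero in %d of %d samples\n",
                   rises, (n > 1 ? n - 1 : 0), pageins, n
        }
    '
}
```

Running vmstat 5 12 | paging_check during the hang gives a one-line summary.  A scan rate that rises in most intervals while page-ins are also non-zero is the pattern described above.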

Now, let's look at how to debug a hung database.

Debugging a hung database

Back in Oracle 10g a hung database was a real problem, especially if the DBA could not connect via SQL*Plus to release the source of the hanging.  In these cases, the DBA had few options other than to force-down the instance and warmstart it.

There is a new feature in Oracle 11g SQL*Plus called the 'prelim' option. This option is very useful for running oradebug and other utilities that do not require a real connection to the database.

root> sqlplus -prelim



SQL> set _prelim on

SQL> connect / as sysdba

At this point you are free to run the oradebug commands to diagnose a hung database issue using the new hanganalyze utility:

  1. SQL> oradebug hanganalyze 3
  2. Wait at least 2 minutes to give time to identify process state changes.
  3. SQL> oradebug hanganalyze 3
  4. Open a separate SQL session and immediately generate a system state dump.
  5. SQL> alter session set events 'immediate trace name SYSTEMSTATE level 10';
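
The two hanganalyze passes in steps 1 to 3 could be scripted roughly as follows.  This is only a sketch: hanganalyze_script is a hypothetical helper that just prints the commands, it uses the SQL*Plus HOST command to run sleep for the wait, and whether a preliminary connection accepts commands on standard input in your environment is an assumption to verify before relying on it.

```shell
# Sketch: print the SQL*Plus commands for two hanganalyze passes.
# hanganalyze_script is a hypothetical helper; it only generates text.
# $1 = hanganalyze level (default 3), $2 = seconds to wait between passes
# (default 120, matching the two-minute wait in the steps above).
hanganalyze_script() {
    level=${1:-3}
    wait_secs=${2:-120}
    cat <<EOF
oradebug hanganalyze $level
host sleep $wait_secs
oradebug hanganalyze $level
exit
EOF
}
```

The output could then be reviewed and fed to a session, for example hanganalyze_script 3 120 | sqlplus -prelim / as sysdba.  The system state dump in steps 4 and 5 is left out because it belongs in a separate session.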

You can also get a trace dump to identify hung session details in Oracle Real Application Clusters (RAC) as follows:

  1. SQL> oradebug setmypid
  2. SQL> oradebug setinst all
  3. SQL> oradebug -g def hanganalyze 3
  4. ... Wait at least 2 minutes to give time to identify process state changes.
  5. SQL> oradebug -g def hanganalyze 3

There are several other ways to find the root cause of an Oracle hanging issue:

  • Active Session History - Check ASH during the hang (if you have purchased the license).

  • Report - Take a couple of STATSPACK or AWR snapshots during the hang and get an elapsed time report.

  • Check for locking - Whenever a session hangs waiting on a resource, this information can be found in the v$session view in the row_wait_file# and row_wait_block# columns.

-- abridged listing; see code depot for full scripts
column host format a6
column username format a10
column os_user format a8
column program format a30
column ts_name format a12

select
   b.machine host,
   b.username username,
   b.osuser os_user,
   b.program program,
   a.tablespace_name ts_name,
   b.row_wait_file# file_nbr,
   b.row_wait_block# block_nbr
from
   dba_data_files a,
   v$session      b,
   dba_extents    c
where
   b.row_wait_file# = a.file_id
and
   c.file_id = b.row_wait_file#
and
   b.row_wait_block# between c.block_id and c.block_id + c.blocks - 1
and
   b.row_wait_file# <> 0;

See these related notes on fixing Oracle hang issues:

  1. SQL hanging on TEMP space usage

  2. Fix hanging SHUTDOWN: waiting for active calls to complete

  3. SQL*Plus hangs

  4. Export "hangs" while exporting definitions

  5. Oracle hangs and you cannot connect





