Question: I have an Oracle task that is
hanging. What causes Oracle to hang? How can I detect the
cause of an Oracle hang? How do I fix an Oracle hang problem?
Answer: Oracle hangs only when he is waiting
for a resource. It might be a latch (lock), or server resources
(CPU, RAM).
For more complete details on diagnosing a hung database, see the book "Advanced Oracle Utilities: The
Definitive Reference".
Other possibilities for Oracle hanging include:
-
Pathing hang - Make sure that
your Oracle environment variables are properly set ($ORACLE_HOME).
-
External hang - The network
being down, Kerberos security issues, SSO or a firewall issue can
cause an Oracle connection to hang.
One way to test this is to set
sqlnet.authentication_services=(none) in your sqlnet.ora file
and retry connecting.
-
Listener down/hanging - Start
by checking the listener (check lsnrctl stat).
Also, see my notes on diagnosing Oracle network connectivity
issues.
-
RAM shortage hang - Over
allocation of server resources, usually RAM, whereby there is not
enough RAM to spawn another connection to Oracle.
-
Data hanging - It is not
uncommon for an end-user session to “hang” when they are trying to
grab a shared data resource that is held by another end-user.
Debugging a
hung database
Back in Oracle 10g a hung database was a real
problem, especially if the DBA could not connect via SQL*Plus to release
the source of the hanging.
In these cases, the DBA had few options other than to force-down the
instance and warmstart it.
There is a new feature in Oracle 11g SQL*Plus
called the “prelim” option. This option is very useful for running
oradebug and other utilities that do not require a real connection to
the database.
root> sqlplus –prelim
SQL>
or
SQL>
set _prelim on
SQL> connect / as sysdba
At this point you are free to run the oradebug
commands to diagnose a hung database issue using the new
hanganalyze utility:
-
SQL> oradebug hanganalyze 3
-
Wait at least 2 minutes to give time to
identify process state changes.
-
SQL> oradebug hanganalyze 3
-
Open a separate SQL session and immediately
generate a system state dump.
-
SQL> alter session set events 'immediate trace
name SYSTEMSTATE level 10';
You can also gat a trace
dump to identify hung session details in Oracle Real Application
Clusters (RAC) as follows:
-
SQL> oradebug setmypid
-
SQL> oradebug setinst all
-
SQL> oradebug -g def hanganalyze 3
-
... Wait at least 2 minutes to give time to
identify process state changes.
-
SQL> oradebug -g def hanganalyze 3
There are several other ways to find the root cause
of an Oracle hanging issue:
-
Automatic Session History -
Check ASH during the hang (if you have purchased the license):
-
Report - Run a couple of
STATSPACK or AWR snapshot during the hang and get an
elapsed time report.
-
Check for locking - Whenever
Oracle is has a session hangs waiting on a resource, this
information can be found in the v$session view in the row_wait_file#
and row_wait_block#.
Column host format a6;
Column username format a10;
Column os_user format a8;
Column program format a30;
Column tsname format a12;
select
b.machine host,
b.username username,
b.server,
b.osuser os_user,
b.program program,
a.tablespace_name ts_name,
row_wait_file# file_nbr,
row_wait_block# block_nbr,
c.owner,
c.segment_name,
c.segment_type
from
dba_data_files a,
v$session
b,
dba_extents
c
see code depot for full scripts
where
b.row_wait_file# = a.file_id
and
c.file_id = row_wait_file#
and
row_wait_block# between c.block_id and c.block_id + c.blocks - 1
and
row_wait_file# <> 0
and
type='USER';
See these related notes on fixing Oracle hang issues:
-
-
-
-
-