Find the Current I/O Session Bandits
To see which users are impacting the system in undesirable ways, the
first thing to check is the connected sessions, especially if there
are current complaints of poor performance. In this case, there are
a few different avenues that can be taken.
Getting an idea of the percentage that each session has taken up
with respect to I/O is one of the first steps.
If any session consumes 50% or more of the total I/O, that session
and its SQL should be investigated further to determine the
activities in which it is engaged. If the DBA is just concerned
with physical I/O, the
physpctio.sql query
will provide the information needed:
select
sid,
username,
round(100 * total_user_io/total_io,2) tot_io_pct
from
(select
b.sid sid,
nvl(b.username,p.name) username,
sum(value) total_user_io
from
sys.v_$statname c,
sys.v_$sesstat a,
sys.v_$session b,
sys.v_$bgprocess p
where
a.statistic#=c.statistic# and
p.paddr (+) = b.paddr and
b.sid=a.sid and
c.name in ('physical reads',
'physical writes',
'physical writes direct',
'physical reads direct',
'physical writes direct (lob)',
'physical reads direct (lob)')
group by
b.sid, nvl(b.username,p.name)),
(select
sum(value) total_io
from
sys.v_$statname c,
sys.v_$sesstat a
where
SEE CODE DEPOT FOR FULL SCRIPTS
order by
3 desc;
If the DBA wants to see the total I/O picture, the
totpctio.sql query should be used instead:
-- *************************************************
-- Copyright © 2005 by Rampant TechPress
-- This script is free for non-commercial purposes
-- with no warranties. Use at your own risk.
--
-- To license this script for a commercial purpose,
-- contact info@rampant.cc
-- *************************************************
SELECT
SID,
USERNAME,
ROUND(100 * TOTAL_USER_IO/TOTAL_IO,2) TOT_IO_PCT
FROM
(SELECT
b.SID SID,
nvl(b.USERNAME,p.NAME) USERNAME,
SUM(VALUE) TOTAL_USER_IO
FROM
sys.V_$STATNAME c,
sys.V_$SESSTAT a,
sys.V_$SESSION b,
sys.v_$bgprocess p
WHERE
a.STATISTIC#=c.STATISTIC# and
p.paddr (+) = b.paddr and
b.SID=a.SID and
c.NAME in ('physical reads','physical writes',
'consistent changes','consistent gets',
'db block gets','db block changes',
'physical writes direct',
'physical reads direct',
'physical writes direct (lob)',
'physical reads direct (lob)')
GROUP BY
b.SID, nvl(b.USERNAME,p.name)),
(select
sum(value) TOTAL_IO
from
sys.V_$STATNAME c,
sys.V_$SESSTAT a
WHERE
a.STATISTIC#=c.STATISTIC# and
SEE CODE DEPOT FOR FULL SCRIPTS
ORDER BY
3 DESC;
The output might resemble the following, regardless of which query
is used:
SID USERNAME TOT_IO_PCT
--------------------------------
9 USR1 71.26
20 SYS 15.76
5 SMON 7.11
2 DBWR 4.28
12 SYS 1.42
6 RECO .12
7 SNP0 .01
10 SNP3 .01
11 SNP4 .01
8 SNP1 .01
1 PMON 0
3 ARCH 0
4 LGWR 0
Following the above example, a DBA would indeed be wise to study the
USR1 session to see what SQL calls were made. The above queries are
excellent resources that can be used to quickly pinpoint problem I/O
sessions.
To see all the actual I/O numbers, the rather large
topiousers.sql query can be used if the goal is to
gather more detail with respect to the top I/O session in a
database:
-- *************************************************
-- Copyright © 2005 by Rampant TechPress
-- This script is free for non-commercial purposes
-- with no warranties. Use at your own risk.
--
-- To license this script for a commercial purpose,
-- contact info@rampant.cc
-- *************************************************
select
b.sid sid,
decode (b.username,null,e.name,b.username)
user_name,
d.spid os_id,
b.machine machine_name,
to_char(logon_time,'mm/dd/yy hh:mi:ss pm')
logon_time,
(sum(decode(c.name,'physical reads',value,0))
+
sum(decode(c.name,'physical writes',value,0))
+
sum(decode(c.name,
'physical writes direct',value,0)) +
sum(decode(c.name,
'physical writes direct (lob)',value,0)) +
sum(decode(c.name,
'physical reads direct (lob)',value,0)) +
sum(decode(c.name,
'physical reads direct',value,0)))
total_physical_io,
(sum(decode(c.name,'db block gets',value,0))
+
sum(decode(c.name,
'db block changes',value,0)) +
sum(decode(c.name,'consistent changes',value,0)) +
sum(decode(c.name,'consistent gets',value,0)) )
total_logical_io,
100 – 100 *(round ((sum (decode
(c.name, 'physical reads', value, 0)) –
sum (decode (c.name,
'physical reads direct', value, 0))) /
(sum (decode (c.name, 'db block gets',
value, 1)) +
sum (decode (c.name, 'consistent gets',
value, 0))),3)) hit_ratio,
sum(decode(c.name,'sorts (disk)',value,0))
disk_sorts,
sum(decode(c.name,'sorts (memory)',value,0))
memory_sorts,
sum(decode(c.name,'sorts (rows)',value,0))
rows_sorted,
sum(decode(c.name,'user commits',value,0))
commits,
sum(decode(c.name,'user rollbacks',value,0))
rollbacks,
sum(decode(c.name,'execute count',value,0))
executions,
sum(decode(c.name,'physical reads',value,0))
physical_reads,
sum(decode(c.name,'db block gets',value,0))
db_block_gets,
sum(decode(c.name,'consistent gets',value,0))
consistent_gets,
sum(decode(c.name,'consistent changes',value,0))
consistent_changes
from
sys.v_$sesstat a,
sys.v_$session b,
sys.v_$statname c,
sys.v_$process d,
sys.v_$bgprocess e
where
a.statistic#=c.statistic#
and
b.sid=a.sid
and
d.addr = b.paddr
and
e.paddr (+) = b.paddr
and
c.name in
('physical reads',
'physical writes',
'physical writes direct',
'physical reads direct',
SEE CODE DEPOT FOR FULL SCRIPTS
)
group by
b.sid,
d.spid,
decode (b.username,null,e.name,b.username),
b.machine,
to_char(logon_time,'mm/dd/yy hh:mi:ss pm')
order by
6 desc;
Output from the query above could look like the following:
A query such as this reveals details about the actual raw I/O
numbers for each connected session. Armed with this information, it
is then possible to drill down into each heavy-hitting I/O session
to evaluate what SQL calls are made and which sets of SQL are the
I/O hogs.
Even though troubleshooting I/O from a user standpoint has been
explained, one should not forget about all the system activity
caused by Oracle itself.
A cursory, global check of the system level
wait events should be performed to get an idea of the I/O
bottlenecks that may be occurring. A script like the
syswaits.sql script can be used to perform such a
check:
-- *************************************************
-- Copyright © 2005 by Rampant TechPress
-- This script is free for non-commercial purposes
-- with no warranties. Use at your own risk.
--
-- To license this script for a commercial purpose,
-- contact info@rampant.cc
-- *************************************************
select
event,
total_waits,
round(100 * (total_waits / sum_waits),2) pct_tot_waits,
time_wait_sec,
round(100 * (time_wait_sec / sum_secs),2) pct_secs_waits,
total_timeouts,
avg_wait_sec
from
(select
event,
total_waits,
round((time_waited / 100),2) time_wait_sec,
total_timeouts,
round((average_wait / 100),2) avg_wait_sec
from
sys.v_$system_event
where
event not in
('lock element cleanup ',
'pmon timer ',
'rdbms ipc message ',
'smon timer ',
'SQL*Net message from client ',
'SQL*Net break/reset to client ',
'SQL*Net message to client ',
'SQL*Net more data to client ',
'dispatcher timer ',
'Null event ',
'parallel query dequeue wait ',
'parallel query idle wait - Slaves ',
'pipe get ',
'PL/SQL lock timer ',
'slave wait ',
'virtual circuit status ',
'WMON goes to sleep') and
event not like 'DFS%' and
event not like 'KXFX%'),
(select
sum(total_waits) sum_waits,
sum(round((time_waited / 100),2)) sum_secs
from
sys.v_$system_event
where
event not in
('lock element cleanup ',
'pmon timer ',
'rdbms ipc message ',
SEE CODE DEPOT FOR FULL SCRIPTS
event not like 'KXFX%')
order by
2 desc;
The script queries the
sys.v_$system_event view and here are a few quick things to note about the output from
the waits SQL script:
§
Numerous waits for the
db
file scattered read event
may indicate a problem with table scans.
§
Many waits for the latch free event could indicate
excessive amounts of logical I/O activity.
§
High wait times for the enqueue event pinpoints a
problem with lock contention.
Once the DBA has a feel for the I/O numbers at a global level, it is
possible to begin working further down into what is really going on
below the surface.