Oracle RAC File I/O Monitoring Scripts
Oracle Database Tips by Donald Burleson

This is an excerpt from the bestselling Grid book Oracle 10g Grid & Real Application Clusters, by Mike Ault and Madhu Tumma.
An excellent example of this is the monitoring of RAC file I/O rates and file I/O timing:
rem NAME: rac_fileio.sql
rem
rem FUNCTION: Reports on the RAC file I/O status of all of the
rem FUNCTION: datafiles in the database.
rem
column sum_io1 new_value st1 noprint
column sum_io2 new_value st2 noprint
column sum_io new_value divide_by noprint
column Percent format 999.999 heading 'Percent|Of IO'
column brratio format 999.99 heading 'Block|Read|Ratio'
column bwratio format 999.99 heading 'Block|Write|Ratio'
column phyrds heading 'Physical | Reads'
column phywrts heading 'Physical | Writes'
column phyblkrd heading 'Physical|Block|Reads'
column phyblkwrt heading 'Physical|Block|Writes'
column name format a45 heading 'File|Name'
column file# format 9999 heading 'File'
column dt new_value today noprint
select to_char(sysdate,'ddmonyyyyhh24miss') dt from dual;
set feedback off verify off lines 132 pages 60 sqlbl on trims on
rem
select nvl(sum(a.phyrds+a.phywrts),0) sum_io1
from sys.gv_$filestat a;
select nvl(sum(b.phyrds+b.phywrts),0) sum_io2
from sys.gv_$tempstat b;
select &st1+&st2 sum_io from dual;
rem
@title132 'File IO Statistics Report'
spool rep_out\&db\rac_fileio&&today
select
   a.inst_id, a.file#, b.name, a.phyrds, a.phywrts,
   (100*(a.phyrds+a.phywrts)/&divide_by) Percent,
   a.phyblkrd, a.phyblkwrt,
   (a.phyblkrd/greatest(a.phyrds,1)) brratio,
   (a.phyblkwrt/greatest(a.phywrts,1)) bwratio
from
   sys.gv_$filestat a, sys.gv_$dbfile b
where
   a.inst_id=b.inst_id
   and a.file#=b.file#
union
select
   c.inst_id, c.file#, d.name, c.phyrds, c.phywrts,
   (100*(c.phyrds+c.phywrts)/&divide_by) Percent,
   c.phyblkrd, c.phyblkwrt,
   (c.phyblkrd/greatest(c.phyrds,1)) brratio,
   (c.phyblkwrt/greatest(c.phywrts,1)) bwratio
from
   sys.gv_$tempstat c, sys.gv_$tempfile d
where
   c.inst_id=d.inst_id
   and c.file#=d.file#
order by 1,2
/
spool off
pause Press enter to continue
set feedback on verify on lines 80 pages 22
clear columns
ttitle off
The output from the above script looks like the following:
Date: 01/22/04                                                                                                  Page:   1
Time: 01:38 PM                                  File IO Statistics Report                                       TSTDBMRA
                                                      tstdb database

                                                                                            Physical   Physical   Block   Block
                                                            Physical   Physical  Percent      Block      Block    Read   Write
   INST_ID  File File Name                                     Reads     Writes    Of IO      Reads     Writes   Ratio   Ratio
---------- ----- --------------------------------------------- ---------- ---------- -------- ---------- ---------- ------- -------
         2     1 /od04_01/oradata/tstdb/system01.dbf                 1731       2540     .009       2717       2540    1.57    1.00
         2     1 /od04_01/oradata/tstdb/temp01.dbf                 195386     135805     .723    1688145    1724988    8.64   12.70
         2     2 /od04_01/oradata/tstdb/undotbs01.dbf                 524        523     .002        524        523    1.00    1.00
         2     3 /od04_01/oradata/tstdb/drsys01.dbf                   524        523     .002        524        523    1.00    1.00
         2     4 /od04_01/oradata/tstdb/indx01.dbf                    524        523     .002        524        523    1.00    1.00
         2     5 /od04_01/oradata/tstdb/tools01.dbf                   524        523     .002        524        523    1.00    1.00
         2     6 /od04_01/oradata/tstdb/undotbs02.dbf                 545      23867     .053        545      23867    1.00    1.00
         2     7 /od04_01/oradata/tstdb/undotbs03.dbf                 524        523     .002        524        523    1.00    1.00
         2     8 /od04_01/oradata/tstdb/users01.dbf                   524        523     .002        524        523    1.00    1.00
         2     9 /od04_01/oradata/tstdb/xdb01.dbf                     530        523     .002        545        523    1.03    1.00
         2    10 /od04_01/oradata/tstdb/tstdb_globald01.dbf           525        523     .002        525        523    1.00    1.00
         2    11 /od04_01/oradata/tstdb/tstdb_globalx01.dbf           525        523     .002        525        523    1.00    1.00
         2    12 /od04_01/oradata/tstdb/tstdb_reportd01.dbf           524        523     .002        524        523    1.00    1.00
         2    13 /od04_01/oradata/tstdb/tstdb_reportx01.dbf           524        523     .002        524        523    1.00    1.00
         2    14 /od04_01/oradata/tstdb/nomadd01.dbf                  524        523     .002        524        523    1.00    1.00
         2    15 /od04_01/oradata/tstdb/TA1d01.dbf                    524        523     .002        524        523    1.00    1.00
         2    16 /od04_01/oradata/tstdb/TA1x01.dbf                    524        523     .002        524        523    1.00    1.00
         2    17 /od04_01/oradata/tstdb/SRCd01.dbf                 131430        523     .288    3762539        523   28.63    1.00
         2    18 /od04_01/oradata/tstdb/SRCx01.dbf                   5410        523     .013       5410        523    1.00    1.00
         2    19 /od04_01/oradata/tstdb/REEd01.dbf                    524        523     .002        524        523    1.00    1.00
         2    20 /od04_01/oradata/tstdb/REEx01.dbf                    524        523     .002        524        523    1.00    1.00
         2    21 /od04_01/oradata/tstdb/CRWd01.dbf                    524        523     .002        524        523    1.00    1.00
         2    22 /od04_01/oradata/tstdb/CRWx01.dbf                    524        523     .002        524        523    1.00    1.00
         2    23 /od04_02/oradata/tstdb/LWEd01.dbf                    519        519     .002        519        519    1.00    1.00
         2    24 /od04_02/oradata/tstdb/LWEx01.dbf                    519        519     .002        519        519    1.00    1.00
         2    25 /od04_01/oradata/tstdb/perfstat01.dbf                110        110     .000        110        110    1.00    1.00
         3     1 /od04_01/oradata/tstdb/system01.dbf              5870952      43328   12.920    5879481      43328    1.00    1.00
         3     1 /od04_01/oradata/tstdb/temp01.dbf                2459053    1219824    8.036   22005243   15402399    8.95   12.63
         3     2 /od04_01/oradata/tstdb/undotbs01.dbf               62411        601     .138      62411        601    1.00    1.00
         3     3 /od04_01/oradata/tstdb/drsys01.dbf                475816        601    1.041     475816        601    1.00    1.00
         3     4 /od04_01/oradata/tstdb/indx01.dbf                    604        601     .003        604        601    1.00    1.00
         3     5 /od04_01/oradata/tstdb/tools01.dbf                   835        643     .003       1553        643    1.86    1.00
         3     6 /od04_01/oradata/tstdb/undotbs02.dbf                 608        707     .003        608        707    1.00    1.00
         3     7 /od04_01/oradata/tstdb/undotbs03.dbf               88095     547959    1.389      88095     547959    1.00    1.00
         3     8 /od04_01/oradata/tstdb/users01.dbf                  3907       4289     .018       6098       5497    1.56    1.28
         3     9 /od04_01/oradata/tstdb/xdb01.dbf                 4370138        601    9.548    4370317        601    1.00    1.00
         3    10 /od04_01/oradata/tstdb/tstdb_globald01.dbf       1547848      29866    3.446    1941544      29866    1.25    1.00
         3    11 /od04_01/oradata/tstdb/tstdb_globalx01.dbf       4353943       6356    9.525    4354433       6357    1.00    1.00
         3    12 /od04_01/oradata/tstdb/tstdb_reportd01.dbf           604        601     .003        604        601    1.00    1.00
         3    13 /od04_01/oradata/tstdb/tstdb_reportx01.dbf           604        601     .003        604        601    1.00    1.00
         3    14 /od04_01/oradata/tstdb/nomadd01.dbf               288384        601     .631     288384        601    1.00    1.00
         3    15 /od04_01/oradata/tstdb/TA1d01.dbf                 338417        601     .741     338417        601    1.00    1.00
         3    16 /od04_01/oradata/tstdb/TA1x01.dbf                 963876        601    2.107     963876        601    1.00    1.00
         3    17 /od04_01/oradata/tstdb/SRCd01.dbf                3075710     936826    8.765    9782425     971945    3.18    1.04
         3    18 /od04_01/oradata/tstdb/SRCx01.dbf                1315213      94012    3.078    1550400     275893    1.18    2.93
         3    19 /od04_01/oradata/tstdb/REEd01.dbf                1191132        601    2.603    1191132        601    1.00    1.00
         3    20 /od04_01/oradata/tstdb/REEx01.dbf                3109339        601    6.794    3109339        601    1.00    1.00
         3    21 /od04_01/oradata/tstdb/CRWd01.dbf                    604        601     .003        604        601    1.00    1.00
         3    22 /od04_01/oradata/tstdb/CRWx01.dbf                    604        601     .003        604        601    1.00    1.00
         3    23 /od04_02/oradata/tstdb/LWEd01.dbf                7042322    3913365   23.933   88147193    4346731   12.52    1.11
         3    24 /od04_02/oradata/tstdb/LWEx01.dbf                1381676     508355    4.129    2064523    1265528    1.49    2.49
         3    25 /od04_01/oradata/tstdb/perfstat01.dbf                647       1845     .005        672       1845    1.04    1.00
The RAC I/O balance is off between instances two and three. If only instance two or only instance three had been examined, this possible I/O problem would not have been evident.
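
Before drilling into individual files, it can help to quantify the imbalance with a per-instance rollup. The following is a minimal sketch, not part of the book's script set, that sums physical reads and writes by instance from gv$filestat:

rem Hypothetical helper: total physical I/O per instance, to spot imbalance at a glance
select
   inst_id,
   sum(phyrds)          reads,
   sum(phywrts)         writes,
   sum(phyrds+phywrts)  total_io
from gv$filestat
group by inst_id
order by inst_id;

If the total_io figures differ sharply between instances, as they do for instances two and three above, the detailed per-file report is worth running.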
Another RAC-related I/O statistic is I/O timing, which shows whether there are latency problems between the nodes. The following code shows an example file I/O timing report for RAC:
rem Purpose: Calculate I/O timing values for datafiles
col inst_id format 9999999 heading 'Instance'
col name format a50 heading 'File Name'
set lines 132 pages 45
start title132 'IO Timing Analysis'
spool rep_out\&db\rac_io_time
select
   f.inst_id, f.file#, d.name, phyrds, phywrts,
   readtim/phyrds, writetim/phywrts
from
   gv$filestat f, gv$datafile d
where
   f.inst_id=d.inst_id
   and f.file#=d.file#
order by readtim/phyrds desc
/
spool off
ttitle off
clear col
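
Note that this report divides by PHYRDS and PHYWRTS directly, so a datafile with no reads or writes since instance startup would raise ORA-01476 (divisor is equal to zero). A defensive variant, borrowing the greatest() idiom from rac_fileio.sql above, might compute the timing columns like this:

select
   f.inst_id, f.file#, d.name, f.phyrds, f.phywrts,
   f.readtim/greatest(f.phyrds,1)   read_time,
   f.writetim/greatest(f.phywrts,1) write_time
from gv$filestat f, gv$datafile d
where f.inst_id=d.inst_id
and f.file#=d.file#
order by read_time desc;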
An example output from the above report is shown below.
Date: 02/02/04                                                                                     Page:   1
Time: 08:59 AM                                  IO Timing Analysis                                  PERFSTAT
                                                  tstdb database

Instance FILE# File Name                                     PHYRDS PHYWRTS READTIM/PHYRDS WRITETIM/PHYWRTS
-------- ----- -------------------------------------------- ------- ------- -------------- ----------------
       2    10 /od04_01/oradata/tstdb/tstdb_globald01.dbf       592      11           21.8                0
       1    10 /od04_01/oradata/tstdb/tstdb_globald01.dbf       632      21           20.4                0
       2    23 /od04_02/oradata/tstdb/LWEd01.dbf             100027    4023           5.94       .177479493
       1    17 /od04_01/oradata/tstdb/SRCd01.dbf              77626       6           3.61                0
       2    24 /od04_02/oradata/tstdb/LWEx01.dbf               1801     341           1.61       .263929619
       3    10 /od04_01/oradata/tstdb/tstdb_globald01.dbf    299320    6370           1.58       .195918367
       3    23 /od04_02/oradata/tstdb/LWEd01.dbf             294166   31246           1.44       1.53120399
       1    18 /od04_01/oradata/tstdb/SRCx01.dbf               1879       6           1.43                0
       3    24 /od04_02/oradata/tstdb/LWEx01.dbf             196574   35080           1.30       1.57374572
       2    17 /od04_01/oradata/tstdb/SRCd01.dbf              58099      61           1.16                0
       3     1 /od04_01/oradata/tstdb/system01.dbf           688550    2071           1.10       .125060357
       3    18 /od04_01/oradata/tstdb/SRCx01.dbf             186020       4           1.09                0
       3    17 /od04_01/oradata/tstdb/SRCd01.dbf             504230      36           1.06       1.02777778
       1    24 /od04_02/oradata/tstdb/LWEx01.dbf                  8       6           .875       .333333333
       1    11 /od04_01/oradata/tstdb/tstdb_globalx01.dbf        45      10     .755555556                0
       1    23 /od04_02/oradata/tstdb/LWEd01.dbf                 79      17     .683544304       .529411765
       3     7 /od04_01/oradata/tstdb/undotbs03.dbf              60   15243     .583333333       .460145641
       1     2 /od04_01/oradata/tstdb/undotbs01.dbf              29    2453     .551724138       .043212393
       2     6 /od04_01/oradata/tstdb/undotbs02.dbf              33    2501     .515151515       .019992003
       2    11 /od04_01/oradata/tstdb/tstdb_globalx01.dbf        65      12     .461538462                0
       1     7 /od04_01/oradata/tstdb/undotbs03.dbf               7       6     .428571429                0
       1    16 /od04_01/oradata/tstdb/TA1x01.dbf                  7       6     .428571429                0
       1     1 /od04_01/oradata/tstdb/system01.dbf             1416     248     .399717514       .008064516
       2     1 /od04_01/oradata/tstdb/system01.dbf             2357     366     .391599491       .013661202
       2    25 /od04_01/oradata/tstdb/perfstat01.dbf            198       6     .328282828                0
       3     5 /od04_01/oradata/tstdb/tools01.dbf               174       8     .293103448                0
       1     6 /od04_01/oradata/tstdb/undotbs02.dbf               7       6     .285714286                0
       1    15 /od04_01/oradata/tstdb/TA1d01.dbf                  7       6     .285714286                0
       1    13 /od04_01/oradata/tstdb/tstdb_reportx01.dbf         7       6     .285714286                0
       3     4 /od04_01/oradata/tstdb/indx01.dbf                  7       4     .285714286                0
       2     5 /od04_01/oradata/tstdb/tools01.dbf                 7       6     .285714286                0
       2     3 /od04_01/oradata/tstdb/drsys01.dbf                 7       6     .285714286                0
       1    20 /od04_01/oradata/tstdb/REEx01.dbf                  7       6     .285714286                0
       3     6 /od04_01/oradata/tstdb/undotbs02.dbf               5      18             .2                0
       3     8 /od04_01/oradata/tstdb/users01.dbf              6731       4     .199227455                0
       3    14 /od04_01/oradata/tstdb/nomadd01.dbf            24614     192     .188835622       .588541667
       3    25 /od04_01/oradata/tstdb/perfstat01.dbf          56010       4     .185841814                0
       3     3 /od04_01/oradata/tstdb/drsys01.dbf             51063       4     .181422948                0
       2    18 /od04_01/oradata/tstdb/SRCx01.dbf               3562       6     .179955081                0
       3    11 /od04_01/oradata/tstdb/tstdb_globalx01.dbf    468503      81     .179932679       .074074074
       3     2 /od04_01/oradata/tstdb/undotbs01.dbf               6      10     .166666667                0
       3     9 /od04_01/oradata/tstdb/xdb01.dbf              475840       4     .147650471                0
       1     4 /od04_01/oradata/tstdb/indx01.dbf                  7       6     .142857143                0
       1     9 /od04_01/oradata/tstdb/xdb01.dbf                   7       6     .142857143                0
       1    14 /od04_01/oradata/tstdb/nomadd01.dbf                7       6     .142857143                0
       1    12 /od04_01/oradata/tstdb/tstdb_reportd01.dbf         7       6     .142857143                0
       1    21 /od04_01/oradata/tstdb/CRWd01.dbf                  7       6     .142857143                0
       2     4 /od04_01/oradata/tstdb/indx01.dbf                  7       6     .142857143                0
       3    22 /od04_01/oradata/tstdb/CRWx01.dbf                  7       4     .142857143                0
       2    21 /od04_01/oradata/tstdb/CRWd01.dbf                  7       6     .142857143                0
       2    20 /od04_01/oradata/tstdb/REEx01.dbf                  7       6     .142857143                0
       2    15 /od04_01/oradata/tstdb/TA1d01.dbf                  7       6     .142857143                0
       2    12 /od04_01/oradata/tstdb/tstdb_reportd01.dbf         7       6     .142857143                0
       2     9 /od04_01/oradata/tstdb/xdb01.dbf                   7       6     .142857143                0
       2     8 /od04_01/oradata/tstdb/users01.dbf                 7       6     .142857143                0
       2     2 /od04_01/oradata/tstdb/undotbs01.dbf               7       6     .142857143                0
       1    25 /od04_01/oradata/tstdb/perfstat01.dbf              7       6     .142857143                0
       3    19 /od04_01/oradata/tstdb/REEd01.dbf             109796       4     .133611425                0
       3    15 /od04_01/oradata/tstdb/TA1d01.dbf              40327       4     .132839041                0
       3    20 /od04_01/oradata/tstdb/REEx01.dbf             333992       4     .121095715                0
       3    16 /od04_01/oradata/tstdb/TA1x01.dbf             103495       4     .120218368                0
Here again, it is important to look for unbalanced RAC I/O timings between the instances, as the sketch below illustrates.
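
A quick way to check that balance, again a sketch rather than one of the book's scripts, is to average read latency per instance instead of per file (READTIM is reported in hundredths of a second when timed_statistics is enabled):

rem Hypothetical rollup: average read time (centiseconds per read) by instance
select
   inst_id,
   sum(readtim)/greatest(sum(phyrds),1) avg_read_time
from gv$filestat
group by inst_id
order by avg_read_time desc;

A node whose average read time stands well above its peers points to a storage path or interconnect problem on that node.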
The final example looks at RAC system events. The following code shows the RAC system events report script.
col event format a30 heading 'Event Name'
col waits format 999,999,999 heading 'Total|Waits'
col average_wait format 999,999,999 heading 'Average|Waits'
col time_waited format 999,999,999 heading 'Time Waited'
col total_time new_value divide_by noprint
col value new_value val noprint
col percent format 999.990 heading 'Percent|Of|Non-Idle Waits'
col duration new_value millisec noprint
col p_of_total heading 'Percent|of Total|Uptime' format 999.9999
set lines 132 feedback off verify off pages 50
select to_number(sysdate-startup_time)*86400*1000 duration from v$instance;
select sum(time_waited) total_time
from gv$system_event
where total_waits-total_timeouts>0
and event not like 'SQL*Net%'
and event not like 'smon%'
and event not like 'pmon%'
and event not like 'rdbms%'
and event not like 'PX%'
and event not like 'sbt%'
and event not in ('gcs remote message','ges remote message',
'virtual circuit status','dispatcher timer');
select max(value) value from gv$sysstat where name ='CPU used when call started';
@title132 'RAC System Events Percent'
break on report
compute sum of time_waited on report
spool rep_out/&db/rac_sys_events
select inst_id,
   name event,
   0 waits,
   0 average_wait,
   value time_waited,
   value/(&&divide_by+&&val)*100 Percent,
   value/&&millisec*100 p_of_total
from gv$sysstat
where name ='CPU used when call started'
union
select inst_id,
   event,
   total_waits-total_timeouts waits,
   time_waited/(total_waits-total_timeouts) average_wait,
   time_waited,
   time_waited/(&&divide_by+&&val)*100 Percent,
   time_waited/&&millisec*100 P_of_total
from gv$system_event
where total_waits-total_timeouts>0
and event not like 'SQL*Net%'
and event not like 'smon%'
and event not like 'pmon%'
and event not like 'rdbms%'
and event not like 'PX%'
and event not like 'sbt%'
and event not in ('gcs remote message','ges remote message',
'virtual circuit status','dispatcher timer')
and time_waited>0
order by inst_id,percent desc
/
spool off
clear columns
ttitle off
clear computes
clear breaks
Example results from the system events script above are shown below.
Date: 02/02/04                                                                               Page:   1
Time: 01:51 PM                           RAC System Events Percent                           TSTDBMRA
                                               tstdb database

                                                                                    Percent   Percent
                                             Total      Average                          Of  of Total
   INST_ID Event Name                        Waits        Waits  Time Waited Non-Idle Waits    Uptime
---------- ------------------------------ ------------ ------------ ------------ -------------- ---------
         1 io done                             222,168            2      532,399          7.930     .1285
         1 CPU used when call started                0            0      360,648          5.372     .0870
         1 imm op                                  168        1,812      304,377          4.533     .0735
         1 control file parallel write         134,810            1      160,829          2.395     .0388
         1 control file sequential read        748,737            0      106,655          1.589     .0257
         1 i/o slave wait                      377,955            0       99,104          1.476     .0239
         1 enqueue                             574,470            0       56,854           .847     .0137
         1 IPC send completion sync              6,328            7       44,580           .664     .0108
         1 wait for master scn                 272,879            0       25,184           .375     .0061
         1 DFS lock handle                      65,619            0       18,470           .275     .0045
         1 library cache pin                     2,027            8       16,750           .249     .0040
         1 db file sequential read              56,356            0       10,377           .155     .0025
         1 name-service call wait                  190           49        9,280           .138     .0022
         1 direct path read                    119,524            0        9,210           .137     .0022
         1 log file parallel write              68,692            0        7,989           .119     .0019
         1 global cache cr request              71,664            0        7,130           .106     .0017
         1 process startup                         145           35        5,112           .076     .0012
         1 async disk IO                       497,496            0        3,636           .054     .0009
         1 db file scattered read                3,749            0        1,738           .026     .0004
         1 switch logfile command                   17           82        1,399           .021     .0003
         2 CPU used when call started                0            0      625,945          9.323     .1511
         2 control file parallel write         134,052            1      155,664          2.318     .0376
         2 enqueue                           1,146,971            0      149,334          2.224     .0360
         2 control file sequential read        736,589            0       89,883          1.339     .0217
         2 wait for master scn                 274,211            0       24,081           .359     .0058
         2 global cache cr request             308,585            0       21,361           .318     .0052
         2 DFS lock handle                      70,138            0       16,284           .243     .0039
         2 db file sequential read              78,344            0       16,000           .238     .0039
         2 log file parallel write              70,637            0        9,560           .142     .0023
         2 db file scattered read               50,454            0        8,247           .123     .0020
         2 IPC send completion sync             59,587            0        5,567           .083     .0013
         2 name-service call wait                   97           53        5,116           .076     .0012
         2 direct path read                     67,032            0        4,462           .066     .0011
         2 process startup                          68           43        2,904           .043     .0007
         2 CGS wait for IPC msg                  4,344            0        1,632           .024     .0004
         2 library cache pin                     3,939            0        1,384           .021     .0003
         2 db file parallel read                 3,664            0          789           .012     .0002
         2 log file sequential read                 71           11          757           .011     .0002
         2 row cache lock                        8,193            0          649           .010     .0002
         3 CPU used when call started                0            0    3,171,613         47.238     .7655
         3 db file sequential read           3,838,010            0      571,051          8.505     .1378
         3 global cache cr request           2,670,668            0      388,165          5.781     .0937
         3 control file parallel write         134,107            1      177,376          2.642     .0428
         3 library cache pin                    11,677           12      142,391          2.121     .0344
         3 control file sequential read        979,741            0      122,439          1.824     .0296
         3 IPC send completion sync              2,378           20       47,029           .700     .0114
         3 db file scattered read              123,285            0       27,301           .407     .0066
         3 global cache busy                       257          105       27,044           .403     .0065
         3 direct path read                    135,560            0       23,154           .345     .0056
         3 DFS lock handle                      75,839            0       18,137           .270     .0044
         3 name-service call wait                  197           49        9,683           .144     .0023
         3 log file parallel write              84,689            0        9,356           .139     .0023
         3 latch free                            1,983            4        7,881           .117     .0019
         3 process startup                         127           48        6,037           .090     .0015
         3 global cache s to x                  26,158            0        3,521           .052     .0008
         3 global cache open x                  20,776            0        3,452           .051     .0008
         3 row cache lock                       28,131            0        2,916           .043     .0007
         3 log file sequential read                654            4        2,541           .038     .0006
         3 pipe get                                125           19        2,420           .036     .0006
                                                                     ------------
sum                                                                     7,700,701
The above example has been reduced to 20 events per node to make the report easier to display. Node three is using more CPU cycles than the other nodes and is also showing more db file read waits and more global cache cr request waits. At this point, node three should be reviewed for bad SQL.
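
A hedged starting point for that review is to pull the top statements by logical I/O for the suspect instance from gv$sql; the inst_id value and the top-10 cutoff below are illustrative, not from the original text:

rem Hypothetical follow-up: top 10 SQL by buffer gets on the suspect node
select *
from (
   select inst_id, sql_text, buffer_gets, disk_reads, executions
   from gv$sql
   where inst_id = 3
   order by buffer_gets desc
)
where rownum <= 10;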