The Best Oracle Resource on the Web |
Measuring Data Segment Statistics
by Arup Nanda
|
Oracle 9i Release 2 provides a very useful way to find out performance
metrics at the segment level, hitherto impossible, enabling DBAs to
pin point problems to a specific segment.
Toolbox:
Oracle 9i Release 2 RDBMS. No special tools needed.
User Expertise Required:
Advanced DBA Skills.
The biggest problem faced by any Database Administrator (DBA) trying to
diagnose a problem is the lack of system statistics at a very granular
level. STATSPACK report gives a very detailed performance metrics profile
but that is at the system level only. Although that provides enough
information on the overall health of the system, it does not provide the
DBA with the information on specific objects that experienced or
contributed to the problem, especially in areas where the problems are
storage- and data access-related. For example, a typical performance
problem is caused by heavy buffer access activity that might be the result
of a lopsided index or just plain data distribution in a table, producing
a wait event called "buffer busy waits." The STATSPACK report or a peek
into the v$sysstat view alerts
the DBA that such an event occurred, but it does not indicate the specific
object on which that event occurred, leaving the analysis in limbo. If
that event occurs when the analysis is going on, then the exact segment
can be ascertained from the P1 and P2 parameters of
v$session_wait view. However,
as it usually happens, the suffering sessions are either completed or
blown out to conserve resources, and, thus, the evidence disappears along
with them.
Oracle 9i Release 2 provides a set of very useful performance views that
allow drilling down to the segment level, not just system level, even
after the event has occurred. For instance when you see a lot of buffer
busy wait events in a STATSPACK report, you can then drill down further to
find out which segments contributed to this wait event. This enhances the
problem solving process immensely as the segments can be identified
without real-time monitoring and can then be tuned further.
This article will explore such instrumentation and will present to the
reader — specifically the DBA troubleshooting the performance problem —
the means necessary to find out the wait events on the specific segments.
Background / Overview
First, I will illustrate the methodology with a case study in tuning. I
will start with basic tuning methodology in practice today using STATSPACK
report and then accentuate the findings from the data collected from the
new views. A typical STATSPACK report has the following lines:
Avg
Total Wait wait
Waits
Event
Waits Timeouts
Time (s) (ms)
/txn
. . . .
buffer busy
waits 3400
0
30 8.8
11.2
. . . .
It shows that the buffer busy waits event occurred 3400 times. In order to
tune the buffer busy waits, we could do a lot of things — we could
increase the freelist groups and freelists of the segment, or we could
rearrange the distribution of the rows in the table in such a way that the
blocks are not repeatedly picked up at the same time from two different
session. However, to do either of these, it's necessary to know the exact
segment that to be tuned. The STATSPACK report does not tell us which
objects contributed to the buffer busy waits event, and without the
segment level information, the tuning cannot possibly continue.
Traditionally, we would place event 10046 for each of the sessions and see
all the wait events in the generated trace files, which tend to be
extremely large. In a typical system, which may contain several hundred
applications, this approach may not be feasible. Additionally if the
applications connect through Multi Threaded Server, it becomes difficult
to isolate single segment level problems even if trace analysis is
possible.
This information is now obtained from the new performance view
v$segstat and
v$segment_statistics.
Setting the Statistics Levels
In order for Oracle to collect those statistics, you must have proper
initialization parameters set in the instance. The parameter is
statistics_level and is set in
the init.ora. The good news is
that this is modifiable via ALTER SYSTEM command and some underlying
parameters are even modifiable via ALTER SESSION. This parameter can take
three values:
BASIC:
At this setting Oracle des not collect any stats. Although this is not
recommended, you may decide to set this in a fine-tuned production system
to save some overhead.
TYPICAL:
This is the default value. In this setting, Oracle collects the following
statistics.
§
Buffer Cache - These statistics advise the DBA how to tune the multiple
buffer pools. The statistics can also be collected by setting another
parameter db_cache_advice
independently using initialization file, stored parameter file, ALTER
SYSTEM or ALTER SESSION. If it's independently set, that setting takes
preference over the statistics level setting.
§
Mean Time to Recover - These statistics help the DBA set an acceptable
Mean Time to Recover (MTTR) setting, sometimes due to the requirements
from Service Level Agreements with the users.
§
Shared Pool Sizing - Oracle can provide valuable clues to size the shared
pool effectively based on the usage and these statistics provide
information on that.
§
Segment Level Statistics - These statistics are collected at the segment
level to help determine the wait events occurring at each segment. We are
interested in these statistics.
§
PGA Target - These statistics help tune the Program Global Area
effectively based on the usage.
§
Timed Statistics - This is an old concept. The timed statistics were
enabled in earlier versions with the initialization parameter
timed_statistics. However, the
statistic was so useful that Oracle made it default with the setting of
statistic_level. It can be set
independently, too; and if set, overrides the
statistics_level setting.
ALL:
In this setting al the above statistics are collected as well as an
additional two.
§
Row Source Execution Stats - These statistics help tune the sql statements
by storing the execution statistics with the parser. This can provide an
extremely useful tool in the development stages.
§
Timed OS Statistics - Along with the timed statistics, if the operating
system permits it, Oracle can also collect timed stats from the host.
Certain operating systems like Unix allow it. It too can be set
independently; and if set, overrides the
statistics_level setting.
If you set these via any of the three methods, Initialization File, ALTER
SYSTEM or ALTER SESSION, you can find out the current setting by querying
the view v$statistics_level as
follows:
SELECT
ACTIVATION_LEVEL, STATISTICS_NAME, SYSTEM_STATUS, SESSION_STATUS FROM
V$STATISTICS_LEVEL ORDER BY ACTIVATION_LEVEL, STATISTICS_NAME;
The output is placed in Listing 1(http://www.dbazine.com/code/Listing1.txt).
So, set the statistics_level to
TYPICAL either by ALTER SYSTEM or by an initialization parameter file. Do
not forget to restart the database if you choose the latter.
Segment Level Statistics Collection
Now that we have set up the collection, let's examine what we can get from
there. The main dynamic performance view that is populated is called
v$segstat. Here is a
description of the view.
Column Explanation
TS#
|
Tablespace Number, corresponds to the TS# column in SYS.TS$
|
OBJ#
|
The Object ID, which corresponds to the OBJECT_ID in
SYS.DBA_OBJECTS
|
DATAOBJ#
|
It corresponds to the DATA_OBJECT_ID in SYS.DBA_OBJECTS
|
STATISTIC_NAME
|
The most important one, the name of the statistics we are
interested in STATISTIC# A unique number to denote each statistics
above. This is NOT the same as the V$SYSSTAT statistics number.
|
VALUE
|
The current value of that statistic. Please note the value is
cumulative, just like the statistic values in V$SYSSTAT. If you
drop the segment and recreate it, the value is reset.
|
As you can see, the columns are somewhat cryptic. Oracle provides another
view called v$segment_statistics
which is based on the above view. This view has a lot more columns and
is more descriptive with respect to the object identification. In addition
to columns like the main view, it also references the names of the
tablespace, the object, and the owner etc. so that the user can quickly
join the view with actual names.
However this view is a little slow. It's a better idea to get the
object_id from the
dba_objects and search based on
that. Here is the description of the columns of the
v$segment_statistics view that
are not present in the v$segstat
view. The other columns are the same as in
v$segstat.
OWNER
|
The owner of the segment
|
OBJECT_NAME
|
The name of the segment
|
SUBOBJECT_NAME
|
If the above is a table with partition, each partition has
separate statistics. The partition is referred to as sub-object.
|
TABLESPACE_NAME
|
Tablespace where the segment resides
|
OBJECT_TYPE
|
Type of the segment, TABLE, INDEX, MATERIALIZED VIEW, and so on.
|
To find out what all statistics are collected, you can check the view
v$segstat_name which describes
the statistic name and the number.
Examining Detailed Statistics
Now we will dive in to examine the actual statistics that we populate.
Since it lets us examine stats for a specific object, we can query like
the following:
SELECT
STATISTIC_NAME, VALUE FROM V$SEGMENT_STATISTICS WHERE OWNER =
'SCOTT' And OBJECT_NAME = 'SALES';
This provides an output similar to
Listing 2 (http://www.dbazine.com/code/Listing2.txt) Most of these
wait events are self-descriptive. Once again, these are cumulative; so,
the numbers go up as more operations continue on that segment. In
addition, like any system level statistics, these statistics are deleted
when the database is shutdown.
These segment level statistics break down the mystery surrounding the
statistics collected from v$sysstat
or from STATSPACK reports. When baffled with a number of wait events
that have already happened, the DBA can fall back on these statistics to
dig deeper and identify the exact segments that experienced these waits
which in turn contributed to the overall system wide wait for that event.
Improvements
With these basics placed in already, let's try to improve the collection
and reporting methods to further refine the performance examination. This
can be done by creating our own view in the same line as that provided by
Oracle but with a little enhancement. Examining the view definition of
v$segment_statistics, we note
that the view refers to an internal table called
x$ksolsfts. This internal table
has a very useful column - the time when the statistics were collected.
This column, FTS_STMP, can be used to our advantage to provide further
information on the wait events. A new view, called
segstat_with_time is built from
the definition of the
v$segment_statistic, identical to it except for the inclusion of a new
column called TIME_STAMP. The view creation script is provided in
Listing 3
(http://www.dbazine.com/code/Listing3.txt). The TIME_STAMP column can let
you know if the statistics are stale and help you decide whether you
should rely on them completely. The other important column this view adds
is the INSTANCE_ID, which identifies the instance in a Real Application
Cluster (RAC) environment. This view also takes away all but the most
useful columns.
Case Study
The usefulness of the segment level statistics can be best illustrated by
a case study. Here we will create a wait scenario and then diagnose that
with the segment level statistics. In the process, we will discover the
facilities brought forth by Oracle 9i Release 2 that were missing in
earlier releases. Please note that although the case study simulated the
problems as expected when tested by the author, it is not guaranteed to
produce the same behavior elsewhere. However, it should be able to help
the reader understand the methodology.
Our example system is of OLTP nature. We are seeing consistent performance
degradation and the objective of the exercise is to identify the problem
and eliminate it. We have taken STATSPACK reports and they show high waits
for "buffer busy waits" event. However, since the report does not provide
information on specific tables or indexes that experience these waits, we
can't start the process of segment tuning. Under Oracle 9iR2 this is
possible.
For the sake of demonstration, we have a table called SALES. The table is
created as per the script in
Listing 4 (http://www.dbazine.com/code/Listing4.txt). We will
initially populate the table using a script in
Listing 5
(http://www.dbazine.com/code/Listing5.txt). Examining closely the
Listing 5, you will notice that
the customer_id column values
are loaded one bunch at a time, making the records of a particular
customer_id concentrated in a
few blocks. Therefore, during an update where the records are picked up in
the customer id sequence, they will be very much likely to be picked from
the same block by two different sessions. The test case transaction is
described in Listing 6
(http://www.dbazine.com/code/Listing6.txt), named
stress.sql. This program, a
simple PL/SQL script updates records with either the odd or even numbered
sales_trans_id depending upon
the parameter passed to it, for each
customer_id from 1 to 60. This
script is run from two different sessions.
The parameter passed is 1 from one session and 2 from the other, e.g.
@stress 1. If the sessions are kicked off at the exact same time, both
sessions will operate on the same
customer_id but on different records due to the odd and even numbered
sales_trans_id values,
eliminating locking. However, both sessions will most likely try to update
the records in the same block, because the records are arranged in the
customer_id order and both the
scripts access the records for the same
customer_id. This will create a
buffer busy waits scenario that we will identify and eliminate.
Once the table is loaded, execute a STATSPACK report collection. Typically
in a production scenario, you would have enabled the jobs to run STATSPACK
regularly. To collect the statistics, you would have to login as the
STATSPACK user, usually PERFSTAT and issue a command EXECUTE
STATSPACK.SNAP. This provides your baseline collection stats.
Now run the stress script from two different sessions, with parameter 1 in
one session and 2 in other. For attaining the same time execution, kick
them from a scheduler like cron in UNIX or AT command in Windows. After
they are run, collect the STATSPACK statistics again by issuing EXECUTE
STATSPACK.SNAP. To generate the report, run the script
spreport.sql under
$ORACLE_HOME/rdbms/admin directory which will ask you the
snap_id for the collections.
Give the snap_ids just before
and after the stress script. An excerpt from the generated report has been
provided in Listing 7
(http://www.dbazine.com/code/Listing7.txt). Under the Section "Top 5 Timed
Events", we note that "buffer busy waits" is one. The system waited 3378
times for 49 seconds, about 2.83% of all the waits times.
Armed with the information we have to unearth the segment that experienced
this wait event. Before Oracle 9iR2, it was impossible. In 9iR2, if you
have setup the statistics collection by specifying
statistic_level initialization
parameter, then it is trivial. You would issue the following query:
SELECT OWNER,
OBJECT_TYPE, OBJECT_NAME, VALUE FROM V$SEGMENT_STATISTICS WHERE
STATISTIC_NAME = 'buffer busy waits'
The result is something like this. Of course, you may see a lot more in
your environment.
OWNER
OBJECT_TYPE OBJECT_NAME
VALUE
-----
----------- -----------
------
SCOTT
TABLE
SALES
3302
What we see here is the buffer busy waits were experienced by the table
SALES owner by user SCOTT. The figure 3302 also roughly corresponded to
the figure we obtained from the STATSPACK report. You immediately know
that the problem lies in the table SCOTT.SALES. In Pre-9i Release2 Oracle
databases, this information would have been impossible to get. In an
actual production system, you would probably see a lot more tables with
the buffer busy waits and the sum of all will correspond to the figure
obtained from STATSPACK report. This gives the DBA ability to pin down the
segment either that is a victim of a wait event or a creator of one and to
take corrective action.
Solution
In the above example since we identified the offending segment, we will
take corrective steps to fix the problem. If you notice the buffer busy
waits were because two sessions were trying to update the same block at
the same time. This can be easily solved by making the distribution more
even. In addition, by making sure a block is less packed, we can reduce
the likelihood that a block will become hot. As a solution, we will
recreate the table with smaller
pctused and larger initrans
and maxtrans parameters. This
will make the table less dense. The table creation script is provided in
Listing 8
(http://www.dbazine.com/code/Listing8.txt). Next, we will load the table
in a different way as listed in
Listing 9 (http://www.dbazine.com/code/Listing9.txt). Examine the
script closely. It loads the
customer_id values one after another until the maximum of 60 is
reached and the cycle is repeated. This type of loading eliminates the
likelihood that a particular block will be chosen at the same time by two
sessions if the customer_id is the same.
After this change, execute STATSPACK.SNAP again and note the value of
VALUE in v$segment_statistics
for the table SLAES. Since the value is cumulative, you will need a
reference value to compare. Now run the
stress.sql script from two
sessions the same way before, with parameter 1 and 2. Finally, take
STATSPACK reports again and see the buffer busy waits statistics. It will
be much less. Now examine the
v$segment_statistics view for the table SALES; it should be much less
too.
Conclusion
Oracle 9i Release 2 provided one of the best tools a DBA can possibly
have, to drill down to the segment level for analysis and diagnosis of
wait events, even after the fact that the wait event was experienced. This
tool goes a long way in performance related troubleshooting, which was
impossible till this time.
Some common wait events like free buffer waits, etc. are not present in
the v$segstat, Hopefully Oracle
will provide them in the future releases. This is no doubt an important
step in the direction where performance diagnosis becomes a little easier
for the DBA community.
For more information here are some links to learn more: Oracle 9i Release
2 Manuals at http://otn.oracle.com/docs/products/oracle9i/doc_library/release2/index.htm.
Search on v$segstat or
v$segment_statistics for more
information.
|