If you've got an interest in Oracle performance
tuning, you'll probably have seen references to
something called event 10046. Event 10046 is an
undocumented, but well known feature within the
Oracle database that generates detailed information
on how a statement is parsed, what the values of
bind variables are, and what wait events occurred
during a particular session. Activating Event 10046
gives us a more detailed trace that that provided by
alter session set
sql_trace=true
and lets the DBA determine
exactly why the database is performing in a certain
way, by documenting how a statement is parsed and
noting what wait events occured during the statement
execution.
Wait events are the things that stop
your SQL statement being processed quickly, and are
typically caused by statements being hard parsed,
data having to be retrieved or written to disks
which are suffering from contention, and other such
events which in theory can be minimised in a well
designed and tuned system. You've probably come
across the top five wait events section in STATSPACK
which gives you an idea which parts of your system
are causing bottlenecks, and activating extended
trace using event 10046 allows us to analyze each
individual wait event, in the context of the set of
SQL statements executed in a particular session.
Two of the main advocates for using event 10046
are Cary Millsap and Jeff Holt, who's company,
has
kindly made available a detailed paper on analysing
Oracle performance using event 10046. Entitled
"Oracle System Performance Analysis Using Oracle
Event 10046", the paper introduces the technique
as:
"The Oracle database kernel is
instrumented with over four hundred so-called
“pseudo-error debugging events.” [1] One of the
most important of these events for the system
performance analyst is event 10046, “enable SQL
statement timing.” Activating this event for an
Oracle session instructs the Oracle kernel to
print detailed timing information for that
session to an Oracle trace file. A sample of
10046 data is provided later in this document.
Event 10046 is an attribute of an Oracle session
that you may set at different levels, depending
upon the type of diagnostic output you wish to
obtain. Many database administrators are already
familiar with Oracle's sql_trace facility, which
emits performance information about Oracle
parse, execute, fetch, commit, and rollback
database calls. Using sql_trace is actually the
equivalent of using event 10046 set at level 1.
However, there are entire categories of system
performance problems that cannot be diagnosed
with level-1 data alone. For example, you cannot
unambiguously diagnose and repair a system
plagued by contention for latches, locks,
networks, or even storage devices by looking
only at SQL trace data. Event 10046 takes Oracle
performance instrumentation a significant step
further by detailing the Oracle kernel's
executions of over two hundred internal function
calls."
Activating event 10046 is actually quite
straightforward for your current session, and takes
the form of
alter
session set events '10046 trace name context
forever, level num';
Where num is either 1,4,8 or 12, with 1 being the
same as a normal set sql_trace=true, 4 including
values of bind variables, 12 including wait events,
and 12 including both bind variables and wait
events.
The paper then goes on to explain how event 10046
is activated and deactivated, and then how to find
and interpret the trace file. Interpreting the trace
file seems to be the interesting bit, with several
approaches possible. The most basic way is to go
through the raw trace files manually, with the
article pointing to a useful MOSC note No.
39817.1 entitled "Interpreting
Raw SQL_TRACE and DBMS_SUPPORT.START_TRACE output"
which goes through the basics of interpreting the
file. Also, If you've got Tom Kyte's "Expert
One-On-One" book, turn to Chapter 10 page 464 where
there's a whole section on using and interpreting
raw trace files. Of course you can also put the
results through TKPROF, which from version 9.0.1
also processes trace results generated using the
10046 event. Interpreting the trace file is of
course where the skill lies and beyond the scope of
this article, but take a look through the article
and Tom's book if you've got it, or just do a google
search for
"10046 event interpreting" and you'll come
across plenty of useful examples.
The reason that so much attention has been placed
on the event 10046 is that it gives us an
opportunity to tune performance by examining
individual wait events, rather than relying on the
traditional tuning ratios that often hide the real
cause of poor database performance. As such, it's
quite a revolutionary approach and not one you'll
generally find in official Oracle documentation.
However, if you're interesting in reading more about
event 10046, take a look at Cary Millsap's and Jeff
Holt's book
"Optimizing Oracle Performance" which is the
reference text for wait event tuning, together with
a forthcoming book by Stephen Andert called
"Oracle Wait Event Tuning" which looks due out
in October 2004. OTN also ran an article by
Robin Schumacher
entitled
"Exploring the Oracle Database 10g Wait Interface"
on some new V$ views in Oracle Database 10g
that provide an interface into database wait events,
and
this article by Thiru Vadivelu gives a nice
introduction to SQL tracing in general.
Lastly, as an alternative to generating and
interpreting (what can be) cumbersome raw trace
files, Don Burleson recently wrote an Oramag article
entitled
"Perfect Pitch" which advocates an additiona
system wait tuning approach that carries out
real-time analysis of physical I/O wait events using
views such as
V$SYSTEM_EVENT and
V$SESSION_WAIT, capturing details about the objects
that experience physical read waits, which, when
identified, can then be used together with Statspack
to identify problem SQL for tuning. According to the
article,
"System wait tuning
has become very popular because it can show you
those wait events that are the primary
bottleneck for your system. Some experts like
the 10046 wait event (level 8 and higher)
analysis technique and Oracle MOSC now has
an analysis tool called trcanlzr.sql (see Next
Steps) to interpret bottlenecks via 10046 trace
dumps. However, some Oracle professionals find
dumps cumbersome and prefer to sample real-time
wait events.
When doing wait analysis, it is critical to
remember that all Oracle databases experience
wait events, and that the presence of waits does
not always indicate a problem. In fact, all
well-tuned databases have some bottleneck. (For
example, a computationally intensive database
may be CPU-bound and a data warehouse may be
bound by disk-read waits.) In theory, any Oracle
database will run faster if access to hardware
resources associated with waits is increased.
This article explores a small subset of wait
analysis, but it illustrates a critical concept
of Oracle tuning: the fact that all Oracle
databases wait on some kind of system resource,
and it's the Oracle professional's job to
determine whether the database is I/O bound, CPU
bound, memory bound, or bound waiting on latches
or locks. When the source of the bottleneck has
been identified, the savvy Oracle professional
must then determine the causes of these events
and attempt to remove them."
All in all, it's an
interesting area and it certainly seems that system
wait tuning, using the event 10046 and the real-time
v$ wait event views, is the preferred 'scientific
approach' for diagnosing and improving system
performance. If you're interested in tuning, take a
look at the articles when you get a chance, and see
whether the event 10046 could help you in diagnosing
system bottlenecks.
UPDATE: I've just
remembered that
Nicholas
Goodman dropped me a line the other day to let
me know that Veritas are making several chapters
from "Optimizing Oracle Performance"
available as a free download. It's a marketing
exercise so you'll probably need to register, but
it's a good chance to take a look at part of the
book for free.