Oracle has become the world's most sophisticated and
flexible database, and the sheer complexity of Oracle makes it very
difficult to troubleshoot.
So, how can you learn Oracle troubleshooting skills?
First, I recommend that students have a strong background in the internal
machinations of Oracle, and understand the underlying mechanisms of the
Oracle engine. It's also important to recognize how Oracle alerts us
to problems, and how to interpret error messages.
When an end-user is hung, the problem could be
system-wide or unique to their session, and it's up to the Oracle DBA to
understand the right questions to ask and know the right places to look for
further information.
The goals of Oracle troubleshooting
The goals of Oracle troubleshooting include:
* Learn a troubleshooting methodology for use in
analyzing any Oracle database
* Understand ratio-based and
bottleneck troubleshooting analysis
* Learn troubleshooting techniques for
monitoring and optimizing memory usage
* Know how to quickly
pinpoint and resolve I/O hotspots at the database, storage and object level
* Using scripts to uncover session-related bottlenecks
*
Understand techniques for locating and fixing problem SQL
Here are some specific troubleshooting techniques for diagnosing acute
Oracle performance issues. There are also specific tip for Oracle
troubleshooting that can guide a neophyte in the right direction:
* Always check the first error message -
Oracle will frequently throw multiple errors, but it's always the first
error that will lead you to the root cause of the problem.
* Root cause analysis is not always practical
- You may need to forget what you learned in Oracle University Performance
Tuning Training. In a real emergency you may not have the luxury of
addressing the root cause of a performance problem. One hallmark of a
successful Oracle troubleshooter is the prudent application of
unconventional techniques.
* Know your OFA structure - You need to
be intimate with your OFA directories on the server so that you can quickly
locate trace files and dumps from Oracle problems when troubleshooting.
* Don't be afraid to bounce - If
you have a system-wide outage and you cannot connect to the database, you
may want to quickly attempt to collect trace dumps and bounce the instance
immediately. In the vast majority of system-wide hanging that is
related to memory issues, a bounce will quickly restore database access.
This can be an invaluable troubleshooting tool.
Oracle troubleshooting tips
Most effective DBA's have a strong ?meld? of technical and managerial
skills and this helps them become effective detectives.
This
dual strength of technical and business acumen is indispensible for
troubleshooting an Oracle database because you must understand every layer
of an application, from the mapping to the business processes to screens,
all the way down to the internals of the row placement on disk!
In
additional to spending years in college getting advanced degrees (often in
the areas of computer science or information systems) Oracle
troubleshooting experts are now expected to meld-together this huge body of
learning into a cohesive method for diagnosing and troubleshooting a wide
variety of data related problems!
One of the most stressful aspects
of the Oracle DBA job role is troubleshooting and acute performance
problem. In a serious emergency, there could be thousands of
end-users waiting for the DBA to give them access to their database and you
are right at the center of the problem! During a production
troubleshooting session you will meet all sorts of high-level executives,
all impatient and looking at their watches, continue the downtime costs and
encouraging you to work faster!
While the proactive techniques
described in this book serve to prevent future performance problems, there
is always the possibility of a rare exception, a rogue process, or an
unplanned increase in system usage.
It is also important to
understand the scope of unplanned outages. The DBA may be
troubleshooting a locked session for a single end-user, or could be
troubleshooting a system-wide outage that affects thousands of end-users.
While it's not practical to reproduce the entire Oracle Troubleshooting
book on this page, here are some specific Oracle troubleshooting tips and
tricks: