Patrol Version 3.0.15
By Don Burleson
DBMS, April 1996
- BMC Software Inc., 2101 CityWest Blvd., Houston, TX
77042; 800-841-2031, 713-918-8800, or fax 713-918-8000.
- Pricing: Begins at $6,000, depending on the number
of consoles and managed objects.
- Minimum Requirements: Consoles: 32MB of memory, X-Windows
environment (PC or workstation), and 10MB of disk space.
Agents: 5MB of disk space.
In the fiercely competitive system monitoring tool
marketplace, vendors are striving to create products that take
care of the routine and mundane administrative tasks, freeing the
DBA and system administrator to focus on higher-level work.
With the advent of open systems and geographically diverse
networks of distributed databases, a centralized tool for
monitoring system performance is indispensable. Patrol by BMC
Software Inc. is one product that claims to meet this need.
Marketed as an alert monitor, Patrol positions itself against
products such as DB-Vision by Platinum Software Corp., and a host
of other SNMP-compliant monitors. The goal of this type of tool
is to provide an intelligent "agent" that will
constantly monitor the database (and operating environment) and
detect extraordinary conditions (called "events"). Once
detected, the event can trigger a script to automatically correct
the problem while telephoning the beeper number of the on-call
DBA. While this may sound like a lofty goal, Patrol has been very
successful at creating a framework that removes much of the
tedious monitoring from the DBA's and SA's jobs. Patrol is
friendly enough that an inexperienced operator can monitor dozens
of hosts, drilling down quickly to identify the nature of
problems.
Patrol lets you manage several different types of relational
databases from a single console, and currently supports a host of
relational databases including Oracle, Sybase, Informix, DB2/2,
OpenVMS, and CA-OpenIngres. In addition, Patrol offers special
submodules for monitoring vendor application packages, such as
Oracle Financials. This ability to provide a common monitor for
both mainframe and midrange databases should appeal to large
multivendor database sites.
For such a sophisticated product, installation is relatively
straightforward and consists of two steps: the installation of
the "console" (the host that monitors the databases),
and the installation of "agent" software on each
database server. The installation guide is compact and
well-written, and describes the procedures for loading the Patrol
installer. The Patrol installer, while serving the base purpose
of creating a Patrol environment, requires that the person
running the installer have an intimate knowledge of the pieces
of Patrol that need to be installed on the console and on each
agent.
The front end for the console is Motif-based and offers an
excellent GUI environment with a drill-down capability. In fact,
Patrol is intuitive enough that an experienced DBA can instantly
use the console to view a problem without prior training. The
main screen consists of one icon for each host, and the host will
turn red to indicate that an event has been triggered within the
knowledge module. Clicking on the host icon brings up a set of
database icons, one for each database on the host, as well as
icons for other components on the host, including file systems and
disk devices. Figure 1 shows a drill-down
view of the systems and components running on the host
"tp01." Double-clicking on the database icon shows
numerous database statistics. Despite these superior display
capabilities, Patrol's real value comes from its management of
the user-defined "rules" that tell Patrol when
something is amiss.
While the GUI is powerful, not all of it is intuitive. Because I
was unaccustomed to Motif, one of my most confounding problems with
the Patrol GUI was getting used to using three mouse buttons:
the right, middle, and left buttons. It took me several weeks
before I became comfortable with the interface.
The Knowledge Module
Anyone who has ever used an alert monitor in a production
environment knows that the effective use of a tool depends upon
the tool's ability to identify serious problems. There are two
dimensions upon which to measure an alert monitor: the number of
false alerts and the number of missed alerts. This balancing act
between precision and recall requires a flexible set of
customizable rule bases. Patrol is very effective in this area
because the DBA has complete control over the threshold and alert
mechanisms.
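To make the two dimensions concrete, the following Python sketch scores
a monitoring period by precision and recall. The function name and the
counts are hypothetical and purely illustrative; nothing here comes
from Patrol itself.

```python
def alert_quality(true_alerts, false_alerts, missed_alerts):
    """Return (precision, recall) for one monitoring period.

    precision: share of raised alerts that pointed at a real problem
    recall:    share of real problems that raised an alert
    """
    raised = true_alerts + false_alerts
    real_problems = true_alerts + missed_alerts
    # Treat an empty denominator as a perfect score (an assumption).
    precision = true_alerts / raised if raised else 1.0
    recall = true_alerts / real_problems if real_problems else 1.0
    return precision, recall


# Hypothetical month: 18 useful alerts, 6 false alerts, 2 missed problems.
precision, recall = alert_quality(18, 6, 2)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A more sensitive threshold usually reduces missed alerts at the cost of
more false ones, which is why adjustable rules matter.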
Patrol gives the DBA complete freedom to customize the
environment. For example, a DBA might be interested in knowing
when a file system is nearly full. However, file systems that are
dedicated to database files will always be nearly full, and a
blanket rule would falsely alert the DBA to a condition that does
not require any intervention. Patrol collects these decision rules
and calls them "knowledge modules," or KMs. You can
version and customize a KM and store it on the main console (the
global KM) or on each remote agent (the local KM).
The Patrol architecture provides a mechanism whereby the
global KM is referenced first, followed by any additional rules
that are specific to an agent. This feature is especially useful
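for databases with unique characteristics. For example, in
general, full-table scans are resource-intensive and should be
avoided for online transaction systems, while full-table scans
are acceptable for batch-oriented tasks that read entire tables.
A local KM can therefore carry a different full-table-scan rule
for a batch-oriented database than the global KM applies elsewhere.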
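The global-then-local lookup can be pictured with a short Python
sketch. This is only an illustration of the layering idea, not how
Patrol stores its KMs; the FullTableScans parameter, the batch01 host,
and all threshold values are assumptions made up for the example.

```python
# Rules that apply to every agent (hypothetical values).
GLOBAL_KM = {
    "FullTableScans": {"warning": 10, "alarm": 50},
    "CacheHitRatio": {"warning": 80, "alarm": 70},
}

# Agent-specific rules, keyed by host name (hypothetical).
LOCAL_KM = {
    "batch01": {"FullTableScans": {"warning": 1000, "alarm": 5000}},
}


def effective_rules(host):
    """Start from the global rules, then apply any rules local to the host."""
    rules = dict(GLOBAL_KM)
    rules.update(LOCAL_KM.get(host, {}))
    return rules


print(effective_rules("tp01")["FullTableScans"])     # global rule applies
print(effective_rules("batch01")["FullTableScans"])  # local rule wins
```

Only the batch host sees the relaxed full-table-scan thresholds; every
other rule still comes from the global set.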
To Patrol, a KM consists of a set of "parameters"
with meaningful names such as BufferBusyRate and CacheHitRatio.
These data names describe individual measurements, and all of
these measurements can be adjusted according to the following:
- Poll Time. A parameter may be scheduled to
"fire" on a preset periodic schedule.
- Automated Recovery. Specific "actions"
can be programmed to notify the DBA and trigger actions
to automatically correct the problem.
- Output Range. These are the values that trigger an
alert condition. For example, by default, the value of
MaximumExtents will trigger a warning when a table
reaches 90 percent of available extents, and it will
trigger an alarm when it reaches 95 percent of available
extents.
- Manual Recovery. This is a knowledge base that
advises DBAs about an appropriate corrective action. For
example, the value of LibraryCacheHitRatio for Oracle
correctly advises DBAs that increasing the value of their
shared_pool_size may relieve the problem. This feature
is especially nice for the newbie DBA who is not
intimately familiar with corrective actions for database
problems.
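The output-range idea reduces to a pair of thresholds. The Python
sketch below uses the 90 and 95 percent defaults quoted above for
MaximumExtents; the function name and the state labels are
hypothetical, and real KMs are not written in Python.

```python
def classify(percent_used, warning=90.0, alarm=95.0):
    """Map a measurement onto an alert state using two thresholds."""
    if percent_used >= alarm:
        return "ALARM"
    if percent_used >= warning:
        return "WARNING"
    return "OK"


# A table at 72, 91.5, and 97 percent of its available extents.
for pct in (72.0, 91.5, 97.0):
    print(pct, classify(pct))
```

Customizing a parameter for a site then amounts to passing different
warning and alarm values.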
You can manually adjust each of these parameters to reflect
specific conditions that exist at specific sites. This
customization is achieved by changing the output range values and
the polling times. For example, you could customize a
transaction-oriented system to stop polling in the evenings when
batch reporting occurs.
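A polling blackout of this kind might look like the following Python
sketch. The 6:00 p.m. to 6:00 a.m. batch window is an assumption
chosen for illustration, as are the names.

```python
from datetime import time

# Hypothetical batch-reporting window that spans midnight.
BATCH_START = time(18, 0)
BATCH_END = time(6, 0)


def should_poll(now):
    """Return False while the batch window is open, True otherwise."""
    in_batch_window = now >= BATCH_START or now < BATCH_END
    return not in_batch_window


print(should_poll(time(10, 30)))  # daytime: poll
print(should_poll(time(22, 0)))   # evening batch run: skip
```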
Note that Patrol measures system-wide statistics, and not just
the behaviors of each database on the host. Patrol currently
supports Unix-level measurement on Bull, DC-OSX, DG, HP, SCO,
Sequent, SGI, Solaris, Sun4, and SVR4. With the Unix KM, a system
administrator can use Patrol to measure "swap" memory
usage, paging within the Unix buffer cache, and just about every
possible kernel component. Unfortunately, Patrol does not offer a
"Recovery" section for the Unix component. BMC may
assume that system administrators would be offended by a tool
that suggests remedies for the problems.
Customized events can also be incorporated into Patrol's KMs.
A customized event might be a backup process that is run every
Tuesday morning at 1:00 a.m. You can customize Patrol to check
for the successful completion of the backup, and trigger an alert
if a problem is encountered.
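As a rough illustration of such a check, the Python sketch below
decides whether the Tuesday 1:00 a.m. backup needs an alert. It assumes
the time of the last successful backup is already known (for instance,
read from a log), allows a four-hour grace period, and uses
hypothetical function names throughout.

```python
from datetime import datetime, timedelta


def last_scheduled_run(now):
    """Return the most recent Tuesday 1:00 a.m. at or before 'now'."""
    days_back = (now.weekday() - 1) % 7  # Monday is 0, so Tuesday is 1
    candidate = (now - timedelta(days=days_back)).replace(
        hour=1, minute=0, second=0, microsecond=0)
    if candidate > now:  # it is Tuesday, but before 1:00 a.m.
        candidate -= timedelta(days=7)
    return candidate


def backup_needs_alert(now, last_success, grace=timedelta(hours=4)):
    """Alert when no success has been recorded since the last scheduled run."""
    due = last_scheduled_run(now)
    if now < due + grace:
        return False  # the backup may still be running
    return last_success is None or last_success < due


now = datetime(1996, 4, 3, 9, 0)  # a Wednesday morning
print(backup_needs_alert(now, datetime(1996, 4, 2, 2, 30)))   # ran on time
print(backup_needs_alert(now, datetime(1996, 3, 26, 2, 30)))  # last week's run only
```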
Reports
Patrol comes with a set of menu options that enable the
manager to view salient information about the system. Two
interesting menu options are Patrol's "CPU Hog
Percentage" and "All Problem Users," which you can
program to detect and alert for runaway processes on the host.
For example, a process that consumes more than 30 percent of the
overall CPU may be a runaway process, and you can program Patrol
to detect and kill runaway tasks. One nice side benefit of Patrol
is that the DBA no longer needs dozens of SQL report scripts on
each server to find out basic information such as the percentage
of use within a tablespace. Overall, Patrol's report facility is
robust and comprehensive, and replaces the need to have
additional database reports for all but the most specific
queries.
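The runaway-process rule mentioned above comes down to a simple filter.
In this Python sketch the process list is made-up sample data and the
function name is hypothetical; the sketch only reports candidates and
leaves the decision to kill a task to the operator.

```python
CPU_HOG_THRESHOLD = 30.0  # percent of overall CPU, from the example above


def find_cpu_hogs(processes, threshold=CPU_HOG_THRESHOLD):
    """Return the processes consuming more than 'threshold' percent CPU."""
    return [p for p in processes if p["cpu_percent"] > threshold]


# Hypothetical snapshot of processes on one host.
sample = [
    {"pid": 101, "user": "oracle", "cpu_percent": 4.2},
    {"pid": 202, "user": "report", "cpu_percent": 61.0},
    {"pid": 303, "user": "batch", "cpu_percent": 30.0},
]

for proc in find_cpu_hogs(sample):
    print(f"possible runaway: pid {proc['pid']} ({proc['user']})")
```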
Exception Handling
Patrol also provides preset automated recovery actions. For
example, the Oracle component allows Patrol to automatically
resize a table's NEXT extent size, thereby averting possible
downtime. To illustrate this feature, imagine a customer table
that has been defined to grow in chunks of five megabytes
(NEXT=5m). If the tablespace that contained the table had only
three megabytes available, Patrol would automatically change the
value of the next extent size to NEXT=3m to allow the table to
extend one more time, buying time for the DBA to be notified so
he or she can add a datafile to the tablespace without disrupting
the system.
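The arithmetic behind this example can be sketched in a few lines of
Python. The function names are hypothetical, and the sketch simplifies
by treating free space as a single number, whereas a real check would
need to consider the largest contiguous free chunk. It only builds the
ALTER TABLE statement as text and does not connect to a database.

```python
def adjusted_next_extent(next_mb, free_mb):
    """Return a NEXT size (in MB) that still fits, or None if nothing fits."""
    if free_mb <= 0:
        return None
    return min(next_mb, free_mb)


def alter_statement(table, next_mb):
    """Build the Oracle statement that would apply the new NEXT size."""
    return f"ALTER TABLE {table} STORAGE (NEXT {next_mb}M)"


# The customer table from the example: NEXT=5m, 3 MB free in the tablespace.
new_next = adjusted_next_extent(next_mb=5, free_mb=3)
if new_next is not None:
    print(alter_statement("customer", new_next))
```

When the tablespace has room for the full extent, the function returns
the original NEXT value and no change is needed.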
Because Patrol is an expert system, DBAs can program it to
replicate their decision processes, thereby capturing their
knowledge into the Patrol KMs. However, doing even some simple
tasks requires the use of the special Patrol Scripting Language
(PSL). PSL is used to extend the base functionality of the
software and automatically handle sophisticated recovery where
numerous conditions must be checked. Internally, PSL is a large
library of more than 100 prewritten functions, which are, in
turn, called from within a small framework of 10 commands. The
functions also include calls to SNMP modules for interfacing with
other SNMP-compliant tools.
Unfortunately, PSL is not a simple 4GL. BMC
recommends that you have "a working knowledge and some
programming experience" with C, C++, Perl, Awk, or a related
language before enrolling in the BMC PSL programming course. As a
C++ and Awk neophyte, I found PSL to be cryptic and obtuse, and
it was initially very challenging to automate sophisticated
database recovery procedures using PSL.
Overall Value
In short, Patrol offers a robust and comprehensive system and
database monitoring tool. As with any powerful monitor, Patrol is
not plug-and-play and requires a significant investment in
up-front customization. However, once the framework is in place,
Patrol delivers on its promise to continually monitor, alert,
and correct problems, removing the tedious and time-consuming
chore from the systems and DBA staffs. Many large IS shops have
justified Patrol's cost by measuring it against the time DBAs
will save by no longer having to baby-sit each of their
databases.