With a flood of
advertising from leading database platform providers
like Microsoft, Oracle, Sybase, and Informix,
development managers may find themselves fishing for
reliable benchmarking figures for relational database
management systems (RDBMS).
I highly recommend the book "Oracle Benchmarking"
for more details on conducting and interpreting
Oracle benchmarks. Written by numerous Oracle tuning
experts, it presents a complete method for Oracle
and SQL Server benchmarking.
Every database vendor cites benchmark data that
"proves" its database is the fastest, and each vendor
chooses the benchmark test that presents its product
in the most favorable light. So the
Transaction Processing Performance Council (TPC)
was created in an effort to provide uniform benchmark
tests.
The TPC also recognized that different types of
applications have different processing signatures.
It developed the TPC-C benchmark for online
transaction processing (OLTP), the TPC-R and
TPC-H (formerly
TPC-D) benchmarks for data warehouses and decision
support systems, and the
TPC-W benchmark
for Web-based systems. The TPC-C benchmark test is
considered the de facto standard for OLTP database
benchmarks,
primarily because TPC-C attempts to simulate
real-world OLTP database transactions.
Despite its best efforts, however,
the TPC was unable to prevent database
vendors from fudging benchmark results. The biggest
issue is the different hardware platforms databases
run on. For example, IBM's DB2 has traditionally run
on IBM mainframes, while the
Informix database runs only on UNIX. So it's difficult
to compare benchmarks for these products because of
the vastly different machine architectures.
How database vendors fudge benchmark results
In their efforts to "prove" that their database
product is superior to the competition, database
vendors employ numerous tricks to improve the
processing speed of their benchmarks. Virtually all of
the tricks employed by database vendors involve
caching data and SQL in RAM.
Remember, database performance is all about disk I/O,
and vendors use large RAM memory regions to preload
critical benchmark components to avoid any disk
access. Some of their tricks:
- Buffering up data rows—By
preloading the data into the RAM buffers, the
database can access the information thousands of
times faster than with a disk I/O access.
- Storing SQL execution plans in
RAM—By preparsing and precomputing the
execution plans for the SQL, the database vendors
bypass the overhead of parsing and invoking the SQL
optimizer to generate the execution plan.
- Prejoining tables—Some
database products have special preaggregation
mechanisms to prejoin tables. For example, Oracle
has Materialized Views that can store the results of
an n-way table join, allowing super-fast data
access.
- Using high-speed CPUs and
clusters—Database vendors can
dramatically improve benchmark speeds by using
special high-speed machines and cluster
architectures.
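The prejoining trick is easy to sketch with a toy example. The snippet below is a minimal illustration, not any vendor's actual mechanism: it uses Python's sqlite3 module with an invented two-table schema, and emulates a materialized view with an ordinary summary table built once from an n-way join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO orders VALUES (101, 1, 50.0), (102, 1, 25.0), (103, 2, 10.0);

-- Prejoin: persist the join result once, so benchmark queries read one
-- flat table instead of joining at run time (a plain-table stand-in for
-- a materialized view).
CREATE TABLE cust_order_summary AS
  SELECT c.name AS name, SUM(o.amount) AS total
  FROM customers c JOIN orders o ON o.cust_id = c.id
  GROUP BY c.name;
""")

# The "benchmark" query touches only the prejoined table...
fast = cur.execute(
    "SELECT name, total FROM cust_order_summary ORDER BY name").fetchall()

# ...but returns the same answer as the live join it replaces.
slow = cur.execute("""
    SELECT c.name, SUM(o.amount) FROM customers c
    JOIN orders o ON o.cust_id = c.id
    GROUP BY c.name ORDER BY c.name""").fetchall()

print(fast)  # [('Acme', 75.0), ('Globex', 10.0)]
assert fast == slow
```

On a benchmark-sized schema, the summary-table query avoids the join cost entirely; the trade-off, as with real materialized views, is that the precomputed copy goes stale when the base tables change.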
Using tricks like these, all database vendors find
ways to make their products appear superior to the
competition. Therefore, most database benchmark data
should be viewed with healthy skepticism.
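The plan-caching trick can also be approximated at the driver level. This sketch again uses an invented schema with Python's sqlite3 module, whose per-connection statement cache means that re-executing one parameterized SQL string reuses the compiled statement instead of re-parsing it on every call—the same idea, in miniature, as preloading execution plans into RAM:

```python
import sqlite3

# cached_statements sizes the driver's internal statement cache.
conn = sqlite3.connect(":memory:", cached_statements=256)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i * 1.5) for i in range(1000)])

# One SQL string, a thousand executions: after the first call, the
# compiled plan is fetched from the cache rather than re-parsed.
query = "SELECT amount FROM orders WHERE id = ?"
total = sum(conn.execute(query, (i,)).fetchone()[0] for i in range(1000))
print(total)  # 749250.0
```

Had each lookup used a freshly built SQL string (say, with the id spliced into the text), every call would miss the cache and pay the parse cost again—which is exactly the overhead the benchmark trick is designed to hide.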
So how do we compare database speed?
For the manager charged with comparing
database performance, benchmark studies
present a huge challenge. While it's not
fair to say that database benchmarks are
meaningless, it's very difficult to use
benchmarks to "prove" that any single
database product is faster than another.
To see how difficult it is to compare
database performance, just look at the
TPC publication on a variety of TPC-C
benchmarks for various hardware and
database platforms.
Choosing your database
Given this wealth of confusing and
conflicting information, how do you choose
the proper database product? Of course,
performance is only one of many factors in
evaluating a database. You must also
consider the availability of trained DBAs,
the vendor's technical support, and total
cost of ownership, among other factors.
Most databases can be configured to
process hundreds of transactions per
second. Based on that, this is what we can
say about database performance:
- Hardware matters—Database
performance is largely a function of
hardware. A poorly performing database
can be made to appear fast with cached
disk arrays and super-fast processors.
Even "independent" benchmarks can be
misleading because of the varied
hardware, disk, and network
configurations.
- Knowledge is speed—Delivery
of high-speed transactions requires
knowledge of the application. High-speed
database performance is usually achieved
through complicated caching tricks and
preaggregation schemes. This requires
detailed knowledge of the application
and its I/O signatures.
In sum, it's difficult to evaluate
database performance objectively. The
savvy manager must carefully evaluate all
of the database vendors with a jaundiced
eye to separate the hype from the reality.