Sometimes I find the easiest way to get people
to adhere more closely to recommendations or best practices is
simply to tell them what not to do. It is often easier to do
great work when you can focus on a short list of things to avoid.
So here are the top ten mistakes I see in the benchmarking world.
Avoid them!
1. I'm using a benchmarking tool like
Quest's Benchmark Factory, so that is all I need.
Wrong. I highly recommend that anyone doing
benchmarking read the specifications for whatever industry standard
tests they are going to perform. Software to automate these tests
will ask questions or present options that you cannot really answer
unless you understand their context, and that context is defined in
the specs.
For example, the highly popular OLTP test known
as the TPC-C Benchmark (http://tpc.org/tpcc/spec/tpcc_current.pdf)
defines scale factor as follows:
Section 4.2.1:
The WAREHOUSE table is used as the base unit of scaling. The
cardinality of all other tables (except for ITEM) is a function of
the number of configured warehouses (i.e., cardinality of the
WAREHOUSE table). This number, in turn, determines the load applied
to the system under test which results in a reported throughput (see
Clause 5.4).
Section 4.2.2:
For each active warehouse in the database, the SUT must accept
requests for transactions from a population of 10 terminals.
So when a tool like Benchmark Factory asks for
the scale factor, it does not mean the number of concurrent users,
but rather the number of warehouses. Hence, a scale factor of 300
means 300 warehouses and, therefore, up to 3000 concurrent users.
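The arithmetic above can be written down as a minimal sketch. The
function name max_concurrent_users is hypothetical and is not part of
Benchmark Factory or the TPC-C kit; the only figure used is the 10
terminals per warehouse quoted from Clause 4.2.2.

```python
# Terminals per active warehouse, as quoted from TPC-C Clause 4.2.2 above.
TERMINALS_PER_WAREHOUSE = 10


def max_concurrent_users(scale_factor: int) -> int:
    """Upper bound on concurrent users for a given number of warehouses.

    The scale factor is the number of warehouses, not the number of users.
    """
    if scale_factor < 1:
        raise ValueError("scale factor must be at least 1 warehouse")
    return scale_factor * TERMINALS_PER_WAREHOUSE


# A scale factor of 300 means 300 warehouses, hence up to 3000 users.
print(max_concurrent_users(300))
```

Running the same calculation before a test helps confirm that the
user load you plan to drive fits the scale factor you entered.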
This requirement to read the spec is critical as
it will be an underlying issue for every remaining misconception and
problem that I will cover in the next few pages.
2. I have an expensive SAN, NAS or iSCSI disk array, so I do not
need to worry about configuring anything special for I/O.
Incorrect. The size, type and nature of the test
may require radically different hardware settings, even all the way
down to the deepest level of your SAN. For example, a data
warehousing test like the TPC-H is best handled by a SAN whose
read-ahead and data-cache settings are tuned more for reads than
writes, while the OLTP TPC-C would benefit from just the opposite.
Relying on defaults can be a really big mistake.
Likewise, the SAN hardware settings for stripe
depth and stripe width should be set differently for these different
usages. Plus, the file system and database I/O sizes should be a
multiple of the stripe depth. In fact, a common rule of thumb is:
Stripe Depth >= db_block_size x db_file_multiblock_read_count
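As a rough illustration of that rule of thumb, the sketch below
computes the smallest stripe depth the rule allows and checks a
candidate setting against it. The helper names and the sample values
(an 8 KB block size and a multiblock read count of 16) are assumptions
chosen for illustration, not recommendations.

```python
def min_stripe_depth(db_block_size: int, multiblock_read_count: int) -> int:
    """Smallest stripe depth (bytes) that satisfies the rule of thumb."""
    return db_block_size * multiblock_read_count


def stripe_depth_ok(stripe_depth: int, db_block_size: int,
                    multiblock_read_count: int) -> bool:
    """True when stripe depth >= db_block_size x multiblock read count."""
    return stripe_depth >= min_stripe_depth(db_block_size,
                                            multiblock_read_count)


# Illustrative values only: 8 KB blocks, 16 blocks per multiblock read.
block_size = 8 * 1024
read_count = 16

print(min_stripe_depth(block_size, read_count))             # bytes
print(stripe_depth_ok(64 * 1024, block_size, read_count))   # 64 KB stripe
print(stripe_depth_ok(256 * 1024, block_size, read_count))  # 256 KB stripe
```

With these sample values the largest database read is 128 KB, so a
64 KB stripe depth fails the check and a 256 KB stripe depth passes.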
Furthermore, selecting the optimal hardware RAID
(Redundant Array of Independent Disks) level quite often should
factor in the nature of the benchmark as well. Where OLTP might
choose RAID-5, data warehousing might be better served by RAID-0+1.
Finally, the number of disks can also be
critical. For example, TPC-H tests start at around 300 GB in size.
So anything less than 100 spindles at that size is generally a waste
of time. As you scale larger, 800 or more drives becomes common as
the minimum recommended setup. The point is that no SAN cache is
ever large enough for a workload of monstrous data warehousing
queries. I've seen results differ by up to 500%
when varying SAN settings and number of disks.
3. I can use the default operating system configuration right
out of the box.
No. Most databases require some prerequisite
operating system tweaks and most benchmarks can benefit from a few
additional adjustments. For example, I have seen benchmark
differences of 50-150% running TPC-C benchmarks for
both Oracle and SQL Server by adjusting but one simple file system
parameter. Yet that parameter is not part of either database's
install or configuration recommendations.
Now you might argue that you can skip this step
since it will be an apples-to-apples comparison because the
machine setup will be the same across tests. True, but why
potentially wait three times as long for worse results? Since a 300
GB TPC-H test can take days just to load, efficiency is often
critical in order to meet your deadlines.
4. I can use the default database setup/configuration right out
of the box.
Wrong. While some databases like SQL Server
might be universally useful as configured out of the box, other
databases like Oracle are not. For example, the default number of
concurrent sessions for Oracle is 50. So if you try to run a TPC-C
test with more than 50 users, you are already in trouble.
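A quick pre-flight check along these lines can catch the problem
before the run starts. This is only a sketch: the function name
sessions_shortfall is hypothetical, and the limit of 50 is simply the
default figure quoted above, so check the actual value on your own
instance.

```python
# Default session limit as quoted in the text above; an assumption to be
# verified against the actual database instance before relying on it.
DEFAULT_SESSION_LIMIT = 50


def sessions_shortfall(planned_users: int,
                       session_limit: int = DEFAULT_SESSION_LIMIT) -> int:
    """Number of extra sessions needed; 0 means the limit is sufficient."""
    return max(0, planned_users - session_limit)


# 100 planned users against a limit of 50 leaves a shortfall of 50.
print(sessions_shortfall(100))
# 40 planned users fit within the limit.
print(sessions_shortfall(40))
```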
Likewise, the nature of the benchmark once again
dictates how numerous database configuration parameters should be
set. A TPC-C test on Oracle would benefit from init.ora parameter
settings of: