1. Which operating system yields better benchmark results: Windows or Linux?
I am starting with the most controversial and
difficult issue. I freely admit that I am a UNIX and Linux
bigot, but that is because I have been doing UNIX twice as long as
Windows and like the scripting languages. Yet Windows has matured so
much as a server platform over the past few years that this question
seems fair game and is on everyone's mind. Below in Figure 1 are the
results of the TPC-C benchmark performed on identical hardware using
both 32-bit and 64-bit versions of Windows Server 2003 R2 and
CentOS 4 Update 3 (a free Red Hat Enterprise Linux variant).
Figure 1: Comparison Chart of TPC-C Benchmark
It looks to me like a dead heat. So choose whichever
operating system you are more comfortable with or already have more
system administrators for.
2. How many bits are best, 32 or 64, and can that affect the operating system choice?
64-bit UNIX servers have been around for many
years, but 64-bit Windows has only just become a reality (yes, I
know that Windows NT ran on the DEC Alpha, but that never really
became mainstream). I had been partial to AMD's Athlon 64 and
Opteron processors; that is, until mid-2006, when Intel's second-generation
dual-core CPUs came out and performed so impressively. So
now I go with whichever hardware's current price gives me the
most bang for the buck, with energy consumption and room cooling
included secondarily in the TCO calculations. But do 64 bits
really make a noticeable difference? According to the chart in
Figure 1, they do not. That is because the primary advantage of 64 bits
is increased addressable memory. On the next page in Figure 2 are the
results of the TPC-C benchmark performed once again, but now with
increasing amounts of total system and database-allocated memory.
Figure 2: TPC-C Comparison Chart with Increasing Memory
Once again, there are some very clear results.
If your server has 2 GB or less, there is really no discernible
difference. But as your server's memory increases beyond 2 GB,
the 64-bit advantage comes into play. Even though some databases
like Oracle have 32-bit linkage options to trick the database
into accessing slightly more memory (known as the large memory
model), they only increase it up to a certain point. It is clear that the
extra memory for both the system and the database yields ever-increasing
performance improvements. So for anything over 4
GB, it is a no-brainer: go with 64 bits. However, I must mention
one caveat: sometimes 32-bit Linux has better support for certain
hardware drivers and features such as iSCSI, and for newer database
options (e.g. ASM, OCFS, etc.).
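
To put the addressable-memory point in concrete numbers, here is a quick back-of-the-envelope sketch in Python. It is illustrative only: these are theoretical pointer-width limits, and real 32-bit operating systems reserve part of the address space, so a process typically sees closer to 2-3 GB.

    # Theoretical limits only: a 32-bit pointer addresses at most 2**32
    # bytes, a 64-bit pointer at most 2**64. Real 32-bit operating systems
    # reserve part of this range, so a process usually sees 2-3 GB.
    GIB = 2 ** 30

    for bits in (32, 64):
        limit = 2 ** bits
        print(f"{bits}-bit: {limit / GIB:,.0f} GiB addressable")

    # Output:
    # 32-bit: 4 GiB addressable
    # 64-bit: 17,179,869,184 GiB addressable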
3. Which database benchmarks best: Oracle 10g, SQL Server 2005, or MySQL 5.0?
Here we go with another very controversial
question (which I have limited to just the three databases I get asked
about most often; I am not skipping DB2 UDB, PostgreSQL, or any
other database as an intentional slight). Once again, I will own up
to my prejudices right up front: I have been doing Oracle for 22+
years, so I am an acknowledged Oracle bigot. I should also mention
that the database vendors generally frown upon posting benchmarks,
especially comparative ones. Nonetheless, this one question gets
asked all the time. Thus, below in Figure 3 are the results of the
TPC-C benchmark performed once again, but now for just those three
databases.
Figure 3: Comparison Chart of Three Databases for Benchmarking
Well, I will not risk any vendor's ire here, because
the performance results are again a dead heat. Consequently,
choose whichever database you are more comfortable with or already
have more database administrators for.
Of course, there are also cost differences amongst the vendors,
but since no one ever pays list price, it is hard to give accurate
TPC-C ratings that include those subjectively variable costs. So by
sticking just to the technologies themselves and their relative
benchmark performance, we have yet another tie!
4. How does one determine the maximum concurrent OLTP users a server can sustain?
This is always a tough question to answer, because
people usually want to hear something like "a Dell 1850 can handle N
concurrent users." But even servers in the same family and with the
same amount of memory can vary by number of CPUs, CPU clock speed,
CPU cores, and cache sizes. So it is not easy to compare servers
unless you compare nearly identically configured boxes. Plus, you
also need to compare identical network and disk I/O scenarios.
Assuming you do that, the question then becomes how to read the
benchmark results to accurately decide what the maximum concurrent
user load is for that server. On the following page in Figure 4 are
the results of the TPC-C benchmark performed yet again, but now for
just one server, where we need to determine the inflection point, i.e. the
point where the user load begins to negatively affect the response
time.
Figure 4: TPC-C Benchmark Single Server Chart
If end users require less than a two-second
response time (which seems to be the number most often quoted), then 200
concurrent users is the point where you should probably stop. But
the server in this example could support as many as 250 concurrent
users before the response time reaches the point of an unacceptably
steep increase. Note that in this particular case, that is also
about the same point where the TPS rate begins to flatten or
decrease. It is not always this obvious, because sometimes the two
inflection points do not line up so perfectly. But when in doubt,
always go with the response time for TPC-C or other OLTP-type
transactions.
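
To make that chart-reading step concrete, here is a minimal sketch in Python of how one might locate the cutoff programmatically. The data points are hypothetical stand-ins shaped like the curve in Figure 4, not the actual benchmark numbers.

    # Hypothetical (concurrent users, avg response time in seconds) pairs
    # shaped like Figure 4's curve -- NOT the actual benchmark results.
    samples = [
        (50, 0.4), (100, 0.7), (150, 1.1), (200, 1.8),
        (250, 2.9), (300, 5.5), (350, 11.0),
    ]

    MAX_ACCEPTABLE_SECONDS = 2.0  # the commonly quoted OLTP target

    def max_sustainable_users(points, limit):
        """Highest tested user load whose response time stays within limit."""
        ok = [users for users, resp in points if resp <= limit]
        return max(ok) if ok else 0

    print(max_sustainable_users(samples, MAX_ACCEPTABLE_SECONDS))  # -> 200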
5. How does one determine the maximum size data warehouse a server can sustain?
This is always a tough question to answer,
because people most often want to hear something like "how many Dell
1850s are needed for N terabytes." As before, even servers in the
same family and with the same amount of memory can vary by number of
CPUs, CPU clock speed, CPU cores, and cache sizes. So, once again,
it is not easy to compare servers unless you compare nearly
identically configured boxes. Plus, you also need to compare
identical network and disk I/O scenarios, especially the disk I/O,
because the TPC-H results are governed most by the number of
spindles. But again assuming that you do that, the question is how
to read the results to accurately decide what the maximum-sized data
warehouse is for that server or servers. On the following page in
Figure 5 are the test results of the TPC-H benchmark for several
increasingly powerful Oracle RAC server configurations accessing 300
GB spread across multiple SANs and over 100 disks (with many, many
thanks to Dell and their Linux testing lab for making these results
possible).
Figure 5: TPC-H Benchmark Server Configurations Total Run Time
Figure 6: Benchmark Server Configurations Avg. Response Time
Note that for the TPC-H you are looking at both
the total run time and the average response time, which should be in
step with each other. Do not let the large time values dissuade you;
the TPC-H queries are very complex and often take hours or even
days apiece for large data warehouses. In the above example (Figure
6), the best hardware setup takes about five hours to run, with an
average response time of approximately four hours. However, the
response times of the actual 22 queries are highly skewed by just a
few that take the majority of the time to run. So if your users can
accept potentially four-hour run times for highly complex decision
support queries, that eight-node cluster would suffice. If not, instead
of adding more nodes, purchase more spindles, since it is not
uncommon for terabyte-sized warehouses to have 500-1000 spindles for
optimal results.
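
As a final illustration, here is a small Python sketch of the arithmetic behind reading Figures 5 and 6: total run time, average per-query time, and how skewed the workload is. The per-query minutes are invented stand-ins (the real benchmark also runs concurrent query streams, which this sequential sketch ignores).

    # Invented per-query run times in minutes for the 22 TPC-H queries,
    # chosen so a few long queries dominate -- NOT the actual results.
    query_minutes = [4, 3, 5, 4, 3, 4, 5, 3, 80, 4, 5,
                     3, 70, 4, 4, 5, 3, 74, 4, 4, 5, 4]

    total = sum(query_minutes)              # 300 minutes overall
    top3 = sum(sorted(query_minutes)[-3:])  # the three longest queries

    print(f"total run time: {total / 60:.1f} hours")                # 5.0 hours
    print(f"avg per query:  {total / len(query_minutes):.0f} min")  # 14 min
    print(f"top 3 queries:  {100 * top3 / total:.0f}% of total")    # 75%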
This is an excerpt from Oracle on VMWare: Expert tips for database virtualization by Rampant TechPress.