This whole
process of tuning on this Sun 12k started about 3 years ago when we got the
thing. Solaris 9 was not ready for use, and all of the apps (MANU and
Oracle Financials) slowed down about 20% when we switched to this hardware,
even though it is touted as "high end".
It took some fancy footwork once the kernel params were set: binding
database processes to specific processors. That got the same job to 4:07,
a 30 minute improvement, which is 11%.
In my tests on Saturday, I was able to get yet another 8% from carefully
binding processes that were database intensive to the database processors.
In the database, log file sync is still among the top waits (as I'd expect
from the high write activity), and write IO tops out at around 12,000
requests/second, with the DMX processing them at an average of 500
microseconds per write.
I think we need to look at binding loads to processors, load sequencing,
and parallelizing the process to get the rest of the performance gain we
need for the third plan run for inter-region demand deployment.
To make a
long story short, if you look at the Sun architecture guide, there is a
third interconnect backplane, which has about 40% more latency than access
within a processor board. Hence the slowdown in the apps: Solaris 8 treats
all memory as monolithic, not NUMA as the box is, so shared memory is
splattered across all of the memory.
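As a rough sketch of why splattered shared memory hurts, the average access
cost can be modeled from the 40% figure above. Only that 40% penalty comes
from the text; the board counts, the even spread of memory across boards,
and the function name are assumptions for illustration, not measurements
from this machine.

```python
# Back-of-envelope NUMA model. Only the 40% remote penalty comes from the
# text; board counts and uniform memory placement are assumptions.

def relative_access_cost(boards: int, remote_penalty: float = 0.40) -> float:
    """Average memory access cost relative to an all-local access (1.0),
    assuming shared memory is spread evenly over `boards` processor boards."""
    if boards < 1:
        raise ValueError("boards must be >= 1")
    local_fraction = 1.0 / boards          # share of accesses that stay on-board
    remote_fraction = 1.0 - local_fraction  # share that crosses the backplane
    return local_fraction + remote_fraction * (1.0 + remote_penalty)


if __name__ == "__main__":
    for boards in (1, 2, 4, 8):
        cost = relative_access_cost(boards)
        print(f"{boards} board(s): {cost:.2f}x local latency")
```

This ignores CPU caches, so the real application slowdown would be smaller
than the raw memory penalty. The shape is what matters: the more boards the
shared memory is spread over, the closer every access gets to the full
remote cost, and keeping it on one board keeps the cost at 1.0.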
The reason we
use this piece of hardware is that we can dynamically move processors
between the two domains. (This is a 32 processor, 64GB memory, 2 domain
machine.) The kicker with the inventory plan runs is that they are 75%
database and 25% heavy compute, and the system was skewed toward the
latter, when in fact it should be skewed toward the former.
Anyway, with
the push to get an inter-company demand deployment run in every night,
everyone came in to sell us more. Sun pitched, oh, another 500k of hardware
(new processor boards), and I started to look at the RAM-SAN, the TMS SSD.
My commentary
to everyone was: before we get new hardware, let's get the new OS in place
and tune. That took a long time, as Solaris 9 is a messy upgrade. And it
took several passes to get all the drivers to behave.
So, basically
what I did over the weekend was force the OS to put all the shared mem for
Oracle on one processor board, and bind all of the database processes to
that board. Last night's run saved us 1 hour 10 minutes on a 4 hour 8
minute run. That's about 28%; they asked for 30% for the additional
deployment run, and I was also able to work with the order-to-cash system
folks to get us the data about 30 minutes earlier.
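Working the savings through as arithmetic, here is a small sketch using
only the run times quoted above (a 4 hour 8 minute baseline, 1 hour 10
minutes saved, a 30% target). The function and variable names are made up
for the example.

```python
# Check the run-time savings against the 30% target. All inputs are the
# figures quoted in the text; names are illustrative only.

def percent_saved(baseline_min: float, saved_min: float) -> float:
    """Savings as a percentage of the baseline run time."""
    return 100.0 * saved_min / baseline_min


baseline_min = 4 * 60 + 8    # 4 hour 8 minute run
saved_min = 1 * 60 + 10      # 1 hour 10 minutes saved
target_fraction = 0.30       # what was asked for

pct = percent_saved(baseline_min, saved_min)
needed_min = target_fraction * baseline_min
shortfall_min = needed_min - saved_min

if __name__ == "__main__":
    print(f"saved {pct:.1f}% of the baseline run")
    print(f"30% target needs {needed_min:.1f} min saved; "
          f"short by {shortfall_min:.1f} min")
```

On run time alone the gap to the target is only a few minutes, which is why
getting the input data about 30 minutes earlier matters for the nightly
window.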
If we need
another quantum leap in performance, then I would go to the 1.9GHz dual core
chips on a smaller machine (for the db) with monolithic memory, and go RAM
SAN. As it stands now, the EMC SAN is keeping up with the IO requests:
500 u-sec per write supports up to 20k writes/sec, and the RAM SAN is
200 u-sec per write, which is up to 50k writes/sec. I know we would do
better with a RAM SAN; I just don't think by enough to get the price tag
through the mill.
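The latency and throughput figures above hang together under Little's law
(throughput = writes in flight / latency). A minimal sketch follows. The 10
concurrent writes is an assumption inferred from the quoted numbers (500
u-sec matching 20k writes/sec), not something stated about the actual
configuration, and the function names are illustrative.

```python
# Little's law applied to the write figures in the text.
# ASSUMPTION: 10 writes in flight, inferred from 500 us <-> 20k writes/sec.

US_PER_SEC = 1_000_000


def max_writes_per_sec(latency_us: float, in_flight: int) -> float:
    """Throughput ceiling for a given per-write latency and concurrency."""
    return in_flight * US_PER_SEC / latency_us


def writes_in_flight(rate_per_sec: float, latency_us: float) -> float:
    """Average number of outstanding writes at an observed rate."""
    return rate_per_sec * latency_us / US_PER_SEC


ASSUMED_IN_FLIGHT = 10  # inferred, not measured

if __name__ == "__main__":
    print("EMC ceiling:    ", max_writes_per_sec(500, ASSUMED_IN_FLIGHT))
    print("RAM SAN ceiling:", max_writes_per_sec(200, ASSUMED_IN_FLIGHT))
    print("in flight at 12,000 writes/sec:", writes_in_flight(12_000, 500))
```

Under that assumption, the observed peak of about 12,000 writes/sec at 500
u-sec works out to roughly 6 writes outstanding, below the assumed 10,
which fits with the SAN keeping up.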
As the old
saying goes, bigger is not always better. There is overhead for NUMA
memory, which everyone except IBM seems to favor. The p-series is SM-MIMD,
which flattens out with 32 processors (benchmarked: Sun can do 2x the
number of SKUs in an hour that a p575 can). So either you buy the panel
truck, which goes faster with lighter loads, or the cement truck, which
takes more to get going.
Don, thanks
again for the writings on tuning. They were invaluable in this process.