Optimal number of Oracle RAC nodes?
Oracle Database Tips by Donald Burleson, May 3, 2015
Question: We are using RAC for high availability (HA). Which is better, two nodes or three? It's been suggested that "the more nodes, the better." What is the optimal number of RAC nodes?
Answer: First, remember that RAC only protects against server failure; you must still mirror the disks and have network redundancy for guaranteed 100% availability. In my experience, RAC is not as good as some other options for scalability (i.e., the "scale-up" approach), and the most common use of RAC is for HA, covering server failover.
In deciding on the number of nodes for HA, take each node's probability of failure (derived from its Mean Time Between Failures, or MTBF) and multiply those probabilities together to get the probability of a total cluster outage. It's all about covering the probability of server failure. When used exclusively for HA, a two-node RAC is ideal, especially when the nodes are geographically distributed and connected via super-fast dark fiber networks.
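To make this concrete, here is a back-of-the-envelope sketch in Python. The MTBF and MTTR figures are hypothetical, and the model assumes node failures are independent, which real clusters (shared power, shared storage) often violate:

```python
# Rough availability estimate for an N-node RAC cluster (illustrative only).
# Assumes node failures are independent and that the cluster is "down" only
# when every node is down at once. MTBF/MTTR figures are hypothetical.

MTBF_HOURS = 8760.0   # assumed mean time between failures per node (1 year)
MTTR_HOURS = 4.0      # assumed mean time to repair a failed node

# Steady-state availability of a single node: MTBF / (MTBF + MTTR)
node_availability = MTBF_HOURS / (MTBF_HOURS + MTTR_HOURS)
node_unavailability = 1.0 - node_availability

for nodes in (1, 2, 3):
    # Probability that all nodes are down simultaneously
    cluster_down = node_unavailability ** nodes
    print(f"{nodes} node(s): P(total outage) = {cluster_down:.2e} "
          f"(~{cluster_down * 8760 * 60:.4f} minutes/year)")
```

With these assumed figures, the jump from one node to two cuts expected total-outage time from hours per year to well under a minute; the third node buys far less in absolute terms, which is why two-node clusters dominate pure-HA deployments.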
An Oracle ACE notes that two-node clusters are ideal under some circumstances:
"Well, 3 nodes can be better than two if
your workload is fully scalable, and there is no resource contention among
the three nodes. I would venture to say that the majority of RAC systems are
2 node clusters, and are set up that way as part of a HA system instead of a
workload scaling system."
Oracle Certified Master Steve Karam notes the
considerations when choosing the optimal number of RAC nodes for failover:
"Three nodes are 'better' because if one
crashes, you won't be failing the entirety of your load over to a single
solitary server. In a three node RAC, it allows two nodes to fail before the
situation is 'critical.'
Uptime is a function of many unknown variables (power to the server room, stable memory consumption, hardware stability, etc.) over time. One server = a boundary condition that limits your chances of uptime by introducing a constant: one unknown variable being negative at any time = total breakdown.
More servers = wider boundaries = less
emphasis on the unknown variables that can ultimately cause downtime. Now
your uptime is a function of many unknown variables per server over time.
More servers = less chance of total downtime.
At the same time, saying that more nodes is always better is false. More nodes are better until you get close to the limits imposed by your back-end cluster interconnect. Thus the boundary condition concept runs into a new boundary condition: the number of servers you can have before your performance tanks, which is a different equation entirely.
Interestingly enough, you could use the boundary condition argument to make the case for SSD (in my hastily assembled opinion). If performance is based upon the ease with which data can travel from the database to the end user, the unknown variables are the amount of data (block gets), the location of data (RAM or disk), concurrency of access (latches, locks), and other obstacles (misc. waits).
The most common tuning practice is to reduce the number of block gets, thereby limiting the first unknown. By using SSD we can moot the second unknown, and always ensure our data comes from RAM. RAC will add a new variable to the 'location' unknown (RAM, remote RAM, disk), but it limits the third unknown (concurrency), since the load will be spread across multiple machines.
It can also limit other obstacles (waits) if the waits are node dependent (for instance, network waits would not be limited or removed by going RAC, but PX waits could). By this logic, you could say the best system imaginable would be a RAC cluster powered by SSD with a tiny buffer cache on each node, running well-tuned SQL.
• Well-tuned SQL limits the ill effects of the 'block gets' unknown
• SSD with a tiny buffer cache negates the location unknown, by nulling disk and limiting remote RAM, leaving SSD as the main location of data. This also reduces the ill effects of the cluster interconnect bandwidth/latency boundary condition
• Multiple nodes limit the effects of the concurrency unknown (though the databases = 1 boundary means hot blocks are still possible)
Thus your tuning knobs become the simple concepts of the number of nodes you
have (based on concurrency) and the number of block gets you request (based
on access paths/data requirements).
Simply put, I would say that our bottlenecks (single points of contention or
failure) are math's boundary conditions. They are the limits within which we
must operate. Except that as DBAs, we get to do something mathematicians,
physicists, etc. don't get to do: change the boundaries."
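Karam's two boundary conditions can be sketched numerically. The following Python toy model (every constant here is made up purely for illustration) shows outage probability shrinking with each added node while interconnect overhead erodes the throughput gain:

```python
# Illustrative model of the two boundary conditions: each added node
# reduces the chance of total outage, but past some point the interconnect
# erodes per-node throughput. All constants are hypothetical.

NODE_UNAVAILABILITY = 0.0005   # assumed per-node probability of being down
PER_NODE_THROUGHPUT = 100.0    # assumed work units per node, contention-free
INTERCONNECT_PENALTY = 0.07    # assumed fractional loss per additional node

for nodes in range(1, 9):
    # HA boundary: total outage requires every node to be down at once
    outage_probability = NODE_UNAVAILABILITY ** nodes
    # Performance boundary: cache-fusion traffic grows with node count,
    # so effective per-node efficiency decays as nodes are added
    efficiency = max(0.0, 1.0 - INTERCONNECT_PENALTY * (nodes - 1))
    throughput = nodes * PER_NODE_THROUGHPUT * efficiency
    print(f"{nodes} nodes: P(outage)={outage_probability:.1e}, "
          f"throughput={throughput:6.1f}")
```

Under these assumptions the outage probability improves by orders of magnitude with each node, while the throughput curve flattens and eventually turns over: the two "knobs" pull in different directions, which is exactly the trade-off described above.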
Market Survey of SSD Vendors for Oracle
There are many vendors who offer rack-mount solid-state disks that work with Oracle databases, and the competitive market ensures that product offerings will continuously improve while prices fall.
SearchStorage notes that SSD will soon replace platter disks and that hundreds of SSD vendors may enter the market:
"The number of vendors in this category could rise to several hundred in the next 3 years as enterprise users become more familiar with the benefits of this type of storage."
As of January 2015, many of the major hardware vendors (including Sun and
EMC) are replacing slow disks with RAM-based disks, and
Sun announced that all
of their large servers will offer SSD.
2008 Rack-Mount SSD Performance Statistics
SearchStorage has done a comprehensive survey of rack-mount SSD vendors; the fastest rack-mount SSD devices are shown below:
| Manufacturer | Model | Technology | Interface | Performance metrics and notes |
|---|---|---|---|---|
| IBM | RamSan-400 | RAM SSD | Fibre Channel, InfiniBand | 3,000 MB/s random sustained external throughput; 400,000 random IOPS |
| Violin Memory | Violin 1010 | RAM SSD | PCIe | 1,400 MB/s read, 1,000 MB/s write with x4 PCIe; 3 microseconds latency |
| Solid Access Technologies | USSD 200FC | RAM SSD | Fibre Channel, SAS, SCSI | 391 MB/s random sustained read or write per port (719 MB/s full duplex); approx. 2,000 MB/s aggregate with 8 x 4 Gbps FC ports; 320,000 IOPS |
| Curtis | HyperXCLR R1000 | RAM SSD | Fibre Channel | 197 MB/s sustained R/W transfer rate; 35,000 IOPS |
Choosing the right SSD for Oracle
When evaluating SSD for Oracle databases, you need to consider performance (throughput and response time), reliability (Mean Time Between Failures), and TCO (total cost of ownership). Most SSD vendors will provide a test RAM disk array for benchmark testing so that you can choose the vendor who offers the best price/performance ratio.
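As a rough illustration of the kind of measurement involved, here is a minimal Python probe that times random 8K reads against a test file on the candidate device. The file path is hypothetical, and because the reads go through the OS page cache this only approximates what a dedicated I/O benchmark tool (run with direct I/O) would report:

```python
# Crude random-read latency probe for comparing storage devices.
# A sketch only: a real evaluation would use a dedicated I/O benchmark
# with direct I/O to bypass the OS page cache.
import os
import random
import time

TEST_FILE = "/mnt/ssd_eval/testfile"  # hypothetical path on the device under test
BLOCK_SIZE = 8192                     # matches a typical Oracle block size
SAMPLES = 10000

fd = os.open(TEST_FILE, os.O_RDONLY)
file_size = os.fstat(fd).st_size      # test file must be larger than one block
offsets = [random.randrange(0, file_size - BLOCK_SIZE) for _ in range(SAMPLES)]

start = time.perf_counter()
for off in offsets:
    os.pread(fd, BLOCK_SIZE, off)     # one random 8K read at the given offset
elapsed = time.perf_counter() - start
os.close(fd)

print(f"avg random read: {elapsed / SAMPLES * 1000:.3f} ms over {SAMPLES} reads")
```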
Burleson Consulting does not partner with any SSD vendors, and we provide independent advice in this constantly changing market. BC was one of the earliest adopters of SSD for Oracle; we have been deploying SSD on Oracle databases since 2005, and we have experienced SSD experts to help any Oracle shop evaluate whether SSD is right for its application. BC experts can also help you choose the SSD that is best for your database. Just call 800-766-1884 or e-mail for SSD support details.
DRAM SSD vs. Flash SSD
With all the talk about the Oracle "flash cache," it is important to note that there are two types of SSD, and only DRAM SSD is suitable for Oracle database storage. Flash-type SSD suffers from a serious shortcoming, namely a degradation of access speed over time. At first, flash SSD is five times faster than a platter disk, but after some usage the average write time becomes far slower than a hard drive. For Oracle, only rack-mounted DRAM SSD is acceptable for good performance:
| Storage type | Avg. read speed | Avg. write speed |
|---|---|---|
| Platter disk | 10.0 ms | 7.0 ms |
| DRAM SSD | 0.4 ms | 0.4 ms |
| Flash SSD | 1.7 ms | 94.5 ms |
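To see what these latencies mean in practice, here is a small Python calculation applying the table's figures to a hypothetical workload (the read and write counts are made up for illustration):

```python
# Estimated time spent on physical I/O for a hypothetical workload of
# 1 million 8K reads and 200,000 writes, using the latency table above.
AVG_LATENCY_MS = {                 # (read ms, write ms) from the table
    "Platter disk": (10.0, 7.0),
    "DRAM SSD":     (0.4, 0.4),
    "Flash SSD":    (1.7, 94.5),
}
READS, WRITES = 1_000_000, 200_000  # assumed workload, for illustration only

for media, (read_ms, write_ms) in AVG_LATENCY_MS.items():
    total_sec = (READS * read_ms + WRITES * write_ms) / 1000.0
    print(f"{media:12s}: {total_sec / 60:8.1f} minutes of I/O wait")
```

Under these assumptions, the flash SSD's 94.5 ms writes make it slower overall than the platter disk for a mixed workload, while DRAM SSD cuts I/O wait by more than an order of magnitude, which is the point of the table.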
If you like Oracle tuning, you might enjoy my book "Oracle Tuning: The Definitive Reference", with 950 pages of tuning tips and scripts. You can buy it direct from the publisher for 30% off and get instant access to the code depot of Oracle tuning scripts.