Lost revenue
and customer impatience with unplanned downtime can
force organizations to look at building a continuously
available architecture. However, these systems are
very expensive and require a highly trained staff to
keep them up and running. This article explains some
specific costs of implementing a highly available
system and some of the costs of not having one.
Continuous availability costs
Costs for highly available systems generally fall into
two camps: hardware and IT talent. On the hardware
side, massively parallel, replicated machines are
costly (at least five times the cost of a nonparallel
server). On the talent side, so is the army of database
administrators and architects needed to maintain the
hardware and the database management system (DBMS).
Although the software components are readily
available, the human effort for the installation,
setup, and testing of continuous availability systems
can be very expensive and time-consuming.
When using a massively parallel processing (MPP)
server, many Oracle users set up Oracle9i Real
Application Clusters (RAC) and Transparent
Application Failover (TAF). Introduced in Oracle9i,
RAC takes advantage of an improved Cache Fusion
architecture and reduced I/O demands to increase
scalability. (In future articles, I will drill down on
the drawbacks and benefits of RAC and TAF as compared
to Oracle Parallel Server.) Installation,
configuration, and testing of these products can run
into hundreds of hours. Although staffing needs can be
reduced after installation is completed, keeping top
database talent and expensive equipment running can be
a challenge in lean economic times.
As shown in Figure A, RAC with TAF can provide
rapid application failover, but at the price of
extra human hours for configuration, setup, and
monitoring.
Figure A: RAC with TAF is fast but expensive.
Next, let's review some industry metrics to help
development managers decide if the investment in a
highly available system is worth it.
Justifying the cost
Obviously, the monetary cost of implementing a highly
available architecture is high. However, if you look
at the big picture, it's possible to justify even a
significant expense.
As shown in Figure B, the cost of unplanned
downtime can be significant for all business segments.
For example, credit card companies could lose hundreds
of thousands of dollars per minute.
Figure B: Financial impact of downtime
With respect to e-commerce engines and Web sites,
downtime is not just measured in lost revenue and
worker productivity but also in lost customer
goodwill. The intangible cost can easily translate
into millions of dollars as frustrated customers quit
visiting.
Manufacturing systems are also subject to very high
costs, but these are direct costs: sales lost when the
manufacturing process is interrupted and wages paid to
factory workers who are unable to do their jobs.
In practice, companies must take a look at the cost of
downtime relative to investment requirements of a
highly available system.
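To make that comparison concrete, here is a minimal sketch of the arithmetic in Python. Every figure in it (outage minutes, loss per minute, hardware price, setup hours, hourly rate, staffing cost, and the five-year write-off period) is a hypothetical placeholder rather than an industry number, and the function names are made up for illustration. Substitute your own estimates.

```python
def annual_downtime_cost(outage_minutes_per_year, loss_per_minute):
    """Expected yearly loss from unplanned downtime, in dollars."""
    return outage_minutes_per_year * loss_per_minute


def annual_ha_cost(hardware_cost, setup_hours, hourly_rate,
                   yearly_staff_cost, years):
    """Yearly cost of the HA system, spreading one-time costs over `years`."""
    one_time = hardware_cost + setup_hours * hourly_rate
    return one_time / years + yearly_staff_cost


# All figures below are hypothetical placeholders, not industry data.
# Loss without the highly available system (assumed 8 hours of outage a year).
baseline_loss = annual_downtime_cost(outage_minutes_per_year=480,
                                     loss_per_minute=2_000)
# Loss that remains even with the system in place (assumed 30 minutes a year).
residual_loss = annual_downtime_cost(outage_minutes_per_year=30,
                                     loss_per_minute=2_000)
ha_cost = annual_ha_cost(
    hardware_cost=500_000,      # assumed price of the clustered servers
    setup_hours=400,            # assumed installation, setup, and testing effort
    hourly_rate=150,            # assumed cost of database talent per hour
    yearly_staff_cost=300_000,  # assumed ongoing staffing cost
    years=5,                    # assumed write-off period for one-time costs
)

avoided_loss = baseline_loss - residual_loss
net_benefit = avoided_loss - ha_cost

print(f"Avoided downtime loss per year: ${avoided_loss:,.0f}")
print(f"HA system cost per year:        ${ha_cost:,.0f}")
print(f"Net benefit per year:           ${net_benefit:,.0f}")
```

With these made-up inputs the investment pays for itself, but the result is only as good as the estimates behind it. Intangible costs such as lost customer goodwill are not modeled here and would need to be folded into the per-minute loss.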
Probably worth the expense
Sure, it costs money to have a highly available
system: hardware, software, and people.
However, when you consider the price you may have to
pay as a result of downtime (and these may not all be
direct costs), you just may find it is worth every
penny.