The perils of server deconsolidation
Oracle Database Tips by Donald Burleson, November 18, 2015
It's back to the future for the Oracle database world. The wild world of
1990's client-server technology is long gone, and Oracle shops
are re-consolidating their data resources. Back in the
1980's the mainframe was the rule, and now that the
one-server-one-application paradigm has proven too expensive
(largely in increased staff costs), Oracle shops are moving back to the
centralized architectures of their ancestors.
The 2nd Age of Mainframe Computing
The early 21st century is seeing
the
2nd age of mainframe computing, a change away from the
minicomputer hardware architectures of past decades.
Instead of small, independent servers, the major hardware
vendors are pushing large servers with transparent sharing of
hardware resources:
- HP - The HP Integrity Superdome server scales up to 64 processors.
- Sun - The largest Sun Fire server has 72 processors (144 threads) running at 1.8 GHz, and the new "Galaxy" servers will be even more powerful.
- IBM - The IBM System x3950 allows processors to be added (up to 32 CPU's), letting Oracle shops scale up within the same server.
- UNISYS - The UNISYS ES-7000 series offers a 32-CPU server capable of running dozens of large Oracle applications.
There are also compelling performance benefits to
server consolidation.
Consider an example of a server consolidation where a set of three IBM P590
Regatta class servers (each with 18 CPU's and 128 gig of RAM) replaced
several hundred minicomputers.
Using IBM's virtualization tools, Oracle instances can be partitioned into
logical partitions (LPAR's), while critical tasks retain the ability to use
all CPU resources for super-fast Oracle parallel query.
Each Oracle instance on these super-fast monolithic servers can handle
tremendous data volumes, sustaining over 1,000 physical disk writes per
second and over 50,000 logical reads per second:
Load Profile                        Per Second     Per Transaction
~~~~~~~~~~~~                   ---------------     ---------------
          Redo size:                139,878.97              371.50
      Logical reads:                 50,364.16              133.76
      Block changes:                    907.73                2.41
     Physical reads:                  6,034.92               16.03
    Physical writes:                  1,070.95                2.84
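If you want to watch these rates outside of a STATSPACK or AWR report, you can sample the underlying counters yourself. Here is a minimal sketch (assuming you have SELECT access to v$sysstat); the per-second rates come from taking two samples and dividing the delta by the elapsed seconds:

   -- Sketch: sample the load-profile counters directly from v$sysstat
   select name, value
   from   v$sysstat
   where  name in ('redo size',
                   'session logical reads',
                   'physical reads',
                   'physical writes');

   -- Take two samples a known number of seconds apart and divide the
   -- change in each counter by the elapsed seconds to get the
   -- per-second figures shown above.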
The new IBM P690 servers are even
more powerful, supporting 128 CPU's and over a terabyte of RAM,
with all of the processing needs of a midsized corporation
nestled within a single server where the resources can be
transparently shared.
What Oracle hardware architecture
is best?
At Oracle OpenWorld 2015, Oracle officially embraced the 2nd age of mainframe
computing and the rise of the virtual machine via its commitment to VMware
for Oracle.
Andrew Holdsworth (Senior Director of the Oracle Real World Performance
Group) gave a presentation titled Current Trends in Database Tuning
(user cboracle, password=oraclec6).
Andrew notes that 99% of OLTP applications have multiple instances per host
machine, and that it's easy to keep an instance from hogging the CPU by using
cpu_count as a processor fence (set cpu_count=2 on a 32-CPU machine and the
instance will only use two processors), and by using the
resource_manager_cpu_allocation parameter to control processing resources.
At the OS level, you can also control resources with CPU affinity features,
and you can adjust the priority of individual processes with the Linux/UNIX
"nice" command, which changes the dispatching priority for server tasks.
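To illustrate the instance-fencing parameters that Andrew describes, here is a minimal sketch. The parameter names are real, but the values (two CPUs on a 32-CPU host) are only an example, and I set them in the spfile because not every release lets you change them on the fly:

   -- Sketch: fence this instance to 2 of the host's 32 CPUs
   alter system set cpu_count = 2 scope=spfile;
   alter system set resource_manager_cpu_allocation = 2 scope=spfile;
   -- bounce the instance so the new settings take effect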
Today we see several groups of Oracle professionals who advocate different
approaches to the Oracle hardware architecture decision.
Each approach has valid costs and benefits:
- One server, one application - The one-server, one-application advocates note that this architecture has no single point of failure, and a spike in one application will not adversely affect other systems. As a tradeoff for this guaranteed isolation, the one-server, one-application approach has obviously higher costs, because additional administrative and DBA time is required for the redundant management of software patches. Even more significant are the extra cost of over-allocating RAM and CPU to accommodate spikes in usage and the loss of parallel operations, which require an SMP server with multiple CPU's.
- Grid computing - Advocates of Grid computing have racks of re-usable "blades", tiny computers (4 CPU's, 4 gig of RAM) which can be strung together to form a scalable grid. This architecture is the hallmark of the Oracle 11g Grid database. The problem with Grid servers is that they have redundant copies of the OS and RDBMS software to manage, and the software must be pre-installed before a blade can be genned into Oracle. See Oracle's Ellison speaks on grid architectures for details.
- Server Consolidation - Old-timers (like me!) remember the benefits of being able to manage all of your applications within a central location, and we advocate a return to the good ole days of the mainframe, back before cheap minicomputers heralded the debacle that was called "the age of client-server computing".
Each hardware approach also attracts some common misconceptions:
- Fallacy: Single point of failure - Properly configured, none of these architectures suffers from a single point of failure. Today's hardware has fully-redundant everything, and with geographical data replication (Streams, Dark Fiber RAC), hardware errors are becoming quite rare.
- Fallacy: Rogue applications can "hog" a centralized computer - The segmentation of large computers into smaller, fully isolated slices has been done successfully for decades, and the system administrator has the ability to "fence" RAM and dedicate CPU's to specific applications (using CPU affinity, the "nice" command, and setting cpu_count and resource_manager_cpu_allocation; see the Resource Manager sketch after this list). VMware provides a similar capability for mixed-OS environments.
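As a concrete illustration of "fencing" inside the database, here is a minimal Resource Manager sketch. The plan name, consumer group, and percentages are hypothetical; only the DBMS_RESOURCE_MANAGER calls themselves are the standard Oracle API:

   -- Sketch: cap a hypothetical "rogue" batch application at 25% of CPU
   begin
      dbms_resource_manager.create_pending_area;
      dbms_resource_manager.create_consumer_group('batch_app', 'low-priority batch work');
      dbms_resource_manager.create_plan('fence_plan', 'CPU fence for batch work');
      dbms_resource_manager.create_plan_directive(
         plan             => 'fence_plan',
         group_or_subplan => 'batch_app',
         comment          => 'batch gets at most 25% of CPU',
         cpu_p1           => 25);
      dbms_resource_manager.create_plan_directive(
         plan             => 'fence_plan',
         group_or_subplan => 'OTHER_GROUPS',
         comment          => 'everything else shares the rest',
         cpu_p1           => 75);
      dbms_resource_manager.validate_pending_area;
      dbms_resource_manager.submit_pending_area;
   end;
   /

   -- Activate the plan (sessions must still be mapped to batch_app,
   -- e.g. with dbms_resource_manager.set_consumer_group_mapping)
   alter system set resource_manager_plan = 'fence_plan';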
The Oracle hardware architecture that you choose is all about minimizing cost
and risk. Management favors an Oracle database hardware architecture that
requires minimum human intervention, one where the hardware does the load
balancing. Oracle estimates that human error is the predominant cause of
unplanned outages, a compelling reason to adopt hardware architectures that
minimize redundant copies of the software, and the associated DBA overhead.

A good hardware architecture considers human factors
Also, the redundant software issue with grid
computing has been minimized with the 11g "instant provisioning" feature.
Running the Oracle installer over-and-over is tedious and time-consuming, and
Oracle has solved this issue by allowing for instant provisioning of new
servers.

A brief history of database
hardware architectures
In the beginning there was the
mainframe, a huge monolithic computer that could support all of
the data processing needs of large corporations. These
massive servers cost millions of dollars, and the hardware was
very expensive, relative to the system management costs.

Is the mainframe
coming back?
A team of five DBA's could manage
an entire corporation. However, this was not to last.
As the age of client server computing arrived, Oracle DBA staff
quadrupled to accommodate the increased management overhead.
The truth behind client-server computing
Today we recognize that the movement towards client server
computing was an economic decision. Management was enticed by the low
costs, and they made a conscious decision to abandon the proven single-server
approach in favor of a distributed network of smaller computers.
In the 1990's we saw a huge movement away from the glass
house towards the new "minicomputers", small UNIX servers that cost a tiny
fraction of the mainframes.
Data processing managers began doubling the size of their
machine rooms, adding acres of real-estate to accommodate hundreds of these
itty-bitty computers. With the one-server, one-application approach,
sys-admin and DBA staff multiplied, as staff struggled to maintain hundreds of
glorified PC's.
I remember when my shop got their first HP minicomputer for
$30k, an unbelievable bargain in a world of four million dollar mainframes.
(I also remember that the Oracle license cost was nearly as much as the hardware
costs!).
Oracle Server Consolidation
It's no surprise that Oracle server consolidation
is being recommended by the major hardware vendors (HP, UNISYS, Sun), and the
market for their 32-CPU and 64-CPU mega servers has been good. Best of
all, server consolidation allows for better utilization of processing resources,
and Oracle parallel query can be employed to a much larger degree. See my
Intel UNISYS webcast on Oracle server consolidation for more details.
- Better reliability - Mainframe-like processors can be configured to avoid the single-point-of-failure problem. Vendors note that fault-tolerant components can be used to make a single server as reliable as a distributed scheme. They also note that Oracle's own failover tools (Data Guard and Oracle Streams) can make systems resilient to hardware failure.
- Better on-demand CPU and RAM allocation - Critics of the distributed model say that the mainframe-like UNIX behemoths provide internal resource allocation that is more efficient than spreading work across many small servers.
- Better scalability - Critics note that the server blade approach to grid computing allocates independent units of two or four processors, making Oracle Parallel Query less efficient. A 32-CPU monolith will be able to process a large Oracle table far faster than a grid of small independent processors.
Server consolidation is bad for the DBA job market because one of the main
reasons for consolidating hardware resources is the savings from reducing
Oracle DBA staff. A typical shop can save a million dollars a year by
removing a dozen DBA's.
Scale up, scale out
There are two main approaches to
scalability in Oracle. The first is the "scale up"
approach.
The
Oracle documentation notes the scale-up capabilities of the
Oracle Grid approach:
If tasks can run independently of one another, then Real
Application Clusters can distribute them to different nodes. This permits
you to achieve scale up. In essence, scale up is more processes run through
the database in the same amount of time. The number of nodes used, however,
depends upon the purpose of the system.
If processes can run faster, then the system accomplishes more work. The
parallel execution feature, for example, permits scale up.
With parallel execution, a system might maintain the same
response time if the data queried increased tenfold, or if more users could
be served. Real Application Clusters without the parallel execution feature
also provides scale up, but does this by running the same query sequentially
on different nodes.
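To make the scale-up point concrete, here is a minimal sketch of the kind of parallel query that favors a large SMP server. The table name, column, and degree of parallelism are hypothetical:

   -- Sketch: spread a full scan of a (hypothetical) sales table
   -- across 16 of the server's CPUs with parallel query
   select /*+ full(s) parallel(s, 16) */
          count(*)
   from   sales s
   where  sale_date >= date '2015-01-01';

On a 32-CPU monolith all 16 parallel query slaves run inside one shared-memory image, whereas a grid of 4-CPU blades must ship intermediate results across the interconnect.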