While most Oracle professionals are familiar with the
concept of Oracle's parallel query facility, there are
other popular techniques used by e-commerce systems to
improve the throughput and availability of the Oracle8i
database.
In today's Web technology, most Oracle databases use a
three-tiered architecture:
- The client tier is a Java-enabled Web browser.
- The middle tier is a Web server with Oracle
WebServer or Apache.
- The database tier is the Oracle8i database
server.
Web-based systems have popularized this move away from
the traditional client-server model, and Oracle has been
having great success with its new WebDB product. WebDB is
an easy-to-use Web page development tool with a native API
for making calls to the Oracle database. Once the Web
screen is designed and deployed, Oracle's Web server will
manage all of the database interaction.
Within this new Oracle architecture, it is the
responsibility of the Web server to maximize the
throughout of the calls to the Oracle database. While many
shops use the Oracle WebServer product for the middle
tier, other shops build their own Oracle Web servers.
Customized Web servers can be created using the Apache
modules and customized C++ and Java extensions to make
direct calls to the back-end database. If the Web server
supports extensions in Java or C++, the POSIX library can
be used to implement multithreading.
Since the Web server is the direct interface to the Oracle
database, systems that support tens of thousand of users
over the Web must be able to maximize the throughput of
requests to the database engine. Web users demand almost
instant access speeds, and there is no time to wait for
multiple Oracle queries to complete before delivering the
results.
The most common approach for maximizing throughput is to
deliberately design the Oracle database so that multiple
SQL statements are used to gather the required
information. The Web server programs then use
multithreading code to simultaneously make the requests to
Oracle. Let's take a look at how these requests are made
by examining several approaches to parallelism.
Approaches to
parallelism
E-commerce Oracle databases use four approaches to
parallelism:
- Oracle parallel query
- Oracle Parallel Server
- Parallel DBA operations
- Manual parallelization of Oracle queries
Each has very different features and benefits, and we
will examine how each is used within Oracle. In this
article, we'll examine Oracle parallel query and Oracle
Parallel Server. I'll follow up with a discussion of
parallel DBA operations and parallelized queries tomorrow.
A little
background
As Oracle technology matured, traditional UNIX hardware
vendors began to move away from single-CPU processors and
into multiprocessing environments. The new processor
architectures now include processors that contain up to a
dozen CPUs with symmetric multiprocessing (SMP). These
architectures scale up to processors that contain
thousands of CPUs with massively parallel processors
(MPP). To maximize the power of these new platforms,
Oracle Corporation needed to take full advantage of
multiple CPUs and allow Oracle to use parallel features.
The first challenge was to support SMP architectures—CPU
configurations that consisted of a small number of loosely
coupled CPUs. Hardware vendors such as Sun (Solaris), HP
(HP/UX), and IBM (AIX) rushed to build SMP machines with
anywhere from four to 64 processors inside each box. To
exploit the parallel capabilities of these new processors,
Oracle added the parallel query feature starting with
Release 7.3 of the Oracle RDBMS.
Oracle continued to add parallel features as it evolved,
culminating in the "parallel everything" components of
Oracle8i.
Oracle parallel query
Oracle's parallel query facility is an automatic
internal tool within Oracle that is used to make
full-table scans run faster. Many neophyte Oracle
administrators turn on parallel query for a table
and are surprised to see no performance
improvement. This is because their system is not
doing many full-table scans. The important point
here is that parallel query works best when you
are reading all rows in an Oracle table. Remember
that queries that use an index or retrieve only a
small part of a table will not improve using
parallel query.
Parallel query is generally used in Oracle data
warehouse queries that require access to an entire
table, as shown in Figure A. In an Oracle
data warehouse, it is common to have queries that
need to compute the average or sum of a value, so
Oracle parallel query is quite useful in these
cases.
Figure A |
 |
An example of an Oracle
parallel query |
Oracle
Parallel Server (later RAC)
Oracle Parallel Server (OPS) is a special version
of the Oracle database that allows a single
database to have multiple Oracle instances.
Note
An Oracle instance is
the memory region that contains the control
structures and buffers.
With OPS, a massively parallel server can run
dozens of Oracle instances, each reading from a
single database. OPS is used for highly
partitioned databases, where each segment of the
database can be mapped into a specific instance.
The Integrated Distributed Lock Manager (IDLM) is
a very important component of any OPS database and
one that merits special attention. The IDLM is a
software component that links the Oracle instances
as if they shared a single memory region. Unlike a
single Oracle instance, a cluster of OPS instances
must be able to manage locking at the global
level. In other words, the IDLM helps Oracle
properly manage all of the data blocks as they are
read into all of the Oracle instances. This
creates the illusion of a single Oracle database,
and the IDLM will transfer data blocks between the
buffer caches of each instance.
The term pinging is used to describe the
process of the IDLM transferring data blocks
between Oracle instances. For example, suppose
that instance #1 has requested block 123 to update
a row. Unfortunately, block 123 is resident in the
buffer of instance #6. Rather than read the data
block from disk, the IDLM will ping block 123 from
instance #6 and make it available to instance #1.
Another common term in OPS is false ping. A
false ping happens when blocks covered by the same
hashed IDLM lock are accessed by different
instances.
In short, the idea behind the configuration of OPS
is to minimize pinging. For your database, you
should work with your database administrator to
minimize pinging by attaching specific Web servers
to specific database instances, as shown in
Figure B.
Figure B |
 |
Configuring Oracle Web
servers to minimize OPS pinging |
The idea is to place related activities on their
own Web servers and then map these Web servers to
several Oracle instances. In this fashion,
processes that require customer information use
one Oracle instance, while other processes that
require order information will use another Oracle
instance. Since data sharing between instances is
minimal, pinging is reduced.