 |
|
Oracle Grid Tuning
Oracle Tips by Burleson Consulting |
Blade Servers and Oracle Grid Tuning
Blade servers are often advertised hand-in-hand with Oracle Grid
computing. It is critical to understand that blade servers are good
for programs that do not require the symmetric multiprocessing (SMP)
capabilities of large mid-range servers. For example, a blade server
would not be appropriate for a Grid
node that performs parallel query operations. That is because blade
servers are normally small, one to four CPU servers, and Oracle
parallel query works best when there are 32 or 64 CPUs for fast
full-table scans on very large tables.
On the other hand, blade servers and Grid
may be appropriate for small scale OLTP applications because the
nature of individual queries does not require multiple CPU
resources.
Blade Servers and Oracle App Servers
Blade servers are also an option for Oracle Application Server 10g
web cache servers or Oracle HTTP servers, because a new server can
easily be added into the Oracle Application Server 10g server farm.
In Oracle Application Server 10g, a rack of blade servers can be
used and Oracle Web Server and Oracle HTTP server (OHS) software can
be pre-installed. At runtime, the Oracle Application Server 10g
administrator can add these server blades to their Oracle
Application Server 10g farm, using each blade as either a Web cache
server or an HTTP server, depending upon the stress on the system.
At this point, it should be very clear that there are a myriad of
options for Oracle configurations. Given the complexity of cluster
and Grid management, the best migration path may be to first scale
up onto a large 64-bit server and add resources to the SMP on the
server. Then, for additional high availability and increased
flexibility, the scale out option can be explored.
Oracle Grid
and Grid are complex technologies for complex applications. Even
standard performance monitoring is challenging because multiple
servers and instances must be monitored within the common database.
Now that the concepts of Grid
and Grid have been presented, the following section will present
information on Oracle cache fusion and how it manages inter-instance
data block transfer.
The Revolution of Cache Fusion
Until later versions of Oracle8i, an Oracle Parallel Server (OPS)
database had to use a laborious process of copying blocks into and
out of memory and to and from disks in order to share information in
a single block between the multiple instances.
This complex and slow disk-to-disk OPS data sharing mechanism
resulted in serious performance issues if the database did not
practice some kind of application partitioning, data partitioning
and localized use.
Oracle 9i Real Application Clusters (RAC) relieved the limitations
of the OPS disk-based block transfer method, but there is still
performance issues related to the cache fusion layer.
Cache fusion has several important jobs, and the foremost job is
being able to maintain cache coherency and read consistency between
database instances.
This cache coherency is maintained through the Global Services
Directory and the various Global Enqueue processes. These processes
monitor each data cache and transfer data block, RAM to RAM across
the high-speed cluster interconnect. The architecture of Cache
Fusion is illustrated in Figure 20.5.

Figure 20.5:
The cache
fusion background processes
The cluster interconnect is the heart of cache fusion. Oracle has
global directory services to manage data blocks inside the multiple
instances and the use of intra-instance transportable locks to speed
up data block transfers. The Cluster Interconnect architecture is
illustrated in Figure 20.6.

Figure 20.6:
The Cluster
Interconnect IPC architecture
Even though data block transfers are now super fast, there is still
a high amount of overhead in the cache fusion processes. It is the
job of the Oracle DBA to devise methods to minimize this
inter-instance block transfer.
The disk subsystems within Grid
must also be managed. Most Grid
databases use Storage Area Networks (SAN) and use tools like the IBM
FastT storage manager for the Grid
nodes as shown in Figure 20.7:

Figure 20.7:
The FastT
Storage Manager
Now that the basics of Oracle cache fusion have been presented, the
following sections include information on how to tune Oracle Grid
by load balancing traffic between the nodes.
Overview of Grid
and Grid Tuning
There are only a few differences between an ordinary Oracle database
and a Grid/RAC database. Grid
and Grid yield some of the following superb abilities:
§
Ability to load balance the transaction load between
instances
§
Ability to adjust inter-instance cache communication
§
Ability to leverage parallel nodes
§
Ability to provide on-demand server resources
These points form the core of all Oracle Grid
tuning. The next section will start with a presentation of
information on Grid
load balancing as well as resource re-allocation and cache fusion
tuning. The section will wrap up with information on Grid
parallel tuning
RAC Load Balancing
Load balancing has changed radically between OPS and Grid
databases, but there are still two accepted approaches to Grid
load balancing:
§
Business processes and the associated data are segregated by
Grid
node.
§
Using the Transparent Application Failover load
balancing software, new connections are routed to the least loaded
Grid
node.
§
:
A combination of data localization and automatic techniques are
used, dedicating a set of nodes to the processing area, and
load-balancing connections within that group.
In OPS days, the expensive disk-to-disk data transfers meant that
the DBA would carefully partition the application such that
different types of applications would connect to different nodes.
For example, in a database with order entry, inventory maintenance
and customer management functions, clients from each of these areas
would be directed to a separate node. The data localization load
balancing method is illustrated in Figure 20.7.

Figure 20.7:
The data
localization load balancing method
This application-level partitioning ensured that all related data
blocks were cached on the appropriate instance, and that the
expensive disk-to-disk pinging of shared blocks between instances
was minimized.
With the introduction of Oracle9i Grid
and TAF, another load balancing scheme called automatic load
balancing became available. With this scheme, a centralized Oracle
Grid
listener with a virtual IP address can be created, and the Grid
listener will automatically direct transactions to the least loaded
Oracle instance. This scheme is illustrated in Figure 20.8.

Figure 20.8:
The
automatic load balancing method
The automatic load balancing approach has several advantages for
scalability and resources consumption because the Oracle Grid
software manages instance load. On the other hand, databases with a
large shared working set of frequently referenced data will find
high data block transfers within the cache fusion processes. Oracle
Grid databases almost exclusively use the automatic load balancing
techniques.
There is a third approach that uses a hybrid of the automatic and
localization approaches. This load balancing approach is used by
large Grid
shops where they want to load balance between a group of related
nodes, thereby getting the automatic load balancing as well as the
reduced cache fusion stress resulting from data localization. This
hybrid scheme is illustrated in Figure 20.9.

Figure 20.9:
The hybrid
load-balancing approach
The DBA’s choice of load balancing techniques depends on their
database application. If the application does not share many common
data blocks, automatic load balancing is the best choice. However,
if the application has transactions that all share the same table
rows, the client-based load balancing might be the better choice.
Managing Inter-Instance Data Block Transfers
The same data block may reside inside many SGA regions, so it is
easy to run queries against the
v$bh views on each node to find the data blocks that have
been pinged via cache fusion.
The size of this working set of frequently referenced data blocks is
important, and minimizing the amount of inter-instance block
transfers can greatly improve Grid
performance, especially if the working set is frequently updated.
The following script can be run on each instance to identify the
data blocks that currently reside within each SGA. This is from
Mike Ault’s book, Oracle 10g Grid
& Real Application Clusters - Oracle10g Grid Computing with RAC.
Oracle Grid
scripts are also available from www.dba-oracle.com/oracle_scripts.htm.
<
rac_grid_block_usage.sql
break on report
compute sum of distinct_blocks on report
compute sum of blocks on report
set lines 132 pages 57
@title132 'Block Usage Inside SGA Block Buffers'
spool rep_out\&db\block_usage
SELECT a.INST_ID,
decode(b.tablespace_name,null,'UNUSED',b.tablespace_name)
ts_name,
a.file# file_number,
COUNT(a.block#) Blocks,
COUNT (DISTINCT a.file# || a.block#) Distinct_blocks
FROM GV$BH a, dba_data_files X "dba_data_files"
b
WHERE a.file#=b.file_id(+)
GROUP BY a.INST_ID, a.file#,decode(b.tablespace_name,null,'UNUSED',b.tablespace_name)
order by a.inst_id
/
spool off
ttitle off
Some inter-instance pinging is unavoidable due to the shared nature
of almost all Oracle applications. However, there are many tricks
that can be used by the Oracle DBA to minimize the work of the cache
fusion layer. The following list shares some of these tricks:
§
Small lookup tables can be spread across many data blocks to
reduce the likelihood that another instance will need them.
§
:
Many Grid
databases that share large amounts of data blocks will perform
better with a 2k blocksize because less data will be transferred
between nodes.
§
:
The intelligent use of read only tablespaces allows the DBA to
minimize inter-instance communication, because Oracle does not have
to maintain read consistency mechanisms.
The following sections provide a more in-depth look at each of these
methods.
Block Spreading
Oracle Grid
databases that have small, frequently referenced lookup tables can
spread the rows across more data blocks. This can be illustrated
with a simple example. Suppose that there is a State code lookup
table with entries for all 50 states. Even on a 2k blocksize, all
50 rows fit into a single data block.
As competing instances access the lookup table, the same data block
must be transferred between all of the nodes, and this can cause a
huge overhead for the cache fusion background processes and slow
down the entire database.
The solution is to adjust the pctfree threshold for the
lookup table so that only a single row resides on each data block.
Now, instead of consuming one data block, the lookup table resides
on 50 separate blocks, and inter-instance communication is greatly
minimized. This solution is illustrated in Figure 20.10.

Figure 20.10:
The block
spreading approach to relieve Grid
contention
Of course, space is being wasted on each data block, but spreading
out the table blocks makes a huge difference in overall
performance. The freelist groups parameter allows multiple
segment header blocks. Multiple freelist groups speed up
insert operations because each node can acquire a separate
header block, each with its own freelist.
The freelist groups object parameter should be set to the
number of Oracle Grid
nodes that update any table and index simultaneously. For
partitioned objects and cases of segment header contention,
freelist groups may be set for non-RAC systems. This will
relieve the database from buffer
busy waits caused by segment header contention.
Blocksize Adjustment
As a general rule, the Grid
DBA should define all inter-instance shared blocks such that only
the minimum amount of space is transferred across the cache fusion
layer.
For example, if a shared shipping table with 80-byte rows will be
frequently referenced by all nodes, a transaction will only want a
single row in the table.
If the table is placed in a 32k blocksize, the entire table might
fit onto only a few data blocks. However, if it is placed in a 2k
blocksize, a smaller number of adjacent rows are transferred by
cache fusion, thereby reducing the probability that another node
will have to wait for the data block. This solution is illustrated
in Figure 20.11.

Figure 20.11:
Blocksize
adjustment to reduce cache fusion contention
Oracle Grid
supports multiple blocksizes, and the savvy Oracle DBA will adjust
the data buffers according to the demands of their application.
Read only Tablespaces
The Oracle Grid
DBA can also minimize the cache fusion overhead by using Oracle read
only tablespaces. If Oracle is aware that a data block is read
only, a great amount of overhead is saved because Oracles does not
have to monitor for competing DML and read consistency.
In larger Grid
databases, the DBA may take the trouble to locate and segregate
those blocks that are always read only. This can be achieved with
partitioned tables with only the most current partition being
updatable.
Parallel Processing and Grid
Performance
Parallel execution involves dividing a task into several smaller
tasks and working on each of those smaller tasks in parallel. Oracle
Parallel Query (OPQ), where multiple CPUs on a single instance can
speed up large-table full-table scans, is already a familiar
concept; however, Grid
opens up a new area of inter-instance parallelism.
There are two ways to speed up tasks:
§
Increasing the number of CPUs and use Oracle Parallel
Query (OPQ), parallel DML, etc.
§
Manually break down a complex task into multiple sub
tasks and assign each component to multiple processors to execute
them concurrently.
In the first scenario of OPQ, a single user task, such as a SQL
query, can be parallelized to achieve higher speed and throughput by
using multiple processors. Generally, Oracle’s Intra-query parallel
execution improves performance for:
§
Queries with large object full-scans
§
Creation of large indexes
§
Bulk inserts, updates, and deletes
§
Data Aggregations such as computing sums and averages
§
DBA maintenance such as table and index
reorganizations
Parallel processing involves the use of multiple processors to
reduce the time needed to complete a given task. Instead of one
processor executing an entire task, several processors work on
separate tasks that are subordinate to the main task.
There are two types of parallelism that database users can utilize.
They are: inter-query parallelism and intra-query parallelism. The
differences between these two types of parallelism are outlined
below:
§
: This can be done when individual
transactions are independent and no transaction requires the output
of other transactions to complete. Many CPUs can be kept busy by
assigning each task or each query to a separate CPU. This is
accomplished automatically by the server when the application
submits the tasks and waits for each subsection to report back its
results.
§
: - To speed up execution of a large,
complex query, it must first be decomposed into smaller problems.
These smaller problems must be executed concurrently, in parallel,
by assigning each subproblem concurrently to its CPUs. This
intra-query parallelism is implemented with the Oracle parallel
features such as Oracle parallel query.
Inter-query parallelism is an application design issue and is rare
in Oracle Grid
and Grid databases because most problems cannot be serialized into
independent subproblems.
Conclusion
This chapter focused on detailed information on the Oracle Grid
and Grid architectures and showing ways to improve the performance
of these massively parallel database systems.
The information showed that Oracle Grid provides infinite
scalability via on-demand generation of new servers into the Grid
cluster, but limitations from the cache fusion layer were examined
as were the requirement to transfer data blocks between instances.
The main points of this chapter include:
§ : Most small shops do not need
Grid
for scalability; although, they may use Grid
for continuous availability. Grid
for scalability is only for large Oracle systems with more demands
than can be met by a single server.
§
: Unlike traditional Grid
implementations with large servers, Oracle Grid computing uses small
server blades.
§
: The Oracle Grid
tuning expert is always concerned with minimizing the work of the
cache fusion processes. The most common techniques involve block
spreading, application partitioning, using small blocksizes and read
only tablespaces.
§
: The choice of data localization or
automatic load balancing is an important Grid
design consideration.
This completes the information on Oracle Grid
and Grid tuning. What follows is a summation of the Automated Work
Repository as it was presented in this book.
SEE CODE DEPOT FOR FULL SCRIPTS
 |
This is an excerpt from my latest book "Oracle
Tuning: The Definitive Reference".
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts: |
http://www.rampant-books.com/book_1002_oracle_tuning_definitive_reference_2nd_ed.htm
|