 |
|
Load Balancing of Server Tasks
Oracle Database Tips by Donald Burleson |
When CPU overload is experienced, the DBA will
generally see periods during the days when the run queue gets quite long and
other periods at night when the processors are mostly idle (Figure 6-3). A
common question asked by a systems administrator is, 'the CPU is 40 percent idle
for 16 hours a day, so why should we add more processors??
Figure 6-30: Average Oracle server run queue values before
task load balancing
However, there are times when it makes sense to add more
processors, even if the processors are idle during off-peak times. For example,
if you are working in an online environment, the only response time that matters
is the time between 7:00 a.m. and
8:00 p.m. when your online users
are active. The fact that the server is largely idle during the middle of the
night has no bearing on the decision to upgrade with additional CPUs.
Once we identify the times when the CPU activity is
excessive, we need to go to STATSPACK and examine the activity at the times of
the overload. Once we have identified the times when the processors are
overloaded, we must then see if it is possible to reschedule batch tasks to run
at off-peak hours. On an Oracle database server, tasks may be scheduled in many
ways:
-
The dbms_job utility
-
The UNIX cron utility
-
A TP monitor such as Tuxedo
-
Oracle Concurrent Manager
(for Oracle Applications)
-
SAPGUI if we are running SAP
Regardless of the method of scheduling Oracle tasks, the
idea is to find large batch tasks that may be scheduled during peak processing
times. From the graph in Figure 6-2, we see that the CPU overloads peak at 10:00
a.m. and again at 3:00
p.m. Our task is to find a large
regularly scheduled SQL task that runs during these times.
STATSPACK Solutions for Finding High-Impact Tasks
Within STATSPACK, we can
examine the stats$sql_summary table, looking for SQL statements that are
executed during the peak hours. The high-impact tasks will generally be
associated with SQL statements that have a high value for rows processed. Here
is an easy STATSPACK table query to find the SQL:
rpt_top_sql.sql
L 6-8
set lines 80;
set pages 999;
set heading off;
select
to_char(snap_time,'yyyy-mm-dd hh24'),
substr(sql_text,1,50)
from
stats$sql_summary a,
stats$snapshot sn
where
a.snap_id = sn.snap_id
and
to_char(snap_time,'hh24') = 10
or
to_char(snap_time,'hh24') = 15
order by
rows_processed desc;
Here is a sample of the output from this report, showing the
beginning of all SQL statements, ordered in descending order of rows_processed.
This tells the DBA which SQL statements are causing the most I/O resources.
L 6-9
Yr. Mo dy Hr SQL_TEXT
------------- -------------------------------------------------
2001-09-20 10 begin :retCode := toc_maint . toc_insert_entry ( :
2001-09-20 10 begin :retCode := toc_maint . toc_insert_entry ( :
2001-09-20 15 INSERT INTO TOC_ENTRY ( ISBN,TOC_SEQ_NBR,VISUAL_PA
2001-09-20 15 INSERT INTO TOC_ENTRY ( ISBN,TOC_SEQ_NBR,VISUAL_PA
2001-09-20 15 SELECT PAGE_SEQ_NBR FROM PAGE WHERE (ISBN = :b1
2001-09-20 15 SELECT PAGE_SEQ_NBR FROM PAGE WHERE (ISBN = :b1
2001-09-21 10 select 'ALTER ' || substr(object_type,1,20) || ' '
2001-09-21 10 DECLARE job BINARY_INTEGER := :job; next_date DATE
2001-09-21 10 DECLARE job BINARY_INTEGER := :job; next_date DATE
2001-09-21 10 DECLARE job BINARY_INTEGER := :job; next_date DATE
2001-09-20 15 SELECT IMAGE_BLOB FROM PAGE_IMAGE WHERE (ISBN =
2001-09-20 15 SELECT IMAGE_BLOB FROM PAGE_IMAGE WHERE (ISBN =
2001-09-20 15 begin pageimages_curs . page_get_image ( :myisbn:i
2001-09-20 15 SELECT IMAGE_BLOB FROM PAGE_IMAGE WHERE (ISBN =
2001-09-20 15 BEGIN sys.dbms_ijob.remove(:job); END;
2001-09-20 15 begin pageimages_curs . page_get_image ( :myisbn:i
2001-09-20 15 BEGIN sys.dbms_ijob.remove(:job); END;
If we are diligent, we
can locate the online or batch SQL tasks that are overwhelming the server. Once
we reschedule the offending tasks to run during off-peak hours, our average run
queue for the server falls below the number of CPUs, as shown in Figure 6-4.
Figure 6-31: Average Oracle server run queue values after
task load balancing
Of course, load balancing Oracle tasks is often not as
trivial as it might appear. The Oracle DBA generally has no control over when
members of the end-user community submit batch-oriented work against the Oracle
database. Unless you are using scheduling tools such as Oracle's Concurrent
Manager, the end users are free to submit large resource-intensive reports
against Oracle anytime they feel like it. In shops where end users generate ad
hoc SQL statements, there are three Oracle features that can govern end-user
query resources:
-
The dbms_resource_manager
PL/SQL package can be used to govern end-user resources. In the following
example, dbms_resource_manager is called to assign CPU, SQL plan, and Parallel
Query options for the ADHOC_Group of end users:
SQL>
execute dbms_resource_manager.update_plan_directive(
plan => 'SINGLE_LEVEL_PLAN',
group_or_subplan => 'ADHOC_Group',
new_comment => 'ADHOC day users sessions at level 1',
new_cpu_p1 => 10,
new_parallel_degree_limit_p1 => 0);
-
Oracle Applications systems have a profile
feature to restrict ad hoc queries. For ad hoc users, their Oracle user ID is
associated with a group, and the group is linked to a profile that governs the
amount of server resources the end users are allowed to use.
-
SQL*Plus profiles are for
end users who execute queries from SQL*Plus.
The product_user_profile table can be used to restrict access.
Hence, it is up to the Oracle professional to become a
detective, and hunt through the library cache in order to see when high-resource
statements are being added to the Oracle system. We will discuss the techniques
for hunting for SQL in the library cache in Chapter 11.
The stats$sql_summary table is one STATSPACK table that
has a great deal of information to help you find offending SQL statements. SQL
statements within the stats$sql_summary table can be sorted according to a
number of resource utilization metrics, including rows processed, buffer gets,
disk reads, and executions.
In Chapter 11, we will also introduce a revolutionary
script called access.sql that will allow you to automatically fish SQL
from the library cache and produce meaningful reports showing the activity of
all of the SQL statements in your system.
Even though Oracle developers write most SQL, it is still
the duty of the Oracle DBA to monitor the behavior of the SQL within the system
in order to determine those SQL statements that are creating excessive load upon
the server processor. The Oracle DBA must also load balance those intensive SQL
statements by requiring the end users to submit them during less active times.
This next section will discuss a very short-term
technique that can be used in order to remedy sporadic CPU overload at times
when an upgrade to the CPUs or load balancing is not possible. Using nice to
change the dispatching priorities of Oracle tasks is a last-resort measure, and
should only be undertaken in extreme circumstances when it is critical to keep
the system running until more resources are available.
This is an excerpt from "Oracle9i
High Performance tuning with STATSPACK" by Oracle Press.