|
 |
|
Tracking Oracle hash joins
Oracle Tips by Burleson Consulting |
Interrogating table join methods
The choice between a hash join and a nested loop join depends on
several factors:
§
The relative number of rows in each table
§
The presence of indexes on the key values
§
The settings for static parameters such as
index_caching and
cpu_costing
§
The current setting and available memory in
pga_aggregate_target
Hash joins do not use indexes and perform full-table scans often
using parallel query. Hence, the use of hash joins with parallel
full-table scans tend to drive-up CPU consumption.
Also, PGA memory consumption becomes higher when hash joins are
used, but if AMM is enabled, it is not usually a a problem.
The following query produces a report alerting an Oracle DBA when
hash join operations count exceeds some threshold (See
book errata for details):
§
awr_hash_join_alert.sql
col c1 heading ‘Date’ format a20
col c2 heading ‘Hash|Join|Count’ format 99,999,999
col c3 heading ‘Rows|Processed’ format 99,999,999
col c4 heading ‘Disk|Reads’ format 99,999,999
col c5 heading ‘CPU|Time’ format 99,999,999
accept hash_thr char prompt ‘Enter Hash Join Threshold: ‘
ttitle ‘Hash Join Threshold|&hash_thr’
select
to_char(
sn.begin_interval_time,
'yy-mm-dd
hh24'
)
snap_time,
count(*)
ct,
sum(st.rows_processed_delta)
row_ct,
sum(st.disk_reads_delta)
disk,
sum(st.cpu_time_delta)
cpu
from
dba_hist_snapshot
sn,
dba_hist_sqlstat
st,
dba_hist_sql_plan
sp
where
st.snap_id
= sn.snap_id
and
st.dbid = sn.dbid
and
st.instance_number = sn.instance_number
and
sp.sql_id = st.sql_id
and
sp.dbid = st.dbid
and
sp.plan_hash_value = st.plan_hash_value
and
sp.operation = 'HASH JOIN'
group
by
to_char(sn.begin_interval_time,'yy-mm-dd
hh24')
having
count(*) > &hash_thr;
SEE CODE DEPOT FOR FULL SCRIPTS
The sample output might look the following, showing the number of
hash joins during the snapshot period along with the relative I/O
and CPU associated with the processing. The values for
rows_processed are
generally higher for hash joins which do full-table scans as opposed
to nested loop joins with generally involved a very small set of
returned rows.
Hash Join Thresholds
Hash
Join Rows Disk CPU
Date Count Processed Reads Time
-------------------- ----------- ----------- -----------
04-10-12 17 22 4,646 887 39,990,515
04-10-13 16 25 2,128 827 54,746,653
04-10-14 11 21 17,368 3,049 77,297,578
04-10-21 15 60 2,805 3,299 5,041,064
04-10-22 10 25 6,864 941 4,077,524
04-10-22 13 31 11,261 2,950 46,207,733
04-10-25 16 35 46,269 1,504 6,364,414
__________________________________________________
|

|
The sorting default is that no single task may consume more
than 5% of the
pga_aggregate_target region before the
sort pages out to the TEMP tablespace for a disk sort. |
The Oracle cost-based optimizer will determine
whether a hash join would be beneficial over a nested-loop join, so
making more PGA available for hash joins will not have any
detrimental effect since the optimizer will only invoke a
super-sized hash join if it is better than a nested-loop join. In a
system like the example above, the following settings would increase
the default sizes for large sorts and hash joins while limiting
those for parallel sorts.
·
pga_aggregate_target = 4g
·
_pga_max_size = 400m
·
_smm_px_max_size = 333m
With these hidden parameters set we see
significant size increase for serial sorts and a throttling effect
for parallel queries and sorts. However, bear in mind that it only
valid for a specific release of Oracle10g, on a specific hardware
and OS environment, and not using any optional features such as the
MTS.

The
WISE tool is
also excellent for identifying SQL to tune and it can show SQL
execution over time with stunning SQL graphics.
SEE CODE DEPOT FOR FULL SCRIPTS
|
|
Need an Oracle Health Check?
- Do you have
bad performance after an upgrade?
- Need to
certify that your database follows best practices?
BC Oracle performance gurus can quickly
certify every aspect of your
Oracle database and provide a complete verification that your database
is fully optimized. |

|
|