 |
|
Oracle SQL tuning with cardinality estimates
Oracle Tips by Burleson Consulting
December 3, 2010 |
SQL Tuning with
cardinality feedback
The central problem with cardinality estimation is the in cases
of complex WHERE clauses the optimizer does not have enough information
about inter-join result set sizes to determine the optimal table join order.
Also see these important notes on
SQL tuning by
adding column histograms. Histograms can improve the guesses made
by the optimizer when a column has an un-even distribution of values.
No Optimizer can always get the correct cardinality
Even with the best optimizer statistics, even Einstein could not predict
the resulting rowset size from a complex calculation:
-- impossible to predict
cardinality of the number of rows returned
where
credit_rating *
extended_credit > .07
and
(qty_in_stock
* velocity) /.075 < 30
or
(sku_price
/ 47) * (qty_in_stock / velocity) > 47;
Remember, cardinality is a fancy word for “number of rows returned” by a
SQL operation. Here is an example of the cardinality that is displayed
as part of an execution plan (the “card” column):
Execution Plan
----------------------------------------------------------
0
SELECT STATEMENT Optimizer=CHOOSE (Cost=2871 Card=2
Bytes=143)
1
0 UNION-ALL
2
1 SORT
(GROUP BY) (Cost=2003 Card=1 Bytes=59)
3
2
FILTER
4
3
HASH JOIN (Cost=1999 Card=1 Bytes=59)
5
4
INDEX (FAST FULL SCAN) OF 'XIN8OPS_FLT_LEG' (UNIQUE)
6
4
INDEX (RANGE SCAN) OF 'XIN3BAG_TAG_FLT_LEG' (UNIQUE)
7
1
SORT (GROUP BY) (Cost=868 Card=1 Bytes=84)
8
7
FILTER
9
8
NESTED LOOPS (Cost=864 Card=1 Bytes=84)
10 9
HASH JOIN (Cost=862 Card=1 Bytes=57)
11 10
INDEX (FAST FULL SCAN) OF 'XIN1SCHED_FLT_LEG' (UNIQUE)
12 10
INDEX (FAST FULL SCAN) OF 'XIN8OPS_FLT_LEG' (UNIQUE)
13 9
INDEX (RANGE SCAN) OF 'XIN2BAG_TAG_FLT_LEG' (UNIQUE)
Execution Plan
----------------------------------------------------------
0 SELECT
STATEMENT Optimizer=CHOOSE (Cost=3 Card=100
Bytes=8200)
1 0 FILTER
2 1 FILTER
3 2 HASH JOIN (OUTER)
4 3
TABLE ACCESS (FULL) OF 'BOOK' (Cost=1 Card=20
Bytes=1280)
5 3 TABLE ACCESS (FULL) OF 'SALES' (Cost=1
Card=100 Bytes=1800)
When
you catch the optimizer making a poor judgment you can reverse engineer the
condition:
- Gather the “real” row count. Note that you DO NOT
use the dba_tables.num_rows column because this is only accurate to
the last dbms_stats analyze. To read the steps of an execution plan
in-order, see
my notes on reading an execution plan in order of execution.
- Look at the metadata that might
influence the optimizers decision:
dba_tables.blocks
dba_tavble_num_rows
dba_tables.avg_row_leb
dba_tablespaces.blocksize
dba_histograms
dba_indexes
A single bad estimate of cardinality can cascade forward, causing large
amounts of intermediate row baggage to be passed through subsequent t table
joins, causing slow performance.
In this example, the four-way table join only
returns 18 rows, but the query carries 9,000
rows in intermediate result sets, slowing-down
the SQL execution speed:

Sub-optimal intermediate row sets.
If we were somehow able to predict the sizes of
the intermediate results, we can re-sequence the
table-join order to carry less intermediate
baggage during the four-way table join, in this
example carrying only 3,000 intermediate rows
between the table joins:

Optimal intermediate row sets.
Oracle histograms and cardinality
Histograms are used to predict cardinality and the
number of rows returned to a query. Let's assume that we
have a vehicle_type index and that 65 percent of
the values are for the CAR type. Whenever a query with
where vehicle_type = 'CAR' is specified, a
full-table scan would be the fastest execution plan,
while a query with where vehicle_type = 'TRUCK'
would be faster when using access via an index.
Histograms affect performance and should only be used
when they are required for a faster CBO execution plan.
They incur additional overhead during the parsing phase
of an SQL query. Histograms can be used effectively only
when:
- A table column is referenced in one or more
queries: Never create histograms if queries don't
reference the column. Novice DBAs may mistakenly
create histograms on a skewed column, even if it's not
referenced in a query.
- A column's values cause the CBO to make an
incorrect guess: If the CBO makes an incorrect
assumption regarding the size of an intermediate
result set, it may choose a sub-optimal execution
plan. A histogram added to the column often provides
the additional information required for the CBO to
choose the best plan.
- Significant skewing exists in the distribution
of a column's data values: The skew must of course,
be significant enough to make the CBO choose a
different execution plan.
Tools to assist in Optimizer cardinality estimation
There are many tools to assist with SQL
tuning, but the best tools will expose all of the
internal metrics of the data dictionary. The Ion
tool does a great job at aiding SQL
tuning:

Ion screen for SQL tuning
The Ion
tool is an easy way to analyze Oracle SQL performance and Ion also allows you to
spot hidden SQL performance trends.
Related research on SQL tuning with
cardinality feedback
Also
see these related notes on cardinality estimation:
-
-
-
-
|
|
|
|
Guarantee your Success!
Oracle is the
world's most complex, robust and flexible database, considered
impossible to master without a mentor.
That's why all BC
Oracle trainers are working professionals, experts in Oracle who
share their tips and secrets. |
|