The execution plan for any SQL statement is only as
good as the underlying statistics, and it has long been understood that histograms
are the solution.
It's very difficult for any SQL optimizer to accurately
predict the cardinality of an operation, and the problem is aggravated by
queries that have complex WHERE clause predicates. The goal of any SQL
optimizer is to join the tables together with the proper "driving table", such
that the first join has the smallest possible result set size (cardinality),
resulting in less baggage that must be passed on to subsequent table joins.
This is especially problematic for Oracle queries that join
many tables together, and DBAs now understand that optimal table join order is
not automatic. Instead, the DBA is forced to perform complex manual tuning,
examining popular SQL statements and applying histograms as needed to ensure
that the 11g CBO joins the tables together in an optimal fashion.
In the real world, a given query has one, and only one, optimal
table join order. Rather than undertake a time-consuming exercise in
histogram generation, many DBAs lock down the table join order with the ORDERED hint
or by using SQL profiles.
Optimization requires historical SQL analysis
This issue is not unique to Oracle. Anticipating the inter-join
cardinality is extremely complex, and the problem can only be solved by
correlating the sub-optimal SQL to historical SQL statements:
- Histograms impose overhead - The presence of a
histogram creates overhead for the CBO, and histograms cannot be applied
solely by examining the table and index data. Rather, histograms should
only be created when SQL statements need them, and this requires complex
analysis of historical SQL using STATSPACK or AWR tables.
- Intelligent histogram placement is time-consuming - Proper application of histograms requires careful analysis, and many
production DBAs do not have the time to correlate historical SQL with CBO
statistics.
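As a rough sketch of the correlation described in the two points above, the Python below flags a column as a histogram candidate only when it both appears in historical WHERE predicates and has a skewed value distribution. The function names, the sample data, and the skew threshold are hypothetical illustrations; in practice the predicate usage would come from STATSPACK or AWR and the value counts from the table itself.

```python
from collections import Counter

def skew_ratio(values):
    """Ratio of the most popular value's frequency to the frequency
    expected under a perfectly even distribution (1.0 means uniform)."""
    counts = Counter(values)
    expected = len(values) / len(counts)
    return max(counts.values()) / expected

def histogram_candidates(column_values, predicate_columns, min_skew=2.0):
    """Return columns that are referenced in historical predicates
    AND are skewed enough that a histogram might help.
    min_skew is an arbitrary illustrative threshold."""
    return sorted(
        col for col, values in column_values.items()
        if col in predicate_columns and skew_ratio(values) >= min_skew
    )

# Hypothetical sample data: STATUS is skewed, REGION is uniform,
# NOTES is skewed but never appears in a WHERE clause.
column_values = {
    "STATUS": ["CLOSED"] * 97 + ["OPEN"] * 2 + ["ERROR"],
    "REGION": ["N", "S", "E", "W"] * 25,
    "NOTES": ["none"] * 90 + list("abcdefghij"),
}
predicate_columns = {"STATUS", "REGION"}  # assumed to come from historical SQL

candidates = histogram_candidates(column_values, predicate_columns)
print(candidates)
```

Only STATUS qualifies here: REGION is queried but evenly distributed, and NOTES is skewed but never used in a predicate, so a histogram on either would add overhead without helping any statement.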
In an
RMOUG 2015 paper titled "How Sampling Error Impacts Execution Plans: The
Effects of Estimating Optimizer Statistics", David Lipowitz notes that
histograms are critical to making the CBO choose an optimal table join order:
"the key lies in how the CBO
calculates join cardinalities, which is Oracle's expectation of how many records
will be produced when multiple tables are joined together.
The RDBMS assumes an even
distribution of values in the absence of a histogram, and because of how we
populated this table we know this is an accurate assumption."
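To make the even-distribution assumption concrete: without a histogram, a common textbook approximation of the estimate for an equality predicate is NUM_ROWS / NUM_DISTINCT. The sketch below compares that estimate with the actual counts for a hypothetical skewed column. It illustrates the arithmetic only and is not Oracle's actual costing code; the column and its values are made up.

```python
from collections import Counter

def uniform_estimate(num_rows, num_distinct):
    """Rows expected for `col = :value` when every value is assumed equally common."""
    return num_rows / num_distinct

# Hypothetical skewed column: 1,000 rows, 4 distinct values.
status = ["CLOSED"] * 970 + ["OPEN"] * 20 + ["HOLD"] * 9 + ["ERROR"]
actual = Counter(status)

estimate = uniform_estimate(len(status), len(actual))  # 1000 / 4

for value, count in actual.most_common():
    print(f"{value:<7} actual={count:>4}  estimated={estimate:.0f}")
```

With this data the estimate of 250 rows is too low by a factor of almost 4 for CLOSED and too high by a factor of 250 for ERROR, which is the kind of error a histogram on the column is meant to remove.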
Lipowitz also notes that the CBO statistics collection
mechanism does not yet examine SQL workloads, a critical task that must now be
done manually by the DBA:
"the optimizer's algorithm, using
initialization parameters, system statistics, and most likely a variety of other
inputs, essentially decides on a cardinality ahead of time that will allow the
index-driven path to be used.
When that threshold is exceeded
by a join cardinality based on the right combination of NUM_DISTINCT and
NUM_ROWS, a full scan appears in the execution plan every time.
The probability of exceeding that
threshold, given the specific statistics in the data dictionary, is all that
varies; the threshold itself appears constant."
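The following sketch shows how an estimated NUM_DISTINCT can push a join cardinality across such a threshold. It assumes the commonly cited simplified equijoin formula, rows_a * rows_b / max(ndv_a, ndv_b), which ignores NULLs, filter predicates, and histograms. The statistics and the threshold value are hypothetical and are not taken from Lipowitz's paper.

```python
def join_cardinality(rows_a, ndv_a, rows_b, ndv_b):
    """Simplified equijoin estimate: rows_a * rows_b / max(ndv_a, ndv_b).
    Ignores NULLs, filter predicates, and histograms."""
    return rows_a * rows_b / max(ndv_a, ndv_b)

# Hypothetical dictionary statistics for two tables joined on one column.
ROWS_A, ROWS_B = 100_000, 5_000
TRUE_NDV = 5_000      # distinct join values found by a full compute
SAMPLED_NDV = 4_000   # underestimate from a small sample (assumed)

# Hypothetical cutoff above which a full scan is chosen.
FULL_SCAN_THRESHOLD = 110_000

for label, ndv in (("computed", TRUE_NDV), ("sampled", SAMPLED_NDV)):
    card = join_cardinality(ROWS_A, ndv, ROWS_B, ndv)
    plan = "full scan" if card > FULL_SCAN_THRESHOLD else "index path"
    print(f"{label:<9} NUM_DISTINCT={ndv:>5}  join card={card:>7.0f}  -> {plan}")
```

The threshold never moves in this example. Only the sampled NUM_DISTINCT changes, and that alone is enough to flip the plan from the index path to a full scan.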
Conclusions on CBO errors
Making the Oracle optimizer always choose the best
execution plan is a phenomenal software engineering challenge, and the complex
nature of the problem suggests that it may be impossible to ever create a SQL
optimizer that never makes mistakes. Until Oracle starts to leverage the
historical data in the Automated Workload Repository (AWR), the CBO will never
be able to properly add histograms.
Far from the hype of 11g having "fully automated SQL
tuning", Oracle has a long way to go in automating the generation of the
metadata statistics that are required to generate optimal execution plans for
any SQL statement. In the meantime, Oracle professionals will be forced to use
techniques such as hints and adjusting optimizer parameters to overcome this
inherent problem.
Despite the inherent complexity, the good news is that it
is possible to intelligently examine your historical SQL and table column
distributions and add histograms to improve execution plans. Burleson
Consulting has solved this problem and can add histograms to improve SQL
execution plans.