Home
E-mail Us
Oracle Articles
New Oracle Articles

Oracle Training
Oracle Tips
Oracle Forum
Class Catalog

Remote DBA
Oracle Tuning
Emergency 911
RAC Support
Apps Support
Analysis
Design
Implementation
Oracle Support

SQL Tuning
Security
Oracle UNIX
Oracle Linux
Monitoring
Remote support
Remote plans
Remote services
Application Server
Applications
Oracle Forms
Oracle Portal
App Upgrades
SQL Server
Oracle Concepts
Software Support
Remote Support
Development
Implementation

Consulting Staff
Consulting Prices
Help Wanted!

Oracle Posters
Oracle Books
Oracle Scripts
Ion
Excel-DB

Don Burleson Blog

Cost Control: Inside the Oracle Optimizer

Oracle Tips by Burleson Consulting

2008 Updates:

Oracle 10g migration upgrade performance.
10g changes to dbms_stats
11g extended optimizer statistics
Oracle 11g changes to dbms_stats

This article has the following sections:

Part 1 - Using SQL CBO Parameters
Part 2 - Using CBO statistics
Part 3 - Proper SQL development environment
Part 4 - Using histograms to tune SQL
Part 5 - Clustering and SQL tuning
Part 6 - External costs and SQL execution
Part 7 - Using hints to tune SQL
Part 8 - Locating sub-optimal SQL

PART 4 - Considering the CBO

While we have gone into great detail on the CBO, there is always more to learn as the CBO becomes more powerful (and complex) with each new release of Oracle. The main points of this article include general guidelines for adjusting the behavior of the CBO:

The DBA can control the overall behavior of the CBO with several Oracle parameters, but they should only be changed under limited circumstances.
The CBO relies on statistics to determine the optimal execution plan for SQL statements, and statistics should always be collected with the dbms_stats package.
An important job of the Oracle DBA is the collection and management of statistics for the CBO. CBO statistics can be collected, stored, and migrated to other related instances to ensure consistency of execution plans.
Re-analyzing schema statistics without exporting the old statistics (using export_system_stats) is dangerous because the execution plans for thousands of SQL statements may change, and you cannot get back to previous SQL performance. Re-analyze a schema only when there are significant changes to the data.

In Part 1 of the series, we discussed the basic mechanisms of Oracle's cost-based SQL optimizer for making the best decisions about the access paths to data. In this concluding installment, we'll address the use of histograms, external costing features, SQL hints for changing execution plans, and techniques for locating and tuning suboptimal SQL.

Using Histograms

In some cases, the distribution of values within a column of a table will affect the optimizer's decision to use an index vs. perform a full-table scan. This scenario occurs when the value with a where clause has a disproportional amount of values, making a full-table scan cheaper than index access.

A column histogram should only be created when we have data skew exists or is suspected. In the real world, that happens rarely, and one of the most common mistakes with the optimizer is the unnecessary introduction of histograms into optimizer statistics. The histograms signals the optimizer that the column is not linearly distributed, and the optimizer will peek into the literal value in the SQL where clause and compare that value to the histogram buckets in the histogram statistics (see Figure 3).

Many Oracle professionals misunderstand the purpose of histograms. While they are used to make a yes-or-no decision about the use of an index to access the table, histograms are most commonly used to predict the size of the intermediate result set from a multi-way table join.

For example, assume that we have a five-way table join whose result set will be only 10 rows. Oracle will want to join the tables together in such a way as to make the result set (cardinality) of the first join as small as possible. By carrying less baggage in the intermediate result sets, the query will run faster. To minimize intermediate results, the optimizer attempts to estimate the cardinality of each result set during the parse phase of SQL execution. Having histograms on skewed column will greatly aid the optimizer in making a proper decision. (Remember, you can create a histogram even if the column does not have an index and does not participate as a join key.)

Because a complex schema might have tens of thousands of columns, it is impractical to evaluate each column for skew and thus Oracle provides an automated method for building histograms as part of the dbms_stats utility. By using the method_opt=>'for all columns size skewonly' option of dbms_stats, you can direct Oracle to automatically create histograms for those columns whose values are heavily skewed. We'll take a look at this option in more detail later.

As a general rule, histograms are used to predict the cardinality and the number of rows returned in the result set. For example, assume that we have a product_type index and 70% of the values are for the HARDWARE type. Whenever SQL with where product_type='HARDWARE'is specified, a full-table scan is the fastest execution plan, while a query with where product_type='SOFTWARE' would be fastest using index access.

Because histograms add additional overhead to the parsing phase of SQL, you should avoid them unless they are required for a faster optimizer execution plan. But there are several conditions where creating histograms is advised:

When the column is referenced in a query — Remember, there is no point in creating histograms if the queries do not reference the column. This mistake is common, and many DBAs will create histograms on a skewed column, even though it is not referenced by any queries.
When there is a significant skew in the distribution of columns values —This skew should be sufficiently significant that the value in the WHERE clause will make the optimizer choose a different execution plan.
When the column values cause an incorrect assumption — If the optimizer makes an incorrect guess about the size of an intermediate result set it may choose a sub-optimal table join method. Adding a histogram to this column will often provide the information required for the optimizer to use the best join method.

So how do we find those columns that are appropriate for histograms? One exciting feature of dbms_stats is the ability to automatically look for columns that should have histograms, and create the histograms. Again, remember that multi-bucket histograms add a huge parsing overhead to SQL statements, and histograms should only be used when the SQL will choose a different execution plan based upon the column value.

To aid in intelligent histogram generation, Oracle uses the method_opt parameter of dbms_stats. There are also important new options within the method_opt clause, namely skewonly, repeat and auto.

      method_opt=>'for all columns size skewonly'
      method_opt=>'for all columns size repeat'
      method_opt=>'for all columns size auto'

Let's take a close look at each method option.

The first is the "skewonly" option, which is very time-intensive because it examines the distribution of values for every column within every index. If dbms_stats discovers an index with columns that are unevenly distributed, it will create histograms for that index to aid the cost-based SQL optimizer in making a decision about index vs. full-table scan access. For example, if an index has one column that is in 50% of the rows, a full-table scan is faster than and index scan to retrieve these rows.

Histograms are also used with SQL that has bind variables and SQL with cursor_sharing enabled. In these cases, the optimizer determines if the column value could affect the execution plan, and if so, replaced the bind variable with a literal and performs a hard parse:

begin
 dbms_stats.gather_schema_stats(
   ownname          => 'SCOTT', 
   estimate_percent => dbms_stats.auto_sample_size, 
   method_opt       => 'for all columns size skewonly', 
   degree           => 7
 );
end;
/

The auto option is used when monitoring is implemented (alter table xxx monitoring;) and creates histograms based upon data distribution and the manner in which the column is accessed by the application (e.g. the workload on the column as determined by monitoring). Using method_opt=>'auto'is similar to using the gather auto in the option parameter of dbms_stats:

begin
   dbms_stats.gather_schema_stats(
      ownname          => 'SCOTT', 
      estimate_percent => dbms_stats.auto_sample_size, 
      method_opt       => 'for all columns size auto', 
      degree           => 7
   );
end;
/

Our tools:

Our Ion tool is the easiest way to analyze Oracle performance and Ion allows you to spot hidden performance trends.

Get the Complete
Oracle SQL Tuning Information

The landmark book "Advanced Oracle SQL Tuning The Definitive Reference" is filled with valuable information on Oracle SQL Tuning. This book includes scripts and tools to hypercharge Oracle 11g performance and you can buy it for 30% off directly from the publisher.

��