Optimizing Oracle Optimizer Statistics

Oracle Tips by Donald K. Burleson

Updated March 3, 2015

Oracle 11g introduces extended optimizer statistics, an alternative to dynamic_sampling for estimating result set sizes.

The default "For all columns size auto"

It is wasteful to create a histogram on a column that is never referenced by a SQL statement.  The "for all columns size auto" setting is the Oracle default, and it builds histograms only on the columns that the optimizer considers to be "important".  If your database uses only bind variables, it is best to remove all histograms and disable histogram collection.

Oracle decides which columns are "important" by examining your current workload when you run the dbms_stats command, interrogating the SQL in the library cache to see which columns might benefit from a histogram.


As a review, the CBO gathers information from many sources, and it has the lofty goal of using DBA-provided metadata to always make the "best" execution plan decision.

Gathering Optimizer Statistics

Oracle also wants you to generate system statistics and statistics on fixed tables. In this section we will look at how to do both, first gathering system statistics and then gathering statistics on fixed tables.

Gathering System Statistics

The optimizer depends on various inputs, and one of these is system-derived information, such as disk response times and CPU response times. Oracle Database 10g uses the dbms_stats.gather_system_stats procedure to generate system statistics. You generate system statistics under normal system loads. Generating the statistics is a three-stage process:

* Start collecting statistics

* Run a representative load during the statistics collection

* Stop collecting statistics

Here is an example of this operation (note that the -- is a comment):

-- Run the next statement before you start running the workload
EXECUTE dbms_stats.gather_system_stats('Start');
 
-- Run a typical workload here
 
-- Run the next statement to stop gathering statistics.
EXECUTE dbms_stats.gather_system_stats('Stop');

You should only need to run this when things change on your system, such as when you add new disks, add CPUs, or when the system load changes.  
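One way to verify what was collected is to look at the workload statistics stored in sys.aux_stats$. This is only a minimal sketch, and it assumes you have privileges to query that SYS-owned table:

-- Review the collected workload system statistics
select pname, pval1
  from sys.aux_stats$
 where sname = 'SYSSTATS_MAIN';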

To fully appreciate dbms_stats, you need to examine each of the major directives. Let's take a close look at each directive and see how it is used to gather top-quality statistics for the cost-based SQL optimizer.

The dbms_stats options parameter

This parameter governs the way Oracle statistics are refreshed, using one of four methods:

* gather—Reanalyzes the whole schema

* gather empty—Only analyzes tables that have no existing statistics

* gather stale—Only reanalyzes tables with more than 10% modifications (inserts, updates, deletes).

* gather auto—Reanalyzes objects which currently have no statistics and objects with stale statistics (Using gather auto is like combining gather stale and gather empty.)

Note that both gather stale and gather auto require monitoring. If you issue the alter table xxx monitoring command, Oracle tracks changed tables with the dba_tab_modifications view, which allows you to see the exact number of inserts, updates, and deletes tracked since the last analysis of statistics.
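For illustration, here is a minimal sketch of that workflow; the SCOTT schema and emp table are only placeholders:

-- Enable modification tracking on the table (automatic in 10g and later)
alter table scott.emp monitoring;

-- See how many rows have changed since the last analysis
select table_owner, table_name, inserts, updates, deletes
  from dba_tab_modifications
 where table_owner = 'SCOTT';

-- Re-analyze only objects with stale or missing statistics
exec dbms_stats.gather_schema_stats(ownname => 'SCOTT', options => 'GATHER AUTO');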

The estimate percent option

The estimate_percent argument shown below is a newer way to let Oracle's dbms_stats automatically estimate the best percentage of a segment to sample when gathering statistics:

estimate_percent => dbms_stats.auto_sample_size

You can verify the accuracy of the automatic statistics sampling by looking at the dba_tables sample_size column. It is interesting to note that Oracle chooses between 5 and 20 percent for a sample size when using automatic sampling. Remember, the better the quality of your statistics, the better the decision of the CBO.
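For example, a quick check of the automatic sample sizes (the SCOTT schema is just a placeholder) might look like this:

select table_name, num_rows, sample_size,
       round(100 * sample_size / num_rows, 1) as pct_sampled
  from dba_tables
 where owner = 'SCOTT'
   and num_rows > 0;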

Important!  In 11g and beyond you should not specify estimate_percent; instead, let Oracle default to the automatic sample size.  This ensures the "best" sample size with a minimum amount of time spent analyzing tables and indexes.  See these updated notes on Oracle estimate percent.

The method_opt option

The method_opt parameter for dbms_stats is very useful for refreshing statistics when the table and index data change. The method_opt parameter is also very useful for determining which columns require histograms.

In some cases, the distribution of values within an index will affect the CBO's decision to use an index versus performing a full-table scan. This happens when a where clause predicate matches a disproportionate number of rows, making a full-table scan cheaper than index access.

Oracle histogram statistics can be created when you have a highly skewed index, where some values have a disproportionate number of rows. In the real world, this is quite rare, and one of the most common mistakes with the CBO is the unnecessary introduction of histograms in the CBO statistics. As a general rule, histograms are used when a column's values warrant a change to the execution plan.

To aid in intelligent histogram generation, Oracle uses the method_opt parameter of dbms_stats. There are also important new options within the method_opt clause, namely skewonly, repeat and auto:

method_opt=>'for all indexed columns size skewonly'

method_opt=>'for all columns size repeat'

method_opt=>'for all columns size auto' -- the 10g default
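A common pattern, sketched below with the SCOTT schema as a placeholder, is one deep pass with skewonly to let Oracle decide where histograms help, followed by repeat on later runs so only the existing histograms are refreshed:

-- First pass: build histograms only on skewed, indexed columns
begin
   dbms_stats.gather_schema_stats(
      ownname    => 'SCOTT',
      method_opt => 'for all indexed columns size skewonly'
   );
end;
/

-- Later passes: refresh only the histograms that already exist
begin
   dbms_stats.gather_schema_stats(
      ownname    => 'SCOTT',
      method_opt => 'for all columns size repeat'
   );
end;
/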

Parallel statistics collection

Oracle allows for parallelism when collecting CBO statistics, which can greatly speed up the time required to collect statistics. A parallel statistics collection requires an SMP server with multiple CPUs.
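As a sketch (the schema name is a placeholder), the degree parameter controls how many parallel slaves the collection uses, and the dbms_stats.auto_degree constant lets Oracle choose a degree based on object size and server resources:

begin
   dbms_stats.gather_schema_stats(
      ownname => 'SCOTT',
      degree  => dbms_stats.auto_degree   -- or an explicit number such as 8
   );
end;
/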

Better execution speed

The dbms_stats utility is a great way to improve SQL execution speed. By using dbms_stats to collect top-quality statistics, the CBO will usually make an intelligent decision about the fastest way to execute any SQL query. The dbms_stats utility continues to improve and the exciting new features of automatic sample size and automatic histogram generation greatly simplify the job of the Oracle professional.


Many Oracle professionals do not realize the importance of having good statistics for the Oracle cost-based SQL optimizer (CBO). The CBO has received a "bum rap" from some Oracle professionals who wonder why it is generating sub-optimal execution plans for their SQL statements.

If you provide Oracle with good statistics about the schema, the CBO will almost always generate an optimal execution plan. The areas of schema analysis include:

  • Object statistics - Statistics for all tables, partitions, IOTs, etc. should be sampled with a "deep" and statistically valid sample size.
     
  • Critical columns - Those columns that are regularly referenced in SQL statements, including:
     
    • Heavily skewed columns - This helps the CBO properly choose between an index range scan and a full table scan
       
    • Foreign key columns - For n-way table joins, the CBO needs to determine the optimal table join order, and knowing the cardinality of the intermediate result sets is critical.
       
  • External statistics - Oracle will sample the CPU cost and I/O cost during statistics collection and use this information to determine the optimal execution plan, based on your optimizer_mode. External statistics are most useful for SQL running in the all_rows optimizer mode.

Frequency of re-analyze

It astonishes me how many shops prohibit any unapproved production changes and yet re-analyze schema stats weekly. Evidently, they do not understand that the purpose of schema re-analysis is to change their production SQL execution plans, and then they act surprised when performance changes!

Most Oracle experts only recommend scheduled re-analysis for highly dynamic databases, and most shops save one very deep sample (with histograms), storing the statistics with the dbms_stats.export_schema_stats procedure. The only exceptions are highly volatile systems (e.g., lab research systems) where a table is huge one day and small the next.
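Here is a minimal sketch of that save-a-deep-sample approach; the statistics table name PROD_STATS and the statid are arbitrary placeholders:

-- One-time setup: create a table to hold exported statistics
exec dbms_stats.create_stat_table(ownname => 'SCOTT', stattab => 'PROD_STATS');

-- Take the deep sample (with histograms) and save a copy of it
begin
   dbms_stats.gather_schema_stats(
      ownname    => 'SCOTT',
      method_opt => 'for all columns size skewonly',
      cascade    => true
   );
   dbms_stats.export_schema_stats(
      ownname => 'SCOTT',
      stattab => 'PROD_STATS',
      statid  => 'DEEP_SAMPLE'
   );
end;
/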

For periodic re-analysis, many shops use the table "monitoring" option and the method_opt "auto" setting once they are confident that all needed histograms are in place.

Saving and re-using stats

For data warehouses and databases using the all_rows optimizer_mode, Oracle9i release 2 has the ability to collect the external cpu_cost and io_cost metrics. The ability to save and re-use schema statistics is important for several types of Oracle shops:

  • Bi-Modal shops - Many shops get huge benefits from using two sets of stats, one for OLTP (daytime), and another for batch (evening jobs).
     
  • Test databases - Many Oracle professionals will export their production statistics into the development instances so that the test execution plans more closely resemble the production database.
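For the test-database case, a hedged sketch (assuming the PROD_STATS statistics table from the example above has already been copied to the development instance, for example with export/import) would be:

-- On the development instance, load the saved production statistics
begin
   dbms_stats.import_schema_stats(
      ownname => 'SCOTT',
      stattab => 'PROD_STATS',
      statid  => 'DEEP_SAMPLE'
   );
end;
/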

Getting top-quality stats

Because Oracle schema statistics work best with external system load, I like to schedule a valid sample (using dbms_stats.auto_sample_size) during regular working hours. For example, here we refresh statistics using the "auto" option which works with the table monitoring facility to only re-analyze those Oracle tables that have experienced more than a 10% change in row content:

begin
   dbms_stats.gather_schema_stats(
      ownname          => 'SCOTT',
      options          => 'gather stale',   -- only re-analyze tables with >10% changes
      estimate_percent => dbms_stats.auto_sample_size,
      method_opt       => 'for all columns size auto',
      degree           => 7
   );
end;
/

Tips for optimizing the CBO with statistics

There are several tips for optimizing your CBO with good statistics:

  • Find skewed columns that are referenced in SQL - Many shops do not use method_opt=skewonly and suffer from poor execution plans on skewed column access.
     
  • Find histograms for foreign key columns - Many DBAs forget that the CBO must have foreign-key histograms in order to determine the optimal table join order (i.e. the ORDERED hint).
     
  • Fix the cause, not the symptom - For example, whenever I see a sub-optimal order for table joins, I resist the temptation to add the ORDERED hint, and instead create histograms on the foreign keys of the join to force the CBO to make the best decision.
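For example, a histogram on a single foreign-key column can be requested explicitly. This is only a sketch; the ORDERS table and customer_id column are placeholders:

begin
   dbms_stats.gather_table_stats(
      ownname    => 'SCOTT',
      tabname    => 'ORDERS',
      method_opt => 'for columns customer_id size 254'  -- histogram on the foreign key
   );
end;
/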

For new features, explore the Oracle10g automatic histogram collection mechanism that interrogates v$sql_plan to see where the foreign keys are used. It claims to generate histograms when appropriate, all automatically. 


Oracle guru Guy Harrison also offers this advice for 11g statistics collection:  

"In 11g, I think there are two other ways to get statistics collected for indexed expressions:

1) Collect extended statistics directly on the expression. So for instance, if we had a function SALES_CATEGORY, we might do this:

DBMS_STATS.gather_table_stats
   (ownname => USER,
    tabname => 'SALES',
    method_opt => 'FOR ALL COLUMNS FOR COLUMNS
   (sale_category(amount_sold))' );

2) Create a virtual column on the expression, then index that column. So for the same example as above we might create the following virtual column, then index the column and collect stats as usual:

ALTER TABLE
   SALES
ADD
   sales_category
GENERATED ALWAYS AS
   (sale_category(amount_sold));

I think I like the first method better, because the statistics will still exist even if the index is dropped and, unlike the second approach, it doesn't change the logical structure of the table."
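To round out the second approach in the quote above, a sketch of the "index the column and collect stats as usual" step might look like this (the index name is a placeholder, and this block is not part of Guy Harrison's original text):

create index sales_category_idx on sales (sales_category);

begin
   dbms_stats.gather_table_stats(
      ownname    => user,
      tabname    => 'SALES',
      method_opt => 'for all columns size auto',
      cascade    => true
   );
end;
/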

 

Arup Nanda has a great article on extended statistics with dbms_stats, covering specialty histogram analysis of function-based column data:

Next, re-gather statistics on the table and collect the extended statistics on the expression upper(cust_name).

begin
  dbms_stats.gather_table_stats (
     ownname    => 'ARUP',
     tabname    => 'CUSTOMERS',
     method_opt => 'for all columns size skewonly for columns (upper(cust_name))'
  );
end;
/
 

Alternatively, you can define the column group as part of the gather-statistics command. You do that by placing these columns in the method_opt parameter of the gather_table_stats procedure in dbms_stats, as shown below:
 

begin
   dbms_stats.gather_table_stats (
      ownname          => 'ARUP',
      tabname          => 'BOOKINGS',
      estimate_percent => 100,
      method_opt       => 'FOR ALL COLUMNS SIZE SKEWONLY FOR COLUMNS(HOTEL_ID,RATE_CATEGORY)',
      cascade          => true
   );
end;
/
 

 
