Oracle Training Oracle Support Development Oracle Apps

 
 Home
 E-mail Us
 Oracle Articles
New Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB  

Don Burleson Blog 


 

 

 


 

 

 

 


"At What Levels Should We Aggregate An Analytic Workspace?"
June 29, 2005
Mark Rittman

"Are there any sensible rules we should apply when choosing aggregation levels. e.g. should you always aggregate at the top and bottom levels? Should you only aggregate at the levels that give you the most savings (e.g. aggregating from period to year should in theory reduce by a factor of 12)? Should we only consider aggregation at levels that support the current reporting requirements. I guess we don't go crazy and aggregate at all possible levels?"

When you load base-level data into an analytic workspace, if you want data to appear at higher levels of aggregation you have to have put an aggregation plan in place. In this respect analytic workspaces are a bit different to relational tables - if you want to sum up your sales figures to the regional and product category levels, you just SUM the measures in the fact table and do a GROUP BY to aggregate your data on the fly. With analytic workspaces, data is only normally only loaded in at the bottom level of a dimension hierarchy, and all the cells in the measure that hold higher level data are empty until you specify and execute an aggregation plan.

According to the Oracle By Example tutorial that comes with the Global Sample Schema,

"In analytic workspaces, aggregate data can be generated at two distinct times:

  • On the fly in response to a query. Calculated values may be cached for use throughout the session, but they are not shared among sessions.
  • As part of the build procedure. Calculated values are stored as part of the analytic workspace and shared by all sessions.

If your dimensions have multiple hierarchies or if the hierarchies have many levels, then fully aggregating the measures can increase the size of your analytic workspace (and thus your database) geometrically. At the same time, much of the intermediate level data may be accessed infrequently or not at all.

The most effective method of summarizing data in an analytic workspace is by storing some aggregates and calculating others on the fly. A typical strategy for doing this is called skip-level aggregation, because some levels are stored and others are skipped until runtime."

The question of course is which levels to aggregate, and which ones to skip. According to the tutorial:

"A good strategy for identifying levels for pre-aggregation is to determine the ratio of dimension members at each level, and to keep the ratio of members to be rolled up on the fly at approximately 10:1. This ratio assures that all answer sets can be returned quickly. Either the data is stored in the analytic workspace, or it can be calculated by rolling up 10 values at a time into each aggregate value.

This 10:1 rule is best applied with some judgment. You might want to permit a higher ratio for levels that you know are seldom accessed. Or you might want to pre-calculate levels at a lower ratio if you know they have heavy use.

Slower varying dimensions take longer to aggregate because the data is scattered throughout its storage space. If you are optimizing for data maintenance, then fully aggregate the faster varying dimensions and use skip-level aggregation on the slower varying dimensions."

Which leaves the question as to whether you should always aggregate the top and bottom levels. In terms of the top level, in one respect it depends on whether it meets the criteria that would cause any other level to be pre-aggregated: is it frequently accessed, and does it have sufficient children to warrant performing the aggregation in advance. One thing to bear in mind here though is whether your reports all default to the top level of the hierarchy when they are first loaded - if they do, then it might be worth pre-aggregating at the top level anyway, just to ensure that reports all come up straight away and there is no perceived delay in bringing up your reporting application.

With regard to the bottom level, the reason you might (or might not) specify this level of aggregation is when you are considering aggregation across all the levels in a dimension. When you think of it this way - what combinations of dimension levels do you want to pre-summarize at - then you will need to consider the base levels in any aggregation plan, as you might want to presummarise your data at the product group, customer region and month level combination if this would speed up a common query. The important thing to realize here is that we're not talking about summarizing your sales measure by time, or products, or customers, in isolation - if we were, pre-summarizing at the base level would be pointless, as there'd be nothing to pre-summarize - we're talking about the "matrix" of all possible dimension level combinations, and clearly in this case we might want to pre-summarize sales by product category, country (both of which are higher levels in their respective dimensions) and month (which in our case is the lowest level in the time dimension).


 

 
��  
 
 
Oracle Training at Sea
 
 
 
 
oracle dba poster
 

 
Follow us on Twitter 
 
Oracle performance tuning software 
 
Oracle Linux poster
 
 
 

 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  

and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.

Remote Emergency Support provided by Conversational