Oracle Training Oracle Support Development Oracle Apps

 
 Home
 E-mail Us
 Oracle Articles
New Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB  

Don Burleson Blog 


 

 

 


 

 

 

 


Testing the Incremental Load Improvements in 10gR2 OLAP - Part I
December 23, 2005
Mark Rittman

One of the questions that often comes up when I'm helping people with Oracle OLAP, is what is the most efficient way to load incremental data into a cube? Take for example a situation where you have an initial load of 8000 rows into a cube that has five dimensions, and subsequently you need to load another 2000 odd rows; do you rebuild the cube from scratch when you need to add the further 2000 rows, or do you leave the cube as it is and add the 2000 rows on top of what's already there, either as a separate load on their own or mixed in with the data that was already loaded? If you're using AWM10g, do you select the "Aggregate the full cube" option, or the "Aggregate just for incoming values" option, and if you choose the latter what benefits does it bring?

This is actually one of the questions I've addressed in the forthcoming article I've written for DBAZine on Oracle Database 10g OLAP performance tuning, but I don't think I'm giving too much away when I tell you that, with 9iR2 and 10gR1, it actually doesn't make any difference at all whether you choose to aggregate just for new values or for all values in the cube, or whether you separate out the new data and load it in on it's own, or you just reload the entire dataset, old data and new - the aggregation process takes the same time whatever. However, one of the new things promised with 10gR2 is that the loading of incremental data is much faster than with 9iR2 or 10gR1, as this excerpt from the Oracle OLAP 10gR2 PDF shows:

Quite an impressive improvement. With this in mind, I thought I'd load up the dataset that I used for the article and see what sort of improvements I could find.

The cube consisted of two measures (order quantity, ship quantity) and five dimensions (product, customer, channel, time and promotion). The source table for the measures contained 11429 rows, and I split this into two load batches, one of 8884 rows, and 2545 in the second. I then put together four scenarios:

  1. Load the first 8884 rows into the cube, select "Aggregate the full cube", then load these 8884 rows plus 2545 additional rows, again selecting "aggregate the full cube".

  2. Load the first 8884 rows as before, select "Aggregate the full cube", then load these 8884 rows plus 2545 additional rows, but this time select "aggregate for incoming values only"

  3. Load the first 8884 rows as before, then additionally load just the 2545 new rows and select "aggregate the full cube"

  4. Load the first 8884 rows as before, then additionally load just the 2545 new rows, but this time select "aggregate for incoming values only"

These scenarios were tested on 10gR1 and 10gR2 on Windows XP SP2, with 10gR2 being the developer preview that's available on OTN. 10gR2 was running in a VMWare virtual machine with 1GB of RAM allocated, whilst 10gR1 was running on the OS proper. All settings were left as the default, with no changes to SGA_TARGET, PGA_AGGREGATE_TARGET and so on.

With 10g Release 1, the results looked like this:

As I said earlier, there's no real difference between any of the load scenarios; each of them took around the same time to process the additional data. Also, the rate at which incremental data is processed is about one-fifth of the rate of the original data load - my take on this is that the load routine within Oracle OLAP has to "merge" the incoming data in with existing cells in the measures, which is a lot slower than straight "inserts". Nonetheless, there's no performance boost to be had by either selecting "aggregate incoming values only", or by separating out new data from old. Infact, if you look at the last scenario, doing a full load of all the 11k rows, you'll see that it's actually quicker to rebuild the cube from scratch than to try and add in an incremental load.

With 10g Release 2 (remember, it's running in VMWare virtual machine, so we're not strictly comparing apples and apples, and it's beta code) the results look like this:

A few points to note here:

  • The rate at which initial data is loaded is about 50% faster than 10gR1 - not bad.

  • When the new data is mixed in with the old and reloaded into the cube, 10gR2 is still faster than 10gR1, but the benefit is less than with our initial data load

  • If the new data is split out from the old before it's loaded, 10gR2 is up to twice as fast as 10gR1 for loading incremental data. Strangely, it didn't seem to matter whether I selected the "aggregate just incoming values" option or not, in fact in the tests I carried out the load was actually faster if I just selected "aggregate all values".

  • With these performance improvements, it's now worthwhile doing an incremental load, rather than rebuilding your cube from scratch each time.

I ran the tests a few times to try and get some stable / averaged out figures, but you should bear in mind that the dataset is comparatively small and the incremental load "boosting" algorithm may provide more benefits the larger the size of your cube.

So, what this says then is that there are performance improvements in 10gR2 for loading incremental data, but (from my tests, anyway) they are found when you just load just the new data into a cube, rather than reloading the entire dataset including your new data. I wasn't however able to see any performance benefit from choosing the "aggregating incoming values only" option - I seemed to get this benefit whether the option was selected or not. The performance benefit isn't huge - certainly not the 18-fold improvement that the Oracle slide suggests - but it is there, and it made the difference for me between reloading the cube from scratch each time, to putting in place a system where we perform an incremental load of just the new rows.

Anyone else had a play around with this, perhaps got closer to the improvements in the slide? Let me know if you do.


 

 
��  
 
 
Oracle Training at Sea
 
 
 
 
oracle dba poster
 

 
Follow us on Twitter 
 
Oracle performance tuning software 
 
Oracle Linux poster
 
 
 

 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  

and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.

Remote Emergency Support provided by Conversational