Home
E-mail Us
Oracle Articles
New Oracle Articles

Oracle Training
Oracle Tips
Oracle Forum
Class Catalog

Remote DBA
Oracle Tuning
Emergency 911
RAC Support
Apps Support
Analysis
Design
Implementation
Oracle Support

SQL Tuning
Security
Oracle UNIX
Oracle Linux
Monitoring
Remote support
Remote plans
Remote services
Application Server
Applications
Oracle Forms
Oracle Portal
App Upgrades
SQL Server
Oracle Concepts
Software Support
Remote Support
Development
Implementation

Consulting Staff
Consulting Prices
Help Wanted!

Oracle Posters
Oracle Books
Oracle Scripts
Ion
Excel-DB

Don Burleson Blog

Oracle Data Warehouse and Online Analytical Processing (OLAP)

Oracle Data Warehouse Tips by Burleson Consulting

Online Analytical Processing And Oracle

Dr. E. F. (Ted) Codd coined the phrase online analytical processing (OLAP) in a 1993 white paper called Providing OLAP (Online Analytical Processing) to User-Analysts: An IT Mandate. Soon after the publication of this paper, OLAP became the latest buzzword in the database arena and every IS professional struggled to understand OLAP and how it fit into the paradigm of decision support systems (DSS) applications. In addition to defining OLAP, Dr. Codd also went on to create 12 rules for OLAP which are similar in form to Codd?s 12 rules for relational databases. Given the recent popularity of OLAP, it is very easy to view OLAP as a nascent technology. However, OLAP, as defined by Codd, is not a new technology, and some products, such as the IRI Express OLAP engine, have been available for more than 20 years.

As more companies embrace the concept of creating a historical data repository for OLAP and DSS applications, new client/server issues are emerging as developers struggle to create Oracle-based client/server applications. This chapter reviews OLAP with a focus on the various techniques used to interface OLAP applications with Oracle databases.

A great deal of interest has surfaced in the application of data warehousing and multidimensional databases for advanced systems. Advanced systems, such as expert systems and decision support systems have been used for decades to solve semi-structured and even unstructured problems. Traditionally, these types of systems combine inference engines and relational databases in order to store the knowledge processing components, and they have all done these tasks without the benefit of having a huge data warehouse..

OLAP News

The explosive interest in OLAP has fueled the creation of a popular new Internet newsgroup[MM89] dedicated to OLAP issues. You can find the newsgroup at comp.database.olap.

OLAP Fundamentals

Essentially, an OLAP system is any system that captures summarized information and allows the summaries to be displayed as cross-tabulations between two variables. In the following example, we explore OLAP using the Excel spreadsheet pivot table feature.

The terms OLAP and multidimensional database have become synonymous, thus adding to the confusion surrounding the two terms. Essentially, a multidimensional database is a database architecture that stores summarized information such that all salient data items (called dimensions) are cross-referenced with each other (see Figure 5.1). For example, a multidimensional database might store sales totals cross-referenced by month, product line, territory, and salesperson. The multidimensional database could then display the sales totals according to the possible combinations of cross-referenced tables shown in Table 5.1.

Fact Dimensions

Total Sales Month by product line

Total Sales Month by territory

Total Sales Month by salesperson

Total Sales Product line by territory

Total Sales Product line by salesperson

Table 5.1 Possible plots of facts and dimensions.

Figure 5.1 A sample OLAP display.

So, if this is a multidimensional database, then what is OLAP? OLAP is a presentation front end that allows end users to choose the dimensions and facts that will be cross-referenced. The data does not necessarily have to come directly from a multidimensional database. The possible data sources for an OLAP application include:

* Multidimensional databases

* Relational databases (via ROLAP tools)

* Spreadsheet data (extracted from any database architecture)

To illustrate, let?s take a close look at a simple OLAP application. Note that while the following example may seem overly simplistic, many Oracle shops are using this very technique to deliver OLAP applications to their end users. Excel spreadsheets can hold very large datasets, and in many cases, managers can get their data loaded entirely into their spreadsheet, thereby alleviating the need to create a client/server Oracle application. Instead of using a client/server application, the summary data for a spreadsheet is refreshed from Oracle each day, ensuring that the manager has current data for the OLAP application.

Using Excel Pivot Tables For OLAP

The Microsoft Excel spreadsheet product is an excellent way to demonstrate the capabilities of an OLAP application. Many companies extract de-normalized information from their Oracle databases, FTP the data to a PC LAN, and load the file into a very large Excel spreadsheet on a client PC (see Figure 5.2). The manager can then use Excel?s Pivot table wizard to analyze the data, just as if they were using a multidimensional database. Of course, this approach requires a very powerful PC, but it is a very fast, simple, and straightforward approach to simulating a multidimensional database with an OLAP application.

Figure 5.2 Extraction of summarized data for PC-based OLAP.

The Oracle data extraction is usually invoked from a Unix crontab task, which invokes an SQL script to extract the summarized data from Oracle, format the data, and pipe it to a flat file. Using NFS, a mount on a PC LAN can be created so that the extracted data is transferred directly to the PC disk. The data can then be easily loaded into a spreadsheet, where the wizard will allow the manager to perform complex cross-tabulations and decision support queries.

Note: To get the maximum benefit from this exercise, I recommend that you follow along on your PC. Simply start Microsoft Excel, and load the file called pivot.xls from the CD-ROM in the back of this book. This short exercise will only take about ten minutes, and it has proven of be the best method for understanding multidimensional databases.

Step 1: Load The Sample Spreadsheet

The first step of this exercise is to load the sample spreadsheet named pivot.xls. You can find this spreadsheet on the CD-ROM included with this book.

Please note that the input data that is provided within pivot.xls is highly de-normalized and ready to use. Each row represents a line item within an order, and there is a high degree of duplicated data. Also, note that every field is a potential key except for qty-sold and total-cost. Getting this file onto a PC platform involves writing an SQL statement that will extract the required columns while joining the salient Oracle tables (see Figure 5.3). The SQL is generally executed by invoking Oracle?s SQL*Plus from a cron task on Unix. The resulting flat file will then be transferred to a PC LAN by using FTP, where it will be loaded into the Excel spreadsheet. Figure 5.3 shows what the spreadsheet should look like.

Figure 5.3 De-normalized Oracle data after transfer into Excel spreadsheet.

Step 2: Invoke The Pivot Table Wizard

The second step in this exercise is to invoke the pivot table wizard. To accomplish this, choose Data|Pivot Table. The wizard should appear, and it will enable you to define the fact and dimension columns for the pivot table. Perform the following steps:

1. In the first wizard screen, choose Data From Excel List, and press Next.

2. Verify the data range (a1:l47), then press Next.

3. Finally, specify the axis names by highlighting your choice, and then drag and drop them onto the pivot table axis, as shown in Figure 5.4, then press Finish.

Figure 5.4 Specifying the axis for a pivot table.

Now, you should have a pivot table defined and displayed on your screen which looks similar to the one shown earlier in Figure 5.1.

Because the entire table is in addressable memory, it is very simple to change the axis of the pivot table and completely recompute the table. To accomplish this, simply press the Pivot Table Wizard button, which appears immediately above the pivot table. You can now redefine the dimensions of the pivot table and create a new representation of your data, as shown in Figures 5.6 and 5.7.

Figure 5.6 Quantity with year-quarter against region.

Figure 5.7 Quantity with region against quarter.

Now, note that we can nest dimensions to create a hierarchy of dimensions. This is a very powerful feature of multidimensional databases and relates to our discussion of attribute hierarchies presented in Chapter 4, Oracle Data Warehouse Design. Figure 5.8 shows an example of nested dimensions, with the state dimension nested within the region dimension. It is also possible to quickly change a displayed fact. For example, in Figure 5.9, the fact is changed from total sales to total cost.

Figure 5.8 Quantity with region-state against year.

Figure 5.9 Total cost with region against salesperson.

Figures 5.10 and 5.11 show how you can quickly rotate dimensions to display total sales against the customer and item number dimensions. It should be obvious that there is incredible power in this ability to change dimensions and quickly recompute facts.

Figure 5.10 Total sales with month against customer.

Figure 5.11 Total sales with month against item number.

As you can see, a pivot table gives a manager the ability to analyze information in any conceivable combination.

Even data that does not appear directly can be displayed. For example, let?s say that normally we only display the sum of cost or the sum of sales, but today, we want to drag a dimension into the fact area of the pivot table. Dragging a non-numeric data attribute into the computation area of a pivot changes the fact from a sum to a count. As you can see in Figure 5.12, the number of orders that have been placed within each city are now counted and display by month. In this fashion, and data attribute can be counted, and it is not always necessary to include only numerical items in the body of the pivot table. Figure 5.13 shows the resulting pivot table, showing the counts of the number of orders within each region by month. We could count the number of any data attributes including the number of salespersons and customers. For example, we might want to count the number of customers, plotting this count by region and by month.

Figure 5.12 Counting within a pivot table.

Figure 5.13 The resulting pivot table of counts of orders.

Now that you understand how multidimensional database front ends function for decision support applications, let?s take a look at how data warehouses fit into multidimensional database frameworks.

If you like Oracle tuning, see the book "Oracle Tuning: The Definitive Reference", with 950 pages of tuning tips and scripts.

You can buy it direct from the publisher for 30%-off and get instant access to the code depot of Oracle tuning scripts.

��