Mark Rittman

How Do I Combine Relational And Multidimensional Data In A Single OLAP Cube?

"Everytime I do a web search for anything to do with OLAP, I always seem to be directed to your web-site!

So I have a query for you, having viewed your presentation on BI Beans. The Oracle9i Release 2 OLAP option appears to allow us to store both relational and multi-dimensional OLAP data within the Oracle database. Good news. And typically we may want to store raw data in the relational element, and aggregated data in the multi-dimensional. When a user is using a BI Beans application or Discoverer, they may not want to know about the subtlety of which element the data is stored in, but just to be able to drill down through the data, maybe starting with the aggregates, but having to move to the relational raw data when drilling to the lowest level. Have you come across a way to handle this in BI Beans and Discoverer?"

The ability to present a layer of abstraction above either relational or multidimensional data held in the database has always been presented as one of the major strengths of Oracle OLAP. By describing OLAP data using general terms such as 'cubes', 'dimensions' and 'measures', query tools such as Oracle Discoverer or BI Beans can refer to data in the OLAP Catalog without needing to know in what manner that data is actually physically stored.

In this instance, what we're trying to achieve is a situation where the base data is held at its lowest level in Oracle relational tables, and with an analytic workspace then being used to aggregate and store data at summarised levels. This is in contrast to the traditional method of summarising data and storing it, which is either carried out using materialized views or through separately maintained and populated summary tables. A tool like Discoverer Administrator can set up this sort of arrangement with a click of a few buttons, but it doesn't use analytic workspaces, which is what we're trying to achieve.

An Oracle OLAP cube, and its associated measures can, as far as I'm aware, only be sourced at present from a single table or (materialized) view in the database. This table can in fact be a virtual table defined using the OLAP_TABLE function, with its data provided from variables and dimensions in an analytic workspace, and in most cases the base data in the cube will need to be summarised by using the GROUP BY, or GROUP BY ... ROLLUP feature in SQL. If the base data for the cube comes from data held in relational tables, the GROUP BY or GROUP BY ... ROLLUP feature can be rewritten using Query Rewrite to use a materialized view, or if the base data comes from an analytic workspace, the summary data will most probably have been preaggregated and also stored in the analytic workspace.

If, in our case, we want our base data to be held in relational tables, and our summaries held in an analytic workspace, we'd first need to expose the analytic workspace summary through using SQL views via the OLAP_TABLE function. The problem is, that the query rewrite mechanism won't be able to make the connection between the GROUP BY / GROUP BY ... ROLLUP used to request the summary on the relational data, and the SQL used to expose the analytic workspace data through the OLAP_TABLE function. It'll never qualify for query rewrite as the SQL defining the base data and the analytic workspace will never match.

What this means then, is that whilst we can define a common view over our relational data, and our analytic workspace data, using the same definition of measures, cubes and dimensions, all we can ever do is swap out the relational cube for the multidimensional cube when providing a data source for the BI Beans report - we can't use one part for the base view and one part for the summary.

The good news, however, is that this is set to change with Oracle 10g OLAP. According to a recent paper by Bud Endress at Oracleworld, Oracle 10g will come with a new feature called 'Query Equivalence'. According to Bud's paper;

"In Oracle10g a new feature, query equivalence, allows query rewrite to be used with views. With query equivalence, the DBA indicates to the database what SQL could have been used to create the view even if the view was created in some other way. For example, if the application likes to emit SQL with SUM … GROUP BY but the view was created with entirely different SQL, the DBA could indicated that the view is equivalent to SUM … GROUP BY. This feature of the database is extremely useful with the OLAP option since SQL access is always through views. The DBA can create a view over an analytic workspace with syntax such as:

SELECT TIME, PRODUCT, CUSTOMER, SALES
FROM OLAP_TABLE …

And indicate to the database that the view is equivalent to:

SELECT TIME.TIME, PRODUCT.PRODUCT, CUSTOMER.CUSTOMER, SUM(FACT.SALES) …
GROUP BY …

If the application issues a query that is consistent with the equivalence of the view, such as the example below, the query will be automatically rewritten to the view over the analytic workspace.

SELECT TIME.TIME, PRODUCT.PRODUCT, CUSTOMER.CUSTOMER, SUM(FACT.SALES) …
GROUP BY …

This provides the DBA and application with benefits similar to those of materialized views – simplified maintenance and improved query performance."

It looks like this new 10g feature is exactly what is needed. By using Query Equivalence, the base data for our cube can be stored in relational tables, whilst the summary data is held in an analytic workspace. Query Rewrite will work as required, and users do not need to worry about whether data is stored relationally or multidimensionally.

��