Call now: 252-767-6166  
Oracle Training Oracle Support Development Oracle Apps

 
 Home
 E-mail Us
 Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB   


 

 

 


 

Hierarchical attribute design for an Oracle8i warehouse
Jan 18, 2001 - Donald Burleson

In an Oracle data warehouse, hierarchies are especially difficult to maintain. Unlike finite attributes, hierarchical classifications may assign many values when an item is added to an Oracle data warehouse. The hierarchical classification displayed in Figure A shows a hierarchical attribute classification for a product.

Figure A
A sample classification hierarchy

For example, when a transaction for the purchase of chicken wings is recorded, attributes made for the transaction need to include the sale of a chicken product, a meat product, and a food product. So a mechanism needs to be devised to store the hierarchy of attributes in a meaningful way and to insert the proper values each time a row is added to the fact table.

In our example, we have the following two methods for representing our hierarchy of product attributes:

  1. Assign an arbitrary level classification for each level in the hierarchy. Because each level is mutually exclusive (i.e., the product can't be both food and nonfood), the following scheme would be the most efficient use of storage in the fact table:

level_one char(1)
CONSTRAINT level_one_check
CHECK (level_one in ('F','N');
level_two char(1)
CONSTRAINT level_two_check
CHECK (level_two in ('A','V','P','T');
level_three char(1)
CONSTRAINT level_three_check
CHECK (level_three in ('U','B','C');


In our oversimplified example, level_one can be food or nonfood; level_two can be animal, vegetable, paper, or toy; and level_three can be 'U' for unclassified (in cases where a level three classification does not apply), beef, or chicken. The advantage of this representation of the hierarchy is that it will add only 3 bytes to each fact row. The downside is that queries are cryptic for end users. End users will have to know the valid values for each classification level in their queries. For example, to query for the total sales of nonfood items by customers who earn more than $50,000 per year would require that the user know the proper flag values, as shown below:


SELECT sum(sale_amount) from FACT
WHERE
level_one = 'N'
AND
yearly_income_class > 3;

  1. Assign a descriptive name for each level in the hierarchy. This method assigns descriptive values for each product class, thereby making it easier for the end users to query the fact table. For example:

food_flag char(1)
CONSTRAINT food_check
CHECK (food_flag in ('Y','N');
animal_or_vegetable_flag char(1)
CONSTRAINT animal_check
CHECK (animal_or_vegetable_flag in ('A','V','U');
type_of_meat_flag char(1)
CONSTRAINT meat_check
CHECK (type_of_meat_flag in ('F','C','P','B');

. . .

The benefit of this approach is that the queries can be made very descriptive. For example:


SELECT *
from
FACT
WHERE
food_flag = 'N'
AND
yearly_income_class = 2;


The downside, of course, is that there will be many more flags in the fact table, and because the values are mutually exclusive at each level, most of the flags will contain NULL values. There will also be far more indexes on the fact table, and they will slow down the nightly batch update process. As data warehouses grow into the terabyte range, even a few additional bytes can have a substantial disk cost. In addition, the type of front end will influence the decision about the type of flag. If your application hides the SQL behind a front-end query tool, this type of approach has no real advantage.

Again, these physical attribute representation issues go to the very heart of the Oracle data warehouse, and intelligent up-front planning will ensure a sound, robust system.

Rollups and summaries


Oracle8i provides a wealth of tools that assist in the hierarchical classification of fact attributes, but the DBA must make special plans to accommodate these hierarchies.

You can use Oracle data warehouse tools such as Oracle Express to automatically display summary values and classification hierarchy. Oracle8i also provides materialized views that allow Oracle SQL to be automatically translated from the lowest-level facts and popped into the aggregation hierarchies. Let's take a look at how we can manually summarize information into classification three.

A supermarket needs to know in advance that prune juice belongs to the juice category, which in turn participates in the fruit category, which in turn participates in the edible food groups category. Once these facts are established, we can write Oracle SQL that will read classification definitions and summarize the information based upon the unique idea of the item within the supermarket.

For example, if we assign a distinct internal category for each SKU, a lookup table might look like this:


CREATE TABLE CATEGORIES
(
category_name char(80),
category_number number
)

Figure B shows how the table entries would appear.

Figure B

Now that we have classified each supermarket item with a unique category, the classification definition three might look something like this:


CREATE TABLE category_cross_reference
(
has_category number,
is_a_category number
)


This table represents a recursive many-to-many relationship between categories. This means that any single category may have categories and at the same time participate as a smaller unit of a larger category. For example, the juice category has subcategories, bottled juices and frozen juices. At the same time, the juice category participates in the larger fruit category.

Figure C shows what the rows in this table would look like.

Figure C



From this table, we can re-create the classification hierarchy for any items we require. For example, we can use the following SQL to find all items within the Fruit category:


SELECT
Has_categories
FROM
Category_cross_reference a,
Category_cross_reference b
WHERE
a.has_categories = b.is_a_category
AND
a.is_a_category = 'Fruit';


This is an interesting SQL query because it joins the table against itself. In this fashion, we can "walk" the recursive relationships in two directions. We can find all categories beneath the specified category, or we can display all items above the specified category.

As we can see from this SQL, each classification is preprogrammed with its relationships to other categories, such that the Oracle SQL will be able to instantly determine where an individual item belongs within the classification hierarchy.

Taking this table, we can write an Oracle SQL summary statement that will automatically compute the sum of all sales within each category. For example, the SQL below can be used to compute the sales of all juice products within the supermarket. Furthermore, if we know that people with certain buyer demographics are interested in juice products, queries such as this can be used for target marketing of these products.


CREATE TABLE
Summary_juice_sales
AS
SELECT
Sum(sales)
FROM
Fact
WHERE
Sales.category in
(SELECT
Has_categories
FROM
Category_cross_reference a,
Category_cross_reference b
WHERE
a.has_categories = b.is_a_category
AND
a.is_a_category = 'Juice'
)
;


This type of query is repeated for all relevant categories in the database. In most Oracle8i warehouses, these rollup summaries are scheduled to run overnight, so the summary tables are available during the normal workday.

Using summary tables


Most Oracle8i data warehouses precreate these summary tables to improve response times. The secret of Oracle8i data warehousing is to provide the end users with the "illusion" of instant response time. As fast as Oracle is, a large data warehouse query involving millions of rows can run for many hours.

However, we have now created dozens, if not hundreds, of summary tables. How do we easily direct the end-user queries to use these queries? Oracle8i provides two mechanisms for using summary tables. The first is materialized views. In a materialized view, the Oracle8i SQL parse-searches for any references to sum functions and rewrites the query automatically to use the summary table. The Oracle8i documentation provides extensive information on this technique. Another popular approach is to use the Oracle Express front end. Within Oracle Express, the summary tables can be identified and automatically used for the queries.

Conclusion

As Oracle data warehouses become more complex, the DBA is always challenged to provide mechanisms to allow end users to have fast access to summaries of millions of rows of detailed sales data. Only with the preclassification and intelligent rollup of information can an Oracle data warehouse provide instantaneous access to the relevant data the end user requires for making important business decisions.




 


 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2012 

All rights reserved.

Oracle © is the registered trademark of Oracle Corporation.