Decision Support Systems And Data
Warehouses
Decision support systems (DSS) are generally
defined as the class of systems that deal with solving
semi-structured problems. In other words, the task has a structured
component as well as an unstructured component. The
unstructured component involves human intuition and requires human
interaction with the DSS. The well-structured components of a DSS
are the decision rules stored in the problem-processing system; the
intuitive, or creative, component is left to the user.
The following represent some examples of
semi-structured problems:
* Choosing a spouse. While there are
many structured rules (I want someone of my religion, who is shorter
than me), there is still the unstructured, unquantifiable component
to the process of choosing a spouse.
* Choosing a site for a factory. This is
a nonrecurring problem that has some structured components (cost of
land, availability of workers, and so on), but there are many other
unstructured components in this decision (e.g., quality of life).
* Choosing a stock portfolio. Here the
structured rules are the amount of risk and the performance of
stocks, but the choice of stocks for a portfolio requires human
intuition.
Decision support technology recognizes that
many tasks require human intuition. For example, the process of
choosing a stock portfolio is a task that has both structured and
intuitive components. Certainly, rules are associated with choosing
a stock portfolio, such as diversification of the stocks and
choosing an acceptable level of risk. These factors can be easily
quantified and stored in a database system, allowing the user of the
system to create what-if scenarios. However, just because a system
has well-structured components does not guarantee that the entire
decision process is well-structured.
One of the best ways to tell if a decision
process is semi-structured is to ask the question, "Do people with
the same level of knowledge demonstrate different levels of skill?"
For example, it's possible for many stock brokers to have the same
level of knowledge about the stock market. However, these brokers
will clearly demonstrate different levels of skill when assembling
stock portfolios.
Computer simulation is used heavily
within the modeling components of decision support systems. In fact,
one of the first object-oriented languages was SIMULA. SIMULA was
used as a driver for these what-if scenarios and was incorporated
into decision support systems so that users could model a particular
situation. The user would create a scenario with objects subjected
to a set of predefined behaviors.
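SIMULA itself is rarely seen today, but the idea is easy to sketch in a modern language. The following is a minimal illustration (in Python, not SIMULA) of a what-if simulation built from objects with predefined behaviors; the checkout-register scenario echoes the question raised later in this section, and the arrival and service rates are invented for illustration.

```python
import random

class Register:
    """A checkout register with one predefined behavior: serve one customer at a time."""
    def __init__(self):
        self.busy_until = 0.0

    def serve(self, arrival_time, service_time):
        start = max(arrival_time, self.busy_until)
        self.busy_until = start + service_time
        return start - arrival_time  # how long this customer waited

def average_wait(num_registers, num_customers=1000, seed=42):
    """Simulate customers arriving at random and picking the register that frees up first."""
    random.seed(seed)
    registers = [Register() for _ in range(num_registers)]
    clock = total_wait = 0.0
    for _ in range(num_customers):
        clock += random.expovariate(1.0)         # about one arrival per minute
        service = random.expovariate(1.0 / 2.5)  # about 2.5 minutes of service
        register = min(registers, key=lambda r: r.busy_until)
        total_wait += register.serve(clock, service)
    return total_wait / num_customers

# What-if: how much faster are customers served with two more registers?
print("3 registers:", round(average_wait(3), 2), "minutes average wait")
print("5 registers:", round(average_wait(5), 2), "minutes average wait")
```

Changing a single parameter and re-running the scenario is exactly the style of interaction a DSS user expects.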
In order to be a DSS, a system must have the
following characteristics:
* A nonrecurring problem needs to be
solved. DSS technology is used primarily for novel and unique
modeling situations that require the user to simulate the behavior
of some real-world problem.
* Human input is required. A DSS makes
decisions with users, unlike an expert system which makes decisions
for users.
* A method is available for testing
hypotheses. A true DSS allows the end user to develop models and
simulate changes to the model. For example, the end user could ask
questions like, "What will happen to my net return if I exchange my
IBM stock for Microsoft stock?" or "How much faster would I be able
to service my customers if I add two more checkout registers?"
* Users must have knowledge of the
problem being solved. Unlike an expert system that provides the user
with answers to well-structured questions, decision support systems
require the user to thoroughly understand the problem being solved.
For example, a financial decision support system, such as the DSSF
product, would require the user to understand the concept of a
stock's beta. Beta measures how an individual stock's returns move
relative to the behavior of the market as a whole (formally, the
covariance of the stock's returns with the market's returns, scaled
by the market's variance); a small calculation sketch follows this
list. Without an understanding of such concepts, a user would be
unable to use a decision support system effectively.
* Ad hoc data queries are allowed. As
users gather information for their decision, they make repeated
requests to the online database, with one query answer stimulating
another query. Because the purpose of ad hoc query is to allow
free-form queries to decision information, response time is
critical.
* More than one acceptable answer may be
produced. Unlike an expert system, which usually produces a single,
finite answer to a problem, a decision support system deals with
problems that have a domain or range of acceptable solutions. For
example, a user of DSSF may discover that many acceptable stock
portfolios match the selection criteria of the user. Another good
example is a manager who needs to place production machines onto an
empty warehouse floor. The goal would be to maximize the throughput
of work in process from raw materials to finished goods. Clearly,
she could choose from a number of acceptable ways of placing the
machines on the warehouse floor in order to achieve this goal. This
is called the state space approach to problem-solving--first a
solution domain is specified, then the user works to create models
to achieve the desired goal state.
* External data sources are used. For
example, a DSS may require classification of customers by Standard
Industry Code (SIC) or customer addresses by Standard Metropolitan
Statistical Area (SMSA). Many warehouse managers load this external
data into the central warehouse.
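To make the beta example concrete, here is a minimal sketch of how a beta value could be computed from historical returns. The return figures are invented, and a real system would use much longer return histories.

```python
def beta(stock_returns, market_returns):
    """Beta = covariance(stock, market) / variance(market), computed from return histories."""
    n = len(stock_returns)
    mean_s = sum(stock_returns) / n
    mean_m = sum(market_returns) / n
    covariance = sum((s - mean_s) * (m - mean_m)
                     for s, m in zip(stock_returns, market_returns)) / n
    variance = sum((m - mean_m) ** 2 for m in market_returns) / n
    return covariance / variance

# Hypothetical monthly returns; a beta above 1 means the stock amplifies market moves.
stock_history  = [0.04, -0.02, 0.06, 0.01, -0.03]
market_history = [0.03, -0.01, 0.04, 0.02, -0.02]
print(round(beta(stock_history, market_history), 2))
```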
Decision support systems also allow the user
to create what-if scenarios. These are essentially modeling tools
that allow the user to define an environment and simulate the
behavior of that environment under changing conditions. For example,
the user of a DSS for finance could create a hypothetical stock
portfolio and then direct the DSS to model the behavior of that
stock portfolio under different market conditions. Once these
behaviors are specified, the user may vary the contents of the
portfolio and view the results.
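As an illustration of the what-if idea, the following is a minimal Python sketch that models a hypothetical portfolio under different market conditions. The tickers, prices, and betas are invented, and the simple beta-times-market-move model is a deliberate simplification of what a real financial DSS would provide.

```python
# Hypothetical holdings and betas; market_move is the what-if variable.
portfolio = {
    "IBM":       {"shares": 100, "price": 105.0, "beta": 0.9},
    "Microsoft": {"shares":  80, "price": 130.0, "beta": 1.2},
}

def portfolio_value(holdings, market_move=0.0):
    """Estimate total value after the market moves by market_move (e.g., 0.10 = +10%)."""
    total = 0.0
    for stock in holdings.values():
        expected_move = stock["beta"] * market_move   # simple beta-based model
        total += stock["shares"] * stock["price"] * (1 + expected_move)
    return total

# Vary the market condition and view the results, as a DSS user would.
for scenario in (-0.10, 0.0, 0.10):
    print(f"market {scenario:+.0%}: portfolio worth {portfolio_value(portfolio, scenario):,.2f}")
```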
The types of output from decision support
systems include:
* Management information systems
(MIS)--Standard reports and forecasts of sales.
* Hypothesis testing--Did sales decrease
in the Eastern region last month because of changes in buying
habits? This involves iterative questioning, with one answer leading
to another question.
* Model building--Creating a sales
model and validating its behavior against the historical data in
the warehouse. Predictive modeling is often used to forecast
behaviors based on historical factors (a small sketch follows this list).
* Discovery of unknown trends--For
example, why are sales up in the Eastern region? Data mining tools
answer questions in those instances where you may not even know what
specific questions to ask.
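As a sketch of the model-building idea, the following fits a simple linear trend to hypothetical monthly sales drawn from the warehouse and forecasts the next period; real predictive models in a warehouse environment would be considerably richer.

```python
def linear_trend(history):
    """Fit sales = intercept + slope * period by ordinary least squares."""
    n = len(history)
    xs = list(range(n))
    mean_x = sum(xs) / n
    mean_y = sum(history) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, history))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical monthly sales pulled from the warehouse; forecast the next period.
sales = [120, 132, 128, 141, 150, 158]
intercept, slope = linear_trend(sales)
print("forecast for next month:", round(intercept + slope * len(sales), 1))
```

Validating such a model means comparing its fitted values against further historical periods held back from the fit.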
The role of human intuition in this type of
problem solving has stirred great debate. Decision support systems
allow the user to control the decision-making process, applying his
or her own decision-making rules and intuition to the process.
However, the debate over whether artificial intelligence should be
used to manage the intuitive component of these systems has strong
proponents on both sides.
Now that expert systems and decision support
systems have been described, let's take a look at how databases are
used to develop these systems.
Data Warehouses And Multidimensional
Databases
Multidimensional databases are approaching
the DSS market through two methods. The first approach is through
niche servers that use a proprietary architecture to model
multidimensional data; examples of such niche servers include Arbor
and IRI. The second approach is to provide multidimensional front
ends that manage the mapping between the RDBMS and the dimensional
representation of the data. Figure 1.15 offers an overview of the
various multidimensional databases.
Figure 1.15 The major types of
multidimensional databases.
In general, the following definitions apply
to data warehouses:
* Subject-oriented data--Unlike an
online transaction processing application that is focused on a
finite business transaction, a data warehouse attempts to collect
all that is known about a subject area (e.g., sales volume, interest
earned) from all data sources within the organization.
* Read-only during queries--Data
warehouses are loaded during off-hours and are used for read-only
requests during day hours.
* Highly denormalized data
structures--Unlike an OLTP system with many "narrow" tables, data
warehouses pre-join tables, creating fat tables with highly
redundant columns.
* Data is pre-aggregated--Unlike OLTP,
data warehouses pre-calculate totals to improve runtime performance.
Note that pre-aggregation is anti-relational: the relational model
advocates storing only atomic data components and building aggregate
objects at runtime. (A small load-time aggregation sketch follows
this list.)
* Features interactive, ad hoc
query--Data warehouses must be flexible enough to handle spontaneous
queries by users. Consequently, a flexible design is imperative.
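The load-time aggregation idea mentioned above can be sketched in a few lines of Python. The detail rows and rollup level are invented; in an Oracle warehouse the same pre-aggregation would normally be done in summary tables built during the load window.

```python
from collections import defaultdict

# Hypothetical detail rows as they arrive during the nightly load: (region, month, amount).
detail_rows = [
    ("East", "1997-01", 1200.0),
    ("East", "1997-01",  450.0),
    ("West", "1997-01",  980.0),
    ("East", "1997-02",  760.0),
]

# Pre-aggregate at load time: one total per (region, month) rollup level.
sales_by_region_month = defaultdict(float)
for region, month, amount in detail_rows:
    sales_by_region_month[(region, month)] += amount

# At query time the answer is a single lookup, not a scan of the detail rows.
print(sales_by_region_month[("East", "1997-01")])   # 1650.0
```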
When we contrast the data warehouse with a
transaction-oriented, online system, the differences become
apparent. These differences are shown in Table 1.1.
|                               | OLTP          | Data Warehouse |
|-------------------------------|---------------|----------------|
| Normalization                 | High (3NF)    | Low (1NF)      |
| Table sizes                   | Small         | Large          |
| Number of rows/table          | Small         | Large          |
| Size/duration of transactions | Small         | Large          |
| Number of online users        | High (1000s)  | Low (< 100)    |
| Updates                       | Frequent      | Nightly        |
| Full-table scans              | Rarely        | Frequently     |
| Historical data               | < 90 days     | Years          |

Table 1.1 Differences between OLTP and data
warehouses.
Aside from the different uses for data
warehouses, many developers are using relational databases to build
their data warehouses and to simulate multiple dimensions, relying
on design techniques such as the STAR schema. This push toward STAR
schema design has been somewhat successful, especially because
designers do not have to buy a multidimensional database or invest
in an expensive front-end tool. In general, using a relational
database for OLAP is achieved by any combination of the following
techniques:
* Pre-joining tables together--This is
another way of saying that a denormalized table is created from a
normalized online database. A large pre-join of several tables is
sometimes called a fact table in a STAR schema (a sketch follows
these lists).
* Pre-summarization--This prepares the
data for any drill-down requests that may come from an end user.
Essentially, the different levels of aggregation are identified, and
aggregate tables are computed and populated when the data is loaded.
* Massive denormalization--The side
effect of very inexpensive disks has been the rethinking of the
merits of third normal form. Today, redundancy is widely accepted,
as seen by the popularity of replication tools, snapshot utilities,
and non-first-normal-form databases. If you can pre-create every
possible result table at load time, your end user will enjoy
excellent response time when making queries. The STAR schema is an
example of massive denormalization.
* Controlled periodic batch
updating--New detail data is rolled into the aggregate table on a
periodic basis while the online system is down, with all
summarizations recalculated as the new data is introduced into the
database.
While data loading is important, it is only one part of populating a
warehouse. Several categories of tools can be used to populate
warehouses, including:
* Data extraction tools--Tools for
extracting data from different hardware platforms and databases.
* Metadata repository--Holds
common definitions.
* Data cleaning tools--Tools for
ensuring uniform data quality.
* Data sequencing tools--Tools for
enforcing referential integrity (RI) rules in the warehouse.
* Warehouse loading tools--Tools
for populating the data warehouse.
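To illustrate the pre-joining and denormalization techniques above, here is a minimal Python sketch that builds a fat fact table from hypothetical normalized source tables at load time. The table names and columns are invented, and a real STAR schema would, of course, live in the relational database rather than in program memory.

```python
# Hypothetical normalized source tables (keyed dictionaries standing in for OLTP tables).
customers = {1: {"name": "Acme", "region": "East"}}
products  = {10: {"name": "Widget", "category": "Hardware"}}
orders    = [{"customer_id": 1, "product_id": 10, "month": "1997-01", "amount": 500.0}]

# Load-time pre-join: each fact row carries redundant, denormalized dimension columns.
fact_sales = []
for order in orders:
    cust = customers[order["customer_id"]]
    prod = products[order["product_id"]]
    fact_sales.append({
        "month":    order["month"],
        "region":   cust["region"],     # redundant column copied from the customer table
        "category": prod["category"],   # redundant column copied from the product table
        "amount":   order["amount"],
    })

# Query time: no joins are needed, just a filter over the fat fact table.
east_total = sum(row["amount"] for row in fact_sales if row["region"] == "East")
print(east_total)
```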
Data Extraction For The Oracle Warehouse
As we know, most data warehouses are loaded
in batch mode after the online system has been shut down. In this
sense, a data warehouse is bimodal, with a highly intensive loading
window and an intensive read-only window during the day. Because
many data warehouses collect data from non-relational databases such
as IMS or CA-IDMS, no standard methods for extracting data are
available for loading into a warehouse. However, there are a few
common techniques for extracting and loading data, including:
* Log "sniffing"--Applying archived redo
logs from the OLTP system to a data warehouse.
* Using update, insert, and delete
triggers--Firing off a distributed update to a data warehouse.
* Using snapshot logs to populate the
data warehouse--Using log files to update replicated table changes.
* Running nightly extract/load
programs--Using extracts to retrieve operational data and load it
into a warehouse (a small sketch follows this list).
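As an example of the nightly extract/load approach, here is a minimal Python sketch that writes operational records to a flat load file during the batch window. The record layout and file name are invented; a real Oracle warehouse would typically bulk-load such a file with a utility such as SQL*Loader, as discussed in Chapter 11.

```python
import csv
from datetime import date

# Hypothetical operational extract: records pulled from the source system during the batch window.
operational_rows = [
    {"order_id": 1, "region": "East", "amount": "500.00"},
    {"order_id": 2, "region": "West", "amount": "275.50"},
]

# Write a flat load file; the warehouse load step would then bulk-load this file.
load_file = f"sales_extract_{date.today():%Y%m%d}.csv"
with open(load_file, "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["order_id", "region", "amount"])
    writer.writeheader()
    for row in operational_rows:
        writer.writerow(row)

print("wrote", load_file)
```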
For details about data extraction and
loading of Oracle warehouses, see Chapter 11, Oracle Data
Warehouse Utilities.