DATA WAREHOUSES AND ROI
A data warehouse project should never begin
unless management feels that the benefits of a data warehouse
outweigh the cost. With that basic assumption under our belts, let?s
look at some of the differences between how the ROI for a data
warehouse compares to the ROI for other information systems
development projects.
The International Data Corporation (IDC)
conducted a survey in 1996 that studied ROI for data warehouse
projects. IDC found the following trends in data warehouse projects:
* Very fast payback--The average ROI for
a data warehouse is far above the industry average. Corporations
with complex organizational and customer environments benefit the
most. IDC found that more than 60 percent of data warehouse projects
have a payback period of less than 2 years.
* Large ROI variance--The variance of
ROI among organizations ranges from 3 percent to 1,800 percent. The
low ROI values are attributed to very expensive data warehouse
projects that take several years to develop and have a small amount
of usage.
* Higher ROI for data marts--Larger data
warehouses have a lower ROI than smaller data warehouses. IDC found
that databases larger than 200 GB had smaller ROI values than
smaller data warehouses. This difference is attributed to the extra
work required to integrate and maintain the diverse data sources.
* Application area differences--There
are differences among data warehouse ROIs based on the type of
organization that develops the data warehouse. Data warehouses
designed to support engineering and operations tend to have the
highest ROI. This makes sense because, historically, manufacturing
organizations have been among the first to embrace data warehousing.
The IDC study also shows that European companies developing data
warehouses lag behind American companies by a 100 percent margin of
ROI, with European companies averaging 340 percent ROI and American
companies averaging 440 percent ROI.
The overall finding in the IDC study showed
that data warehouses are popular primarily because of the fast
payback period for the dollar investment. Interestingly, the payback
period most likely will become even shorter as data warehouse
developers create more intelligent queries against their data and
become more adept at locating and analyzing trend information.
Warehouse Project Management
When embarking on a data warehousing
project, many pitfalls can cripple the project. Characteristics of
successful data warehouse projects generally include the following
aspects:
* Clear business justification for the
project--Measurable benefits must be defined for a warehouse project
(e.g., sales will increase by 10 percent, customer retention will
increase by 15 percent). Warehouses are expensive, and the project
must be able to measure the benefits.
* Staff is properly trained--Warehousing
involves many new technologies, including SMP, MPP, and MDDB. The
staff must be trained and comfortable with the new tools.
* Insuring data quality and
consistency--Warehouses deal with historical data from a variety of
sources, so care must be taken to create a metadata manager that
ensures common data definitions and records changes of historical
data definitions.
* Insuring subject privacy--Gathering
data from many sources can lead to privacy violations. A good
example of privacy violation is the hotel chain that targeted
frequent hotel customers and sent a frequent-user coupon to their
home addresses. Some spouses intercepted these mailings, leading to
numerous divorces.
* Allow the warehouse to start small and
evolve--Some projects fail by defining too broad of a scope for the
project. Successful projects consider their first effort as a
prototype and continue to evolve from that point.
* Ensure intimate end-user
involvement--Data warehouses cannot be developed in a vacuum. The
system must be flexible to address changing end-user requirements,
and the end-users must understand the architecture so they are aware
of the limitations of their warehouse.
* Properly plan the infrastructure--A
new infrastructure must be designed to handle communications among
data sources. Parallel computers must be evaluated and installed,
and staff must be appropriately educated.
* Perform proper data modeling and
stress testing--The data model must be validated and stress tested
so that the finished system performs at acceptable levels. A model
that works great at 10 GB may not function as the warehouse grows to
100 GB.
* Choose the right tools--Many projects
are led astray because of vendor hype. Unfortunately, many vendors
inappropriately label their products as ?warehouse? applications, or
they exaggerate the functionality of their tools.
Basic Project Management
As a general definition, a project is any
set of tasks with a specific objective to be completed within
certain specifications (including defined start and end dates) that
consumes capital resources. Given this simplistic definition of a
project, let?s define what project management is and how it applies
to a data warehouse project.
For every large data warehouse project,
traditional management must be replaced by a new type of management
that is temporary and very flexible, with a fast reaction time, and
able to respond rapidly to both internal and external changes. With
this type of management in place, data warehouse project management
encompasses the following activities:
* Defining work requirements
* Defining the quantity of work
* Defining the resources needed
* Monitoring the project by:
* Tracking progress (dates and
milestones)
* Comparing actual figures to
predicted figures
* Analyzing the impact of changes
* Making adjustments to the
project
While these tasks may seem mundane,
effective project management is critical to the success of a data
warehouse. Successful project management is defined as meeting the
objectives of a project within project and cost constraints, while
maintaining a desired level of performance and fully utilizing the
proper technology.
To effectively fulfill the project
management functions listed previously, data warehouse project
managers must be able to:
* Identify function responsibilities and
ensure that all activities are accounted for.
* Minimize the need for continuous
reporting.
* Identify the time limits for
scheduling.
* Identify a methodology for tradeoff
analysis (shifting resources).
* Measure the project accomplishments
against the plans.
* Identify and resolve problems quickly.
* Improve estimation capabilities for
future planning.
* Keep track of meeting project
objectives.
Data warehouse project management is
different from traditional management in several ways. First, while
the evolution of data warehouse queries may be perpetual, the
initial creation of the warehouse is a finite activity, and the
project manager must be able to deal with this temporary authority
because other managers are performing the staffing functions,
supplying members of the data warehouse team. To further confound
matters, the data warehouse project manager does not have direct
control over the financial resources.
Effective Project Management
In general, there are two levels of project
management: top-level project management, which controls the overall
warehouse project, and functional management, which incorporates
everyone involved in the operational details associated with each
specific milestone of the project.
The size of a warehouse project does not
really impact how the project is modeled and controlled. While there
are numerous tools, such as PERT (Project Evaluation and Review
Technique), that can be used for very large data warehouse projects,
all warehouse projects are fundamentally the same; the only
variables are the number of sub-projects and the complexity of
integrating the sub-projects.
A very large warehouse project, such as
building an enterprise-wide data model for a large corporation, may
involve thousands of milestones and man-centuries of effort, but
they still maintain the fundamental nature of a warehouse project.
The issues are purely a matter of scale. However, it is comforting
to see that a data warehouse project, even with a man-century of
effort, is relatively small when compared to other projects such as
building an aircraft carrier which could consume the full-time
efforts of thousands of people for several years. The term
?man-century? refers to 100 years of labor, and is equivalent to 100
people working full-time for a year. Table 2.3 shows three levels
of project size. As you can see, a data warehouse project is
generally classified as a medium-size project.
Project Size |
Number of Tasks |
Project Duration |
Small |
Hundreds |
Months |
Medium |
Thousands |
Years |
Large |
Millions |
Decades |
Table 2.3 Levels of project size.
The management of an organization must not
underestimate the importance of effective project management, and
effective project management includes troubleshooting.
Troubleshooting usually involves at least one of the following three
categories:
* High Costs--Cost overruns stemming
primarily from improper allocation of human resources.
* Project Delays?Project delays are
often a result of wasted resources (i.e., materials, people and so
forth), which can cause the premature or late delivery of project
resources.
* Poor Quality--Poor quality occurs when
a project does not meet performance or functionality objectives.
In any case, special care must be taken to
ensure that project management avoids as many unforeseen problems as
possible. One of the best ways to ensure that a data warehouse
project is created soundly is for the warehouse team to prepare a
complete description of the project, clearly stating all project
requirements and expectations up front. This description is called a
scope of work agreement.
 |
If you like Oracle tuning, see the book "Oracle
Tuning: The Definitive Reference", with 950 pages of tuning
tips and scripts.
You can buy it direct from the publisher for 30%-off and get
instant access to the code depot of Oracle tuning scripts. |