Multivariate Data Mining
Oracle Tips by Burleson Consulting
High Performance Data Warehousing
Problem Solving For The Oracle Warehouse
Multivariate Analysis
This type of analysis includes comparing various classifications of
database objects. Multivariate analysis involves the use of
chi-square statistical techniques to compare ranges of values among
different classifications of objects. For example, a supermarket may
keep a data warehouse of each customer transaction. Multivariate
techniques are used to search for correlations among items, so the
supermarket management can place these correlated items nearby on
the shelves. One supermarket found that whenever males purchased
diapers, they were also likely to buy a six-pack of beer! As a
consequence, the supermarket was able to increase sales of beer by
placing a beer display between the diaper displays and the checkout
lines.
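As a rough illustration of the chi-square comparison described above, the following Python sketch tests whether two items appear together in baskets more often than independence would predict. The basket counts, the item names, and the helper `chi_square_2x2` are all made up for illustration; they are not taken from any real supermarket warehouse.

```python
from collections import Counter

def chi_square_2x2(transactions, item_a, item_b):
    """Pearson chi-square statistic for independence of two items across baskets."""
    # Count baskets by (contains item_a, contains item_b).
    counts = Counter((item_a in t, item_b in t) for t in transactions)
    n = len(transactions)
    observed = [
        [counts[(True, True)], counts[(True, False)]],
        [counts[(False, True)], counts[(False, False)]],
    ]
    row_totals = [sum(row) for row in observed]
    col_totals = [observed[0][j] + observed[1][j] for j in range(2)]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("each item must be present in some baskets and absent from others")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the two items were purchased independently.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Hypothetical data: 100 baskets, invented for this example.
baskets = (
    [{"diapers", "beer"}] * 30
    + [{"diapers"}] * 10
    + [{"beer"}] * 20
    + [{"milk"}] * 40
)

chi2 = chi_square_2x2(baskets, "diapers", "beer")
# With 1 degree of freedom, the 5% critical value is about 3.84.
print(f"chi-square = {chi2:.2f}, significant at 5%: {chi2 > 3.84}")
```

A statistic above the critical value only says the two purchases are not independent in this sample; it says nothing about why they occur together.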
Multivariate analysis is normally used when the answers to the query
are unknown and is commonly associated with data mining and neural
networks. For example, whereas a statistical analysis may query to
see what the correlation is between customer age and probability of
diaper purchases when the end user suspects a correlation,
multivariate analysis is used when the end user does not know what
correlation may exist in the data. A perfect example of multivariate
analysis can be seen in the analysis of the Minnesota Multiphasic
Personality Inventory (MMPI) database. MMPI is one of the most
popular psychological tests in America, and millions of Americans
have taken this exam. By comparing psychological profiles of
subjects with diagnosed disorders to their responses to the exam
questions, psychologists have been able to generate unobtrusive
questions that are very highly correlated with a specific mental
illness. One example question relates to a subject’s preference to
take showers versus baths. Answers to this question are very highly
correlated with the MMPI’s measure for self-esteem. (It turns out
that shower-takers tend to have statistically higher self-esteem
than bath-takers.)
Note that the users of this warehouse do not seek answers about why
the two factors are correlated; they simply look for statistically
valid correlations. This approach has made the MMPI one of the most
intriguing psychological tests in use today; from a respondent’s
answers to 500 seemingly innocuous True/False questions,
psychologists can gain considerable insight into the respondent’s
personality.
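The exploratory style described here, where the analyst does not know in advance which question will correlate with a diagnosis, can be sketched as a scan over every question. The example below ranks True/False questions by the phi coefficient (the Pearson correlation for two binary variables) against a diagnosis flag. The question labels, the responses, and the helper names are hypothetical and are not drawn from the MMPI.

```python
from math import sqrt

def phi_coefficient(xs, ys):
    """Phi coefficient between two equal-length sequences of True/False values."""
    pairs = list(zip(xs, ys))
    n11 = sum(1 for x, y in pairs if x and y)
    n10 = sum(1 for x, y in pairs if x and not y)
    n01 = sum(1 for x, y in pairs if not x and y)
    n00 = sum(1 for x, y in pairs if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant column has no defined correlation; report 0.0 for it.
    return 0.0 if denom == 0 else (n11 * n00 - n10 * n01) / denom

def rank_questions(responses, diagnosed):
    """Rank every question by the strength of its correlation with the diagnosis."""
    scores = {q: phi_coefficient(answers, diagnosed) for q, answers in responses.items()}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Hypothetical data: eight subjects, three invented questions.
diagnosed = [True, True, True, True, False, False, False, False]
responses = {
    "q1": [True, True, True, False, False, False, False, True],
    "q2": [True, False, True, False, True, False, True, False],
    "q3": [False, False, False, False, True, True, True, True],
}

ranked = rank_questions(responses, diagnosed)
for question, phi in ranked:
    print(f"{question}: phi = {phi:+.2f}")
```

Ranking by absolute value keeps strongly negative correlations at the top as well, since a question that diagnosed subjects consistently answer False is as informative as one they consistently answer True. When hundreds of questions are scanned this way, some will correlate by chance alone, so candidates found by such a scan would normally be checked against fresh data.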
This is an excerpt from "High Performance Data Warehousing",
copyright 1997.