Running a Data Mining Activity

Data warehouse tips by Burleson Consulting

This is an excerpt from Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

While the Activity runs, the status bar shows which steps in sequence are being completed.  As shown in the activity screen, all steps from discretize to test metrics have been successful. 

Even though the data may already be binned, the algorithm will take any further steps needed to ensure that numerical data and categorical data are divided into appropriate bins.  The dataset is then separated into training and validation sets by random selection of cases. 
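The two preparation steps described above, binning and random separation into training and validation sets, can be illustrated with a minimal Python sketch. It is not what Oracle Data Miner runs internally: the equal-width binning strategy, the 60/40 split, the fixed seed, the sample ages and the function names are all assumptions chosen for illustration.

```python
import random

def equal_width_bins(values, n_bins):
    """Assign each numeric value to one of n_bins equal-width bins (0-based)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    # The maximum value would compute to bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]

def random_split(cases, train_fraction=0.6, seed=42):
    """Shuffle a copy of the cases and cut it into training and validation lists."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical ages standing in for one numeric column of the build data.
ages = [18, 25, 31, 40, 47, 52, 66, 73, 80, 90]
bins = equal_width_bins(ages, n_bins=4)

# Split the case indexes, not the values, so every case lands in exactly one set.
train, validation = random_split(range(len(ages)), train_fraction=0.6)
```

Because the split is random, a different seed puts different cases in each set, which is why the results can vary from one run to the next.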

We will construct a classifier using the training dataset, and apply it to the validation set.  Test metrics are summarized and written in the result section. 

Viewing your Results

Now that we have built the classification model, we'll take a look at the results. 

Click Result in the test metric step and view the Confusion Matrix.  Because the dataset is small and the training data is randomly selected, your results may differ slightly if you run this analysis yourself. 

The first tab in the results window indicates that the Predictive Confidence is "Good" in comparison to the naïve model. 

A very simple method for classifying customers is to assign every record to the majority class.  In our dataset MINING_DATA_BUILD_V_US, the majority of cases (74.18%) have AFFINITY_CARD = 0. 

Ignoring all the predictor information that we have, the naïve rule would classify all customers as not having an affinity card because only 25.82% have a card. 

The naïve rule is commonly used as a baseline for evaluating the performance of classification models.  The predictive confidence of 57.83% indicates that the model we built is about 58% better than the naïve rule. 
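As a rough illustration of the naïve rule, the Python sketch below finds the majority class and the accuracy obtained by always predicting it. The 10,000-row label list is hypothetical, built only to reproduce the 74.18% / 25.82% class shares quoted above, and the function name is an assumption. The sketch does not attempt to reproduce how ODMr derives its Predictive Confidence figure.

```python
from collections import Counter

def naive_rule(labels):
    """Return the majority class and the accuracy of always predicting it."""
    majority_class, count = Counter(labels).most_common(1)[0]
    return majority_class, count / len(labels)

# Hypothetical sample of 10,000 AFFINITY_CARD values matching the quoted shares.
labels = [0] * 7418 + [1] * 2582

majority, baseline_accuracy = naive_rule(labels)
print(f"Always predict {majority}: accuracy {baseline_accuracy:.2%}")
```

Any useful classifier has to beat this baseline, which is reached without looking at a single predictor column.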

The Accuracy tab takes us to the classification matrix, also called the confusion matrix, where the model is applied to the hold-out test sample.  Click the More Detail button to view the confusion matrix.  The columns are the predictions made by the classification model and the rows are the actual data. 

As we see, the overall accuracy of the model is 77%: 312 customers who did not have an affinity card were correctly classified as not having one, and 26 who did have a card were misclassified as not having one.  Similarly, 105 cases were accurately classified as having a card, and 127 were misclassified as having one.  The cases that the model misclassified are the false-negative and false-positive predictions. 
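The four counts quoted above can be laid out as a confusion matrix and summarized with a short Python sketch. The mapping of each count to a cell is an assumption based on the wording of the text (rows are actual values, columns are predictions), and the variable and function names are illustrative. The simple ratio of correct predictions over these four counts is 417 of 570, about 73%, so the 77% headline figure presumably reflects a different calculation or a different run.

```python
# Counts quoted in the text, keyed as (actual, predicted); cell mapping is assumed.
confusion = {
    (0, 0): 312,  # actual no card, predicted no card (true negatives)
    (0, 1): 127,  # actual no card, predicted card (false positives)
    (1, 0): 26,   # actual card, predicted no card (false negatives)
    (1, 1): 105,  # actual card, predicted card (true positives)
}

def per_class_accuracy(matrix, actual):
    """Share of cases of one actual class that the model predicted correctly."""
    row_total = sum(n for (a, _), n in matrix.items() if a == actual)
    return matrix[(actual, actual)] / row_total

total = sum(confusion.values())
correct = confusion[(0, 0)] + confusion[(1, 1)]
overall = correct / total                             # simple accuracy
no_card_accuracy = per_class_accuracy(confusion, 0)   # accuracy on the 0 row
card_accuracy = per_class_accuracy(confusion, 1)      # accuracy on the 1 row

print(f"overall {overall:.1%}, no card {no_card_accuracy:.1%}, card {card_accuracy:.1%}")
```

Reporting accuracy per actual class is useful here because the classes are unbalanced: a model can score well overall while missing most of the cardholders.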

Oracle Data Mining Lift Curve

The "Lift" tab offers two graphical views of the results: the cumulative lift chart and the cumulative positive cases chart.

We want our classification model to sift through the records and sort them according to which customers are more likely to respond to our mailing. 

The lift curve, also called a gains curve or gains chart, is a popular technique in direct marketing. 

The lift curve will help us discover how to effectively "skim the cream" of our mailing list so we pick the smallest number of cases with the greatest probability of answering our mailing campaign. 

ODMr applies the lift model to the test data, sorts the predicted results by probability, divides the ranked list into 10 equal parts (quantiles), and counts the actual positive values in each quantile. 

The test results indicate that if we take the top 40%, we will have at least twice the response expected from random sampling.  A good classifier will give us a high lift when we mail only a few customers. 
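The lift procedure described above (score, sort by probability, cut into 10 quantiles, count the actual positives) can be sketched in Python as follows. The 20 scored cases are hypothetical, and the function is an illustrative stand-in, not ODMr's implementation. It assumes there are at least as many cases as quantiles.

```python
def cumulative_lift(scored_cases, n_quantiles=10):
    """Cumulative lift per quantile for (predicted_probability, actual_label) pairs."""
    # Rank cases from most to least likely to respond.
    ranked = sorted(scored_cases, key=lambda case: case[0], reverse=True)
    total_cases = len(ranked)
    overall_rate = sum(label for _, label in ranked) / total_cases

    lifts = []
    for q in range(1, n_quantiles + 1):
        # Take the top q quantiles of the ranked list.
        top = ranked[: total_cases * q // n_quantiles]
        positives = sum(label for _, label in top)
        # Lift is the response rate in the top slice over the overall response rate.
        lifts.append((positives / len(top)) / overall_rate)
    return lifts

# Hypothetical test set: 20 cases, 6 actual responders, probabilities descending.
labels = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
scored = [(1 - i / 20, label) for i, label in enumerate(labels)]

lifts = cumulative_lift(scored)
for q, lift in enumerate(lifts, start=1):
    print(f"top {q * 10:3d}%: cumulative lift {lift:.2f}")
```

In this made-up sample the top 40% of the ranked list holds 5 of the 6 responders, a cumulative lift of a little over 2, and the lift falls to exactly 1 once the whole list is mailed.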
 

For more tips and tricks for Oracle data warehouse analysis, see Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse"

You can buy it direct from the publisher for 30%-off:

http://www.rampant-books.com/book_2006_1_oracle_data_mining.htm


 

 


Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
