Adaptive Bayes Network Single Feature Model

Data warehouse tips by Burleson Consulting

This is an excerpt from Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

But first, we’ll build a new Adaptive Bayes Network model. Instead of the Single Feature model type we used previously, choose Multi Feature, maximize the Overall Accuracy, set the target attribute for forest cover = 3, and sample 20,000 rows.

Multi Feature will not give us rules as before. Instead, it builds multiple features, each one added to improve the model, and results in a more effective model.

The new model has improved the overall accuracy from 21% to 45%, and correctly classified 42% of the ponderosa pines, while keeping the overall accuracy around 60%.  Lower cost is an indicator of improved prediction, and the cost of the Multi Feature model is 11804 compared to 13454 in the Single Feature model.

By changing the type of model used to classify our data, we were able to influence the predictive accuracy of our model.  As mentioned earlier, there are two ways to nudge the model into producing different results that we might be more interested in seeing. 

One method is to introduce cost bias into the model build, as described in Chapter One.  To review, go to the ROC tab in the result viewer of your Mining Activity.

The Receiver Operating Characteristic (ROC) metric shows the change in probability given a modification to the Cost Matrix.  We want to predict more of the ponderosa pines and avoid false negative predictions.  Under the ROC curve, there are two boxes labeled False Positive Cost and False Negative Cost.  Type “3” in the False Negative Cost box, telling the model that a false negative is three times more costly than a false positive, and click Compute Cost.  Note that the red line jumps to the right, and in the detail section the line with probability threshold 0.216 is highlighted.  The confusion matrix changes to show that there are 22 false negatives, 200 false positives, 215 true positives and 3556 true negatives.
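To see what those four counts mean, we can work the arithmetic by hand. The following is a minimal Python sketch, not ODMr code: the function names are made up for illustration, and the weighted cost shown is the simple "errors times unit cost" sum, which is an assumption and may not match the cost figure ODMr reports.

```python
# Counts reported in the text after setting False Negative Cost to 3.
fn, fp, tp, tn = 22, 200, 215, 3556


def weighted_cost(fn, fp, fn_cost=1.0, fp_cost=1.0):
    """Sum of errors, each weighted by its unit cost (illustrative only)."""
    return fn * fn_cost + fp * fp_cost


def roc_rates(fn, fp, tp, tn):
    """True positive rate and false positive rate, the two axes of a ROC curve."""
    tpr = tp / (tp + fn)  # share of actual positives that were found
    fpr = fp / (fp + tn)  # share of actual negatives wrongly flagged
    return tpr, fpr


cost = weighted_cost(fn, fp, fn_cost=3.0, fp_cost=1.0)
tpr, fpr = roc_rates(fn, fp, tp, tn)

print(f"weighted cost       = {cost:.0f}")   # 22*3 + 200*1 = 266
print(f"true positive rate  = {tpr:.3f}")    # 215 / 237
print(f"false positive rate = {fpr:.3f}")    # 200 / 3756
```

At this threshold roughly nine out of ten ponderosa pines are found, at the price of flagging about 5% of the other cases incorrectly.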

To modify the model test results, return to the Mining Activity and click on Select ROC Threshold in the Test Metrics section.  The costs for False Positive and False Negative are assumed to be equal and are set to 1 by default.

Now change the setting from a probability of 0.5 to 0.216 and notice that the false negative cost is now 3.63.  Click OK to save the settings and see that the ROC Threshold has changed.  The new cost bias will be used when the model is applied to a dataset.
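The 0.216 and 3.63 figures fit together under the usual cost-based decision rule, where a case is called positive once its probability exceeds false positive cost / (false positive cost + false negative cost). The sketch below assumes that rule; the text does not state that ODMr computes the value this way, and the function names are hypothetical.

```python
def threshold_from_costs(fn_cost, fp_cost=1.0):
    """Probability threshold that minimizes expected cost for the given unit costs."""
    return fp_cost / (fp_cost + fn_cost)


def fn_cost_from_threshold(threshold, fp_cost=1.0):
    """Inverse of the rule above: the false negative cost implied by a threshold."""
    return fp_cost * (1.0 - threshold) / threshold


def classify(prob_positive, threshold=0.5):
    """Label a case positive when its predicted probability reaches the threshold."""
    return prob_positive >= threshold


print(threshold_from_costs(1.0))                  # equal costs give 0.5
print(round(fn_cost_from_threshold(0.216), 2))    # 0.784 / 0.216, about 3.63

# A case scored at 0.30 is negative at the default threshold
# but positive once the threshold drops to 0.216.
print(classify(0.30, threshold=0.5), classify(0.30, threshold=0.216))
```

Under the same rule a false negative cost of exactly 3 would correspond to a threshold of 0.25, so 0.216 is presumably the nearest threshold row available in the ROC detail table.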

We have now built two types of Adaptive Bayes Network models, the Single Feature and the Multi Feature.  There is one more type in ODMr, the Pruned Naïve Bayes, which is very similar to the Naïve Bayes that we used in Chapter One, although the results will not be exactly what you’d get using the Naïve Bayes Classification model directly.  In the next section we’ll look at Decision Trees, which, like the Adaptive Bayes Network Single Feature model, give rules for determining how cases are classified.

In many types of data, the target values may comprise only a very small percentage of cases.  In hospital data, for example, preventable hospitalizations are somewhat rare events and account for a very small percentage of all admissions.  Likewise, of all the users hitting a website, only a small number are hackers with malicious intent.  A classification model built on datasets containing only a few known cases will not be able to discriminate very effectively between the two classes.

The model may in fact predict that no cases are preventable, or are hacker attempts, and it will be 98 to 100% correct!  However, we really have not learned anything about these cases, and the model is not very effective.
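A few lines of Python make the point concrete. The data here is invented for illustration (1,000 cases, 2% of them rare), and the "model" simply predicts the common class every time.

```python
def accuracy_and_recall(actual, predicted):
    """Overall accuracy, plus recall on the rare class (label 1)."""
    pairs = list(zip(actual, predicted))
    correct = sum(a == p for a, p in pairs)
    true_pos = sum(a == 1 and p == 1 for a, p in pairs)
    positives = sum(actual)
    return correct / len(actual), true_pos / positives


# Hypothetical dataset: 20 rare cases out of 1,000.
actual = [1] * 20 + [0] * 980
# A model that never predicts the rare class.
predicted = [0] * 1000

acc, rec = accuracy_and_recall(actual, predicted)
print(f"accuracy = {acc:.1%}, recall on rare class = {rec:.1%}")
```

The accuracy comes out at 98% while not one of the rare cases is found, which is why accuracy alone is a poor yardstick for this kind of data.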

What we want to do is use case data for the model build that has approximately equal numbers of positive and negative cases.  However, the algorithm will take this distribution as if it were realistic, so we need to supply the actual distribution of target values, called the Prior Distribution (Priors), so the build process will result in more meaningful models.  The ODMr classification models will use Priors when you specify stratified sampling in the Mining Activity advanced settings for sampling.
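To get a feel for why the Priors matter, here is a sketch of the textbook Bayes correction for a model trained on a rebalanced sample. This is an assumption about the general idea only: the text does not describe how ODMr applies Priors internally, and the function name and the 2% figure are hypothetical.

```python
def adjust_for_priors(p_balanced, train_prior, true_prior):
    """Rescale a predicted probability from the training mix to the real-world mix.

    p_balanced  -- probability of the positive class from the model
    train_prior -- share of positives in the (stratified) build data
    true_prior  -- share of positives in the actual population
    """
    pos = p_balanced * true_prior / train_prior
    neg = (1.0 - p_balanced) * (1.0 - true_prior) / (1.0 - train_prior)
    return pos / (pos + neg)


# Model built on a 50/50 sample, but only 2% of real cases are positive.
for p in (0.5, 0.9, 0.99):
    adjusted = adjust_for_priors(p, train_prior=0.5, true_prior=0.02)
    print(f"model says {p:.2f} -> adjusted {adjusted:.3f}")
```

A score of 0.5 from the balanced model falls back to the 2% base rate, and only very confident scores remain high after the adjustment.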

Next, we’ll try this using the Decision Tree classification model. 
 

For more tips and tricks for Oracle data warehouse analysis, see Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse"

You can buy it direct from the publisher for 30%-off:

http://www.rampant-books.com/book_2006_1_oracle_data_mining.htm


 

 

  
 


Copyright © 1996 -  2009 by Burleson Enterprises, Inc. All rights reserved.

Oracle © is the registered trademark of Oracle Corporation.