Viewing Adaptive Bayes Network Results
Data warehouse tips by Burleson Consulting
This is an excerpt from Dr. Ham's premier book "Oracle
Data Mining: Mining Gold from your Warehouse".
After completion of the Mining Activity,
we click on “Result” under “Test Metrics”. Since there were seven possible
outcomes of the target attribute, under the Accuracy tab you can see the
percentage of correctly classified cases for each value of forest cover.
As we see, the best predictions were made
for Target value = 5 (aspen), with 93% correctly classified. The model
misclassified all cases of ponderosa pine, with Target value = 3.
The Adaptive Bayes Network model seems to
be good at predicting scarcer types of forest covers.
By default, the build settings look for a model that is good at
predicting all classes by optimizing the Maximum Average Accuracy of
the model, which is 60% in this example. You can instead choose to
build a model that maximizes overall accuracy, but generally you’ll
want one that attempts to classify all the classes.
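
The difference between the two targets can be illustrated with a small sketch. It assumes that average accuracy means the unweighted mean of the per-class accuracies, which may not match exactly how Oracle Data Miner computes Maximum Average Accuracy. The function names and the toy labels are invented for illustration.

```python
from collections import Counter


def per_class_accuracy(actual, predicted):
    """Fraction of correctly classified cases for each actual class."""
    totals = Counter(actual)
    correct = Counter(a for a, p in zip(actual, predicted) if a == p)
    return {label: correct[label] / totals[label] for label in totals}


def average_accuracy(actual, predicted):
    """Unweighted mean of the per-class accuracies (assumed definition)."""
    scores = per_class_accuracy(actual, predicted)
    return sum(scores.values()) / len(scores)


def overall_accuracy(actual, predicted):
    """Fraction of all cases classified correctly, regardless of class."""
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


# Toy labels (invented): eight cases of class 1, two of the scarcer class 5.
actual = [1, 1, 1, 1, 1, 1, 1, 1, 5, 5]
# A model that always predicts the majority class.
predicted = [1] * 10

print(overall_accuracy(actual, predicted))    # 0.8
print(per_class_accuracy(actual, predicted))  # {1: 1.0, 5: 0.0}
print(average_accuracy(actual, predicted))    # 0.5
```

In this toy case the model looks good on overall accuracy yet never finds the scarce class, which is why optimizing the average across classes is usually the better choice when every class matters.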
What has the model used to decide which
forest cover to predict? To view the human-readable rules, click on Result in
the Build step of the Mining Activity,
and look under the Rules tab.
Here you see that the model used wilderness
area and elevation to classify forest cover. Reading over the rules, you can
see common elements for the various types of trees. For example, in rules #70
and #75 the wilderness area can be either 1 or 0, and if the elevation is in
bin 4, then the predicted forest cover is 1.
Interpreting Adaptive Bayes Network Results
The Adaptive Bayes Network model was very
good at predicting forest cover = 7, and you can see from rules #76 and #71
that this type of tree grows outside the wilderness area at elevation = 5. To
see which elevations are grouped in bin 5, we can return to the Show Summary
Single-Record and examine the histogram for Elevation. Using the Equal Width
strategy, bin 5 contains elevations from 2858.5 to 3058.4 feet.
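
Equal-width binning itself is easy to sketch. The example below is a generic illustration, not Oracle Data Miner's implementation: the elevation range of 2000 to 3000 feet and the five bins are hypothetical values chosen so that the arithmetic is easy to follow, and the 1-based bin numbering is an assumption.

```python
def equal_width_bin(value, lo, hi, n_bins):
    """Return the 1-based bin number for value, using n_bins equal-width bins."""
    if not lo <= value <= hi:
        raise ValueError("value lies outside the binned range")
    width = (hi - lo) / n_bins
    index = int((value - lo) / width)
    # The maximum value would land one past the last bin, so clamp it.
    return min(index, n_bins - 1) + 1


def bin_edges(bin_number, lo, hi, n_bins):
    """Return the (lower, upper) edges of a 1-based bin number."""
    width = (hi - lo) / n_bins
    return lo + (bin_number - 1) * width, lo + bin_number * width


# Hypothetical range and bin count, for illustration only.
LO, HI, N_BINS = 2000, 3000, 5

print(equal_width_bin(2900, LO, HI, N_BINS))  # 5
print(bin_edges(5, LO, HI, N_BINS))           # (2800.0, 3000.0)
```

Every bin has the same width, (hi - lo) / n_bins, regardless of how many cases fall into it, so some bins may hold far more cases than others.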
The Support (%) for a rule is the percentage of cases in the build dataset
having the predicted target value. For rule #73, the percent confidence is high
at 95%, but support is low at 2.5%, indicating that this rule provides a marked
improvement in accuracy but applies to only a few
cases. When a single-feature model is applied to another dataset, the output of
the apply activity identifies the rule used to predict the classification result
for each case.
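
To make the relationship between confidence and support concrete, here is a minimal sketch. It assumes the definitions commonly used in rule mining: support is the share of all build cases that satisfy the rule's condition and have the predicted target, and confidence is the share of cases satisfying the condition that have the predicted target. Oracle Data Miner's exact calculation may differ. The column names and rows are invented, and the toy data is constructed to reproduce the 95% and 2.5% figures.

```python
def rule_metrics(rows, condition, target_value, target_key="target"):
    """Return (support %, confidence %) of a rule over a list of dict rows."""
    matching = [r for r in rows if condition(r)]
    correct = [r for r in matching if r[target_key] == target_value]
    support = 100 * len(correct) / len(rows)
    confidence = 100 * len(correct) / len(matching) if matching else 0.0
    return support, confidence


# Invented build data: 760 rows, of which 20 satisfy the rule's condition.
rows = (
    [{"wilderness": 0, "elevation_bin": 5, "target": 7}] * 19
    + [{"wilderness": 0, "elevation_bin": 5, "target": 2}]
    + [{"wilderness": 1, "elevation_bin": 3, "target": 1}] * 740
)


def condition(row):
    """Hypothetical rule: outside the wilderness area, elevation in bin 5."""
    return row["wilderness"] == 0 and row["elevation_bin"] == 5


support, confidence = rule_metrics(rows, condition, target_value=7)
print(support, confidence)  # 2.5 95.0
```

The rule is right 19 times out of the 20 cases it covers, yet those 19 cases are a small slice of the 760 rows, which is the high-confidence, low-support pattern described above.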
The rules for this dataset are quite simple,
having only two attributes in the If condition, because we used a small sample
(10,000 rows) from the case dataset. In larger datasets, the rules can become
very complex. The optimal number of rows of data depends on the nature of the
data, and can be as high as 100,000 records. As a general rule of thumb, you
should not expect meaningful rules unless the case data is over 20,000 rows.
Let’s say that we were really interested in
classifying ponderosa pine (Target value = 3), which this first model completely
misclassified. How do we influence the model to detect this type of forest
cover? We have the choice of two different methods, Priors and Costs, which
will build bias into our model.