Preparing Datasets for Data Mining Activities

Data warehouse tips by Burleson Consulting

This is an excerpt from Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

The Missing Values, Normalize, Numeric and Outlier Treatment wizards help prepare data before data mining algorithms are applied.  Most algorithms have a preferred method for handling missing values, normalization, and outliers, so in most data mining tasks you can let the Activity wizard take care of these steps.  When your data contains unusual anomalies, you can use these wizards to prepare it for analysis yourself.

Using the Missing Values Transformation Wizard

In the Missing Values Numerical Strategy, you can choose how missing values are replaced: None, Mean, Max, Min, or Custom Value.  Mean replaces a missing value with the average of the values for that attribute, Max replaces it with the maximum of the values, and Min replaces it with the minimum of the values.  The default custom value is zero; you can replace it with any appropriate value.
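The replacement strategies listed above can be illustrated with a minimal Python sketch. This is not the wizard's own code: the function name fill_missing, the strategy strings, and the sample list ages are hypothetical, and missing values are represented by None.

```python
from statistics import mean


def fill_missing(values, strategy="mean", custom_value=0):
    """Return a copy of values with None replaced according to strategy.

    strategy is one of "none", "mean", "max", "min" or "custom"
    (hypothetical names mirroring the wizard's choices).
    """
    if strategy == "none":
        return list(values)  # leave missing values untouched

    if strategy == "custom":
        replacement = custom_value  # zero by default, as in the wizard
    elif strategy in ("mean", "max", "min"):
        present = [v for v in values if v is not None]
        if not present:
            raise ValueError("no non-missing values to compute a replacement from")
        replacement = {"mean": mean, "max": max, "min": min}[strategy](present)
    else:
        raise ValueError(f"unknown strategy: {strategy}")

    return [replacement if v is None else v for v in values]


# Hypothetical attribute with two missing values.
ages = [40, None, 60, 50, None]

print(fill_missing(ages, "mean"))
print(fill_missing(ages, "max"))
print(fill_missing(ages, "custom", custom_value=0))
```

The replacement is computed only from the non-missing values, so the mean, maximum, and minimum are not distorted by the gaps they are meant to fill.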

If the value is NULL, you can drop the case entirely by specifying Drop attribute. 

The SQL statement below is generated automatically by the Missing Values Transformation Wizard for clinical patient data: missing values of attribute ACV_CODE are replaced with the mode ('E'), missing values of ANGINA_PROCEDURE are replaced with '99', and rows are dropped when ADULT_ASTHMA, BACTERIAL_PNEUMONIA, CHF, and COPD are NULL.














'E' , "ACV_CODE" ) "ACV_CODE" ,
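The treatment described above can also be sketched outside the database. The following is plain Python, not the wizard's generated SQL. Only the column names come from the text; the sample rows and the function name treat_missing are hypothetical. It also assumes that a row is dropped when any one of the four listed attributes is missing, and that the mode is computed from the non-missing ACV_CODE values of the input.

```python
from collections import Counter

# Attributes whose missing values cause the row to be dropped.
DROP_IF_MISSING = ("ADULT_ASTHMA", "BACTERIAL_PNEUMONIA", "CHF", "COPD")


def treat_missing(rows):
    """Apply the three treatments and return new rows (input is not modified)."""
    # Mode of ACV_CODE over the non-missing values.
    codes = [r["ACV_CODE"] for r in rows if r["ACV_CODE"] is not None]
    mode = Counter(codes).most_common(1)[0][0]

    treated = []
    for r in rows:
        # Assumption: drop the row if any of the listed attributes is missing.
        if any(r[col] is None for col in DROP_IF_MISSING):
            continue
        new_row = dict(r)
        if new_row["ACV_CODE"] is None:
            new_row["ACV_CODE"] = mode          # replace with the mode
        if new_row["ANGINA_PROCEDURE"] is None:
            new_row["ANGINA_PROCEDURE"] = "99"  # replace with a fixed value
        treated.append(new_row)
    return treated


# Hypothetical sample data; None stands for NULL.
rows = [
    {"ACV_CODE": "E", "ANGINA_PROCEDURE": "10", "ADULT_ASTHMA": 0,
     "BACTERIAL_PNEUMONIA": 0, "CHF": 1, "COPD": 0},
    {"ACV_CODE": None, "ANGINA_PROCEDURE": None, "ADULT_ASTHMA": 0,
     "BACTERIAL_PNEUMONIA": 0, "CHF": 0, "COPD": 0},
    {"ACV_CODE": "E", "ANGINA_PROCEDURE": "20", "ADULT_ASTHMA": None,
     "BACTERIAL_PNEUMONIA": 0, "CHF": 0, "COPD": 1},
    {"ACV_CODE": "N", "ANGINA_PROCEDURE": None, "ADULT_ASTHMA": 1,
     "BACTERIAL_PNEUMONIA": 0, "CHF": 0, "COPD": 0},
]

result = treat_missing(rows)
for row in result:
    print(row)
```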













Using the Normalize Transformation Wizard

The Normalize transform normalizes data using a predefined scheme, or you can select a transformation for any numeric attribute.  The available transformations include:

(x - MIN(x)) / (MAX(x) - MIN(x)) * (new max - new min) + new min

(x - AVG(x)) / SQRT(VARIANCE(x))

(x / MAX(ABS(MIN(x)), ABS(MAX(x))))

For example, if the {(x - MIN(x)) / (MAX(x) - MIN(x)) * (new max - new min) + new min} normalization scheme is chosen, for an average value 253.5, standard deviation 146.21, with minimum value = 1 and maximum value = 506, the transformed average value is 0.5, standard deviation 0.29, minimum value = 0 and maximum value = 1.

For the {(x - AVG(x)) / SQRT(VARIANCE(x))} scheme, the transformed values average 0.0 with standard deviation 1, minimum value = -1.73 and maximum = 1.73.

The {(x / MAX(ABS(MIN(x)), ABS(MAX(x))))} transformation results in an average of 0.5 with standard deviation 0.29, minimum value = 0 and maximum = 1.
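The figures quoted for the three schemes can be checked with a short Python sketch. It assumes the attribute takes each integer value from 1 to 506 exactly once, which is an assumption chosen because it matches the stated average (253.5), standard deviation (146.21), minimum (1) and maximum (506); the source does not say what the underlying data is. The function names are hypothetical, and the sample standard deviation is used, as Oracle's VARIANCE function computes the sample variance.

```python
from statistics import mean, stdev


def min_max(xs, new_min=0.0, new_max=1.0):
    """(x - MIN(x)) / (MAX(x) - MIN(x)) * (new max - new min) + new min"""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) * (new_max - new_min) + new_min for x in xs]


def z_score(xs):
    """(x - AVG(x)) / SQRT(VARIANCE(x)), using the sample variance"""
    avg, sd = mean(xs), stdev(xs)
    return [(x - avg) / sd for x in xs]


def scale(xs):
    """x / MAX(ABS(MIN(x)), ABS(MAX(x)))"""
    denom = max(abs(min(xs)), abs(max(xs)))
    return [x / denom for x in xs]


def summary(xs):
    """Return (average, standard deviation, minimum, maximum), rounded to 2 places."""
    return tuple(round(v, 2) for v in (mean(xs), stdev(xs), min(xs), max(xs)))


# Assumed data: the integers 1 through 506.
data = list(range(1, 507))

print("original:", summary(data))
print("min-max: ", summary(min_max(data)))
print("z-score: ", summary(z_score(data)))
print("scale:   ", summary(scale(data)))
```

Under this assumption the sketch reproduces the rounded values given in the text. The scale transformation's minimum is 1/506, which only rounds to 0, so the min-max and scale schemes coincide here to two decimal places but are not identical.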


Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
