Call now: 252-767-6166  
Oracle Training Oracle Support Development Oracle Apps

 
 Home
 E-mail Us
 Oracle Articles
New Oracle Articles


 Oracle Training
 Oracle Tips

 Oracle Forum
 Class Catalog


 Remote DBA
 Oracle Tuning
 Emergency 911
 RAC Support
 Apps Support
 Analysis
 Design
 Implementation
 Oracle Support


 SQL Tuning
 Security

 Oracle UNIX
 Oracle Linux
 Monitoring
 Remote s
upport
 Remote plans
 Remote
services
 Application Server

 Applications
 Oracle Forms
 Oracle Portal
 App Upgrades
 SQL Server
 Oracle Concepts
 Software Support

 Remote S
upport  
 Development  

 Implementation


 Consulting Staff
 Consulting Prices
 Help Wanted!

 


 Oracle Posters
 Oracle Books

 Oracle Scripts
 Ion
 Excel-DB  

Don Burleson Blog 


 

 

 


 

 

 

 

 

Customizing Discretize Transformations

Data warehouse tips by Burleson Consulting

This is an excerpt from Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

To customize AGE cut-offs, choose the Define button.  Finding it easier to edit one of the default settings, choose Generate Default Bins, and enter ?4? as the number of bins.  You may have an Application Warning appear that tells you ?3? is a better binning choice; click OK to proceed. 

You can edit the lower bounds for each of the 4 bins listed, so let?s make the bins <25, 25-40, 40-50, and >50 by typing in 25, 40, and 50.  The resulting histogramshows that there are a disproportionate number of cases in the 25-40 range (41%). 

Although not optimal for building a model, you may need to use this binning strategy for your business application.  By changing the lower boundary to 30 the histogramis more uniformly distributed with each of the bins having 29, 28, 23, and 21% of the total number of cases, respectively. 

In Step 5 of 5 of the Discretize Transformation Wizard, you can change the bins of the categorical attributes.  For instance, to change the CUST_MARITAL_STATUS attribute, highlight the name and click Define. 

A window with existing strategies (if any) for that attribute appears, showing the Bin Category and Value.  You can add definitions by choosing Add, or for a shortcut choose ?Generate Default Bins?.  Keep the default number of bins = 10 and say OK to the warning that there are fewer than 10 bins in the case data if you choose the generate option.  The bin categories are now shown, with Married, NeverM, Divorc, Separ, Widowed, Mabsent, and Mar-AF corresponding to the category and values. 

Let?s say that we decide to re-bin the attribute to Married and Not_Married.  Highlight Married and click the Edit button.  The Married bin category corresponds to Married, so choose Divorc, Mabsent, Mar_AF, Separ to add to the Married category.  Click OK to complete the Married category.  Now there are 3 categories:  NeverM, Widowed, and Married.  Highlight NeverM and click Edit.  Re-name NeverM in the Bin Category to Not_Married, and add Widowed to the selected Bin Values. 

After clicking OK, you?ll see that the bin categories are now Married with values Divorc, Mabsent, Mar-AF, Married, and Separ; Not_Married with the values NeverM and Widowed.  Click OK and Next to finish. 

Right click on the new view created, choose Show Summary Single-Record, and view the histograms for AGE and CUST_MARITAL_STATUS. 

Now, we right click the new view and choose ?Show Lineage? to see details of the disretizetransformation. 

The wizard used a CASE and DECODE statement to create the bins.

SELECT 

"AFFINITY_CARD",( CASE WHEN "AGE" < 30 THEN ' < 30'

WHEN "AGE" <= 40 THEN '30 - 40'

WHEN "AGE" <= 50 THEN '40 - 50'

WHEN "AGE" > 50 THEN ' > 50'

else null end)  "AGE", "BOOKKEEPING_APPLICATION", "BULK_PACK_DISKETTES", "COUNTRY_NAME", "CUST_GENDER", "CUST_ID", "CUST_INCOME_LEVEL", DECODE ("CUST_MARITAL_STATUS"

,'Divorc.','Married'

,'Mabsent','Married'

,'Mar-AF','Married'

,'Married','Married'

,'Separ.','Married'

,'NeverM','Not_Married'

,'Widowed','Not_Married'

,NULL,NULL,'other') "CUST_MARITAL_STATUS", "EDUCATION", "FLAT_PANEL_MONITOR", "HOME_THEATER_PACKAGE", "HOUSEHOLD_SIZE", "OCCUPATION", "OS_DOC_SET_KANJI",

"PRINTER_SUPPLIES",( CASE WHEN "YRS_RESIDENCE" < 3 THEN ' < 3'

WHEN "YRS_RESIDENCE" <= 5 THEN '3 - 5'

WHEN "YRS_RESIDENCE" > 5 THEN ' > 5'

else null end)  "YRS_RESIDENCE", "Y_BOX_GAMES"

FROM "DMUSER_BOOK"."MINING_DATA_BUILD_V_US"

Using the Discretizewizard gives you the ability to easily create new attributes and bin the data to your liking.  The view created is used as the case table for data mining activities.  How many bins do you choose when doing the data prep depends on the nature of the data you are processing.  It is sometimes easy to categorize variables into ?High?, ?Medium? and ?Low? to simplify interpretation of the data mining results. 

 

For more tips and tricks for Oracle data warehouse analysis, see Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse"

You can buy it direct from the publisher for 30%-off:

http://www.rampant-books.com/book_2006_1_oracle_data_mining.htm


 

 
��  
 
 
Oracle Training at Sea
 
 
 
 
oracle dba poster
 

 
Follow us on Twitter 
 
Oracle performance tuning software 
 
Oracle Linux poster
 
 
 

 

Burleson is the American Team

Note: This Oracle documentation was created as a support and Oracle training reference for use by our DBA performance tuning consulting professionals.  Feel free to ask questions on our Oracle forum.

Verify experience! Anyone considering using the services of an Oracle support expert should independently investigate their credentials and experience, and not rely on advertisements and self-proclaimed expertise. All legitimate Oracle experts publish their Oracle qualifications.

Errata?  Oracle technology is changing and we strive to update our BC Oracle support information.  If you find an error or have a suggestion for improving our content, we would appreciate your feedback.  Just  e-mail:  

and include the URL for the page.


                    









Burleson Consulting

The Oracle of Database Support

Oracle Performance Tuning

Remote DBA Services


 

Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.

Remote Emergency Support provided by Conversational