Comparing Data Sub-Sets with K-Means

Data warehouse tips by Burleson Consulting

This is an excerpt from Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

Although the customers in these clusters make up only a small percentage of the whole sample, insight into how subsets of cases behave in relation to others may help target areas where taking action could substantially affect overall business practice.

K-Means gives you the rules for deriving each cluster, so you can apply those rules to another dataset.  For example, the rule for Cluster #16 is shown below.  The algorithm provides probabilistic scoring and reports the confidence and support for each rule.  Note that the rule is written in the form IF A THEN B, where A is the set of attribute conditions and B is the cluster assignment (Cluster = 16).

The confidence of the rule is the conditional probability of B given A, P(B | A): the fraction of cases satisfying A that also fall in B.  Support for a rule is an estimate of the number of cases in the training dataset for which the rule is true.  For Cluster #16, the confidence is 82% and the support estimate is 69 cases.
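The two measures can be sketched in a few lines of Python. This is only an illustration of the definitions above, not how Oracle Data Miner computes its estimates. The function name `rule_stats` and the toy data are hypothetical, and the figure of 84 cases matching A is an assumption chosen solely because 69 / 84 reproduces the reported 82.14% confidence; it does not come from the book.

```python
def rule_stats(cases, antecedent, cluster_id):
    """Return (confidence, support) for 'IF antecedent THEN cluster = cluster_id'.

    cases      -- iterable of (record, assigned_cluster) pairs
    antecedent -- function taking a record and returning True when A holds
    """
    matches_a = 0        # cases where A holds
    matches_a_and_b = 0  # cases where A holds and the case is in the cluster
    for record, assigned in cases:
        if antecedent(record):
            matches_a += 1
            if assigned == cluster_id:
                matches_a_and_b += 1
    # Confidence is P(B | A); support is the count of cases where the rule is true.
    confidence = matches_a_and_b / matches_a if matches_a else 0.0
    return confidence, matches_a_and_b


# Hypothetical training data: 84 cases satisfy A (assumed), 69 of them in cluster 16.
cases = (
    [({"satisfies_a": True}, 16)] * 69
    + [({"satisfies_a": True}, 3)] * 15
    + [({"satisfies_a": False}, 16)] * 10
)
confidence, support = rule_stats(cases, lambda r: r["satisfies_a"], 16)
print(f"Confidence (%) = {confidence * 100:.2f}, Support = {support}")
```

Note that the ten cases in cluster 16 that do not satisfy A affect neither number: confidence only looks at cases where A holds.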

IF

AAANHANG in (0.0) and ABESAUT in (0.0) and ABRAND in (1.0) and ABROM in (0.0) and ABYSTAND in (0.0) and AFIETS in (0.0) and AGEZONG in (0.0) and AINBOED in (0.0) and ALEVEN in (0.0) and AMOTSCO in (0.0) and APERSAUT in (1.0) and APERSONG in (0.0) and APLEZIER in (0.0) and ATRACTOR in (0.0) and AVRAAUT in (0.0) and AWABEDR in (0.0) and AWALAND in (0.0) and AWAOREG in (0.0) and AWAPART in (1.0) and AWERKT in (0.0) and AZEILPL in (0.0) and CARAVAN in (1.0) and MAANTHUI in (1.0) and MAUT0 <= 3.6 and MAUT0 >= 0.0 and MAUT1 <= 9.0 and MAUT1 >= 4.8 and MAUT2 <= 2.5 and MAUT2 >= 0.0 and MBERARBG <= 4.2 and MBERARBG >= 0.0 and MBERARBO <= 4.2 and MBERARBO >= 0.0 and MBERBOER <= 0.30000000000000004 and MBERBOER >= 0.0 and MBERHOOG <= 6.3 and MBERHOOG >= 0.0 and MBERMIDD <= 7.2 and MBERMIDD >= 0.0 and MBERZELF <= 1.2 and MBERZELF >= 0.0 and MFALLEEN <= 4.2 and MFALLEEN >= 0.0 and MFGEKIND <= 6.4 and MFGEKIND >= 0.0 and MFWEKIND <= 9.0 and MFWEKIND >= 0.9000000000000001 and MGEMLEEF <= 4.2 and MGEMLEEF >= 1.8 and MGEMOMV in (2.0,4.0) and MGODGE <= 5.6000000000000005 and MGODGE >= 0.0 and MGODOV <= 2.4 and MGODOV >= 0.0 and MGODRK <= 2.1 and MGODRK >= 0.0 and MGODRP <= 6.3 and MGODRP >= 1.8 and MHHUUR <= 9.0 and MHHUUR >= 0.0 and MHKOOP <= 9.0 and MHKOOP >= 0.0 and MINK123M <= 0.2 and MINK123M >= 0.0 and MINK3045 <= 5.3999999999999995 and MINK3045 >= 0.0 and MINK4575 <= 5.6 and MINK4575 >= 0.0 and MINK7512 <= 2.4 and MINK7512 >= 0.0 and MINKGEM <= 6.3 and MINKGEM >= 2.8000000000000003 and MINKM30 <= 5.6 and MINKM30 >= 0.0 and MKOOPKLA <= 8.0 and MKOOPKLA >= 1.7000000000000002 and MOPLHOOG <= 4.2 and MOPLHOOG >= 0.0 and MOPLLAAG <= 6.3 and MOPLLAAG >= 0.0 and MOPLMIDD <= 5.6 and MOPLMIDD >= 0.0 and MOSHOOFD <= 9.1 and MOSHOOFD >= 1.0 and MOSTYPE <= 41.0 and MOSTYPE >= 1.0 and MRELGE <= 9.0 and MRELGE >= 5.0 and MRELOV <= 4.2 and MRELOV >= 0.0 and MRELSA <= 2.1 and MRELSA >= 0.0 and MSKA <= 6.3 and MSKA >= 0.0 and MSKB1 <= 4.5 and MSKB1 >= 0.0 and MSKB2 <= 4.2 and MSKB2 >= 0.0 
and MSKC <= 7.2 and MSKC >= 0.0 and MSKD <= 2.4 and MSKD >= 0.0 and MZFONDS <= 9.0 and MZFONDS >= 1.8 and MZPART <= 7.2 and MZPART >= 0.0 and PAANHANG in (0.0) and PBESAUT in (0.0) and PBRAND <= 4.2 and PBRAND >= 2.8000000000000003 and PBROM <= 0.2 and PBROM >= 0.0 and PBYSTAND in (0.0) and PFIETS in (0.0) and PGEZONG in (0.0) and PINBOED in (0.0) and PLEVEN <= 0.2 and PLEVEN >= 0.0 and PMOTSCO <= 0.30000000000000004 and PMOTSCO >= 0.0 and PPERSAUT in (6.0) and PPERSONG in (0.0) and PPLEZIER <= 0.1 and PPLEZIER >= 0.0 and PTRACTOR in (0.0) and PVRAAUT in (0.0) and PWABEDR in (0.0) and PWALAND in (0.0) and PWAOREG in (0.0) and PWAPART in (2.0) and PWERKT in (0.0) and PZEILPL in (0.0)

THEN

Cluster equal 16

Confidence (%)=82.1428571428571

Support =69
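Applying a rule like the one above to another dataset amounts to checking each record against the rule's conditions. The following is a minimal Python sketch under stated assumptions: the tuple encoding of conditions and the `satisfies` helper are hypothetical, the sample records are invented, and only five of the Cluster #16 conditions are included to keep the example short.

```python
# A small subset of the Cluster #16 conditions shown above (the full rule has many more).
# ("in", attribute, allowed_values) or ("range", attribute, low, high), bounds inclusive.
CLUSTER_16_SUBSET = [
    ("in", "ABRAND", {1.0}),
    ("in", "APERSAUT", {1.0}),
    ("in", "MGEMOMV", {2.0, 4.0}),
    ("range", "MAUT1", 4.8, 9.0),
    ("range", "MRELGE", 5.0, 9.0),
]


def satisfies(record, conditions):
    """Return True only if the record meets every condition (the conditions are ANDed)."""
    for condition in conditions:
        kind, attribute = condition[0], condition[1]
        value = record.get(attribute)
        if value is None:
            return False  # a missing attribute cannot satisfy the rule
        if kind == "in":
            if value not in condition[2]:
                return False
        elif kind == "range":
            low, high = condition[2], condition[3]
            if not (low <= value <= high):
                return False
        else:
            raise ValueError(f"unknown condition type: {kind}")
    return True


# Invented records standing in for "another dataset".
new_cases = [
    {"ABRAND": 1.0, "APERSAUT": 1.0, "MGEMOMV": 2.0, "MAUT1": 6.0, "MRELGE": 7.0},
    {"ABRAND": 0.0, "APERSAUT": 1.0, "MGEMOMV": 3.0, "MAUT1": 2.0, "MRELGE": 7.0},
]
matched = [satisfies(case, CLUSTER_16_SUBSET) for case in new_cases]
print(matched)
```

A record that satisfies the full rule would be assigned to Cluster 16 only with the rule's stated confidence, not with certainty.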

When to use K-Means Analysis

K-Means is recommended for datasets with a low number of attributes (fewer than 500).  The number of clusters is specified by the user (the default is 10), and normalization of the dataset is recommended to prepare the data for analysis.  The advantage of using enhanced k-Means clustering (the algorithm included in Oracle Data Miner) is that it provides results superior to those obtained with the traditional k-Means techniques used in other data mining programs.
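To make the role of normalization and the cluster count concrete, here is a minimal Python sketch of the traditional k-Means procedure (Lloyd's algorithm) preceded by min-max normalization. It is not Oracle's enhanced k-Means, whose internals are not described here. All function names and the sample data are hypothetical; the default of `k=10` mirrors the default mentioned above, while the tiny demo uses `k=2`.

```python
import random


def min_max_normalize(rows):
    """Rescale each column to [0, 1] so no attribute dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(rows, centroids):
    """Label each row with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda c: squared_distance(row, centroids[c]))
        for row in rows
    ]


def k_means(rows, k=10, iterations=20, seed=0):
    """Traditional k-Means: alternate assignment and centroid update steps."""
    rng = random.Random(seed)
    centroids = [list(row) for row in rng.sample(rows, k)]  # k distinct starting rows
    for _ in range(iterations):
        labels = assign(rows, centroids)
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(rows, centroids), centroids


# Invented data: (age, income). Income would swamp age without normalization.
raw = [[25, 20000], [27, 21000], [26, 19500], [60, 90000], [62, 88000], [61, 91000]]
normalized = min_max_normalize(raw)
labels, centroids = k_means(normalized, k=2)
print(labels)
```

The first three rows and the last three rows end up in separate clusters; which cluster receives which index depends on the random starting centroids.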

 

For more tips and tricks for Oracle data warehouse analysis, see Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

You can buy it direct from the publisher for 30% off:

http://www.rampant-books.com/book_2006_1_oracle_data_mining.htm


 

 
Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.