Drilling into the SVM Data

Data warehouse tips by Burleson Consulting

This is an excerpt from Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

Taking the high variability of the NOX data into account, we're still not sure whether settling near the Charles River would be associated with higher air pollution or not.  Running the following SQL statements for the towns with the highest and lowest coefficients in our linear SVM model reveals that the towns more likely to be near the river have, on average, lower nitric oxide concentrations than those that are not.

select AVG(NOX) from "BOSTON_PRICE" where TOWN IN ('Somerville', 'Arlington', 'Belmont', 'Brookline');   -- returns 5.822

select AVG(NOX) from "BOSTON_PRICE" where TOWN IN ('Dedham', 'Waltham', 'Dover', 'Watertown');   -- returns 4.905
 

Given these results, we can conclude that lower levels of pollution appear to be associated with properties that border the Charles River, and that higher levels of pollution are highly variable and perhaps not well modeled with the 20 attributes in our case dataset.
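The two averages above can also be obtained in a single statement, together with a rough measure of the spread within each group. The following is only a sketch: the labels group_1 and group_2 and the column aliases are hypothetical names introduced here for illustration, and it assumes the same "BOSTON_PRICE" table with TOWN and NOX columns used in the queries above.

```sql
-- Sketch: average and standard deviation of NOX for the two groups of towns.
-- group_1 / group_2 are hypothetical labels, not names from the model.
SELECT CASE
         WHEN TOWN IN ('Somerville', 'Arlington', 'Belmont', 'Brookline')
           THEN 'group_1'
         ELSE 'group_2'
       END         AS town_group,
       COUNT(*)    AS n_cases,
       AVG(NOX)    AS avg_nox,
       STDDEV(NOX) AS sd_nox
FROM   "BOSTON_PRICE"
WHERE  TOWN IN ('Somerville', 'Arlington', 'Belmont', 'Brookline',
                'Dedham', 'Waltham', 'Dover', 'Watertown')
-- Oracle does not accept the column alias here, so the CASE is repeated.
GROUP  BY CASE
            WHEN TOWN IN ('Somerville', 'Arlington', 'Belmont', 'Brookline')
              THEN 'group_1'
            ELSE 'group_2'
          END;
```

Showing the standard deviation next to each average makes it easier to judge whether the difference between the two groups is large relative to the variability of the NOX values.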

Using Text Data in SVM Predictive Models

We have shown the usefulness of the SVM algorithm for modeling categorical and continuous data.  We'll now examine how to utilize text data in our predictive models.  The dataset is found at http://kdd.ics.uci.edu/summary.task.type.html and is the Syskill and Webert Web Data, which contains the HTML source of web pages along with the ratings of a single user on these web pages.

The web pages are on four separate subjects:  bands (recording artists), goats, sheep, and biomedical.  The user looked at each web page and rated the content on a three-point scale (hot, medium, cold).  However, there were very few ratings for medium.

The web rating data is organized into four folders:  bands, biomedical, goats, and sheep.  Each folder has a number of files containing web page content and a single file named index, which relates viewer ratings to each of the web pages.  We will create a table with the web content stored as CLOB-type data, and then match this with the index file so that we have an ID field, viewer rating, category of web page, and the web page content.

The steps in arriving at this final SVM table are as follows:

1.      Import the index table for each subject using the import wizard in ODMr.

2.      Create a table for importing the CLOB data.

3.      Use sqlldr to import the web content as CLOB fields.

4.      Create views for each category of web page by joining the index and CLOB tables for each subject.

5.      Union all four views together into a final table.

6.      Create a unique identifier for the cases in this table.
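Steps 2 and 4 through 6 above could be sketched in SQL roughly as follows. This is a minimal sketch, not the book's own script: the names web_content, web_goats_v (and its siblings), web_svm_data, and case_id, as well as the column sizes, are assumptions introduced for illustration. It also assumes that the sqlldr load in step 3 stored each page under the same file_name that appears in the index tables, tagged with its category.

```sql
-- Step 2: table that receives the web page content (hypothetical layout).
CREATE TABLE web_content (
  category  VARCHAR2(20),
  file_name VARCHAR2(100),
  content   CLOB
);

-- Step 4: one view per subject, joining the index table to the content.
CREATE OR REPLACE VIEW web_goats_v AS
SELECT r.file_name, r.rating, c.category, c.content
FROM   web_rating_goats r
JOIN   web_content c
  ON   c.file_name = r.file_name AND c.category = 'goats';

CREATE OR REPLACE VIEW web_sheep_v AS
SELECT r.file_name, r.rating, c.category, c.content
FROM   web_rating_sheep r
JOIN   web_content c
  ON   c.file_name = r.file_name AND c.category = 'sheep';

CREATE OR REPLACE VIEW web_bands_v AS
SELECT r.file_name, r.rating, c.category, c.content
FROM   web_rating_bands r
JOIN   web_content c
  ON   c.file_name = r.file_name AND c.category = 'bands';

CREATE OR REPLACE VIEW web_biomed_v AS
SELECT r.file_name, r.rating, c.category, c.content
FROM   web_rating_biomed r
JOIN   web_content c
  ON   c.file_name = r.file_name AND c.category = 'biomedical';

-- Steps 5 and 6: combine the four views and number the cases.
-- UNION ALL is used because a plain UNION would have to compare CLOB values.
CREATE TABLE web_svm_data AS
SELECT ROWNUM AS case_id, u.*
FROM  (SELECT file_name, rating, category, content FROM web_goats_v
       UNION ALL
       SELECT file_name, rating, category, content FROM web_sheep_v
       UNION ALL
       SELECT file_name, rating, category, content FROM web_bands_v
       UNION ALL
       SELECT file_name, rating, category, content FROM web_biomed_v) u;

-- Enforce uniqueness of the generated identifier.
ALTER TABLE web_svm_data
  ADD CONSTRAINT web_svm_data_pk PRIMARY KEY (case_id);
```

Using ROWNUM is just one simple way to produce the identifier in step 6; a sequence would serve equally well if more rows are to be added later.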

Using the Import Wizard in ODMr is straightforward.  Rename the index file with a ".dat" extension before attempting the import, and specify Vertical Bar (|) as the delimiter.  The field names are file_name, rating, url, date_rated, and title.  Import each file into a separate table such as web_rating_goats, web_rating_sheep, web_rating_bands, and web_rating_biomed.
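Once the four index tables are imported, a quick count per rating is one way to confirm that the load worked and to see how few medium ratings there are. The query below is a sketch that assumes the table and field names listed above; the category labels and the n_pages alias are hypothetical.

```sql
-- Sketch: number of rated pages per subject and rating value.
SELECT 'goats' AS category, rating, COUNT(*) AS n_pages
FROM   web_rating_goats
GROUP  BY rating
UNION ALL
SELECT 'sheep', rating, COUNT(*)
FROM   web_rating_sheep
GROUP  BY rating
UNION ALL
SELECT 'bands', rating, COUNT(*)
FROM   web_rating_bands
GROUP  BY rating
UNION ALL
SELECT 'biomedical', rating, COUNT(*)
FROM   web_rating_biomed
GROUP  BY rating
ORDER  BY 1, 2;
```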

 

For more tips and tricks for Oracle data warehouse analysis, see Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse"

You can buy it direct from the publisher for 30%-off:

http://www.rampant-books.com/book_2006_1_oracle_data_mining.htm


 

 
Copyright © 1996 -  2017

All rights reserved by Burleson

Oracle ® is the registered trademark of Oracle Corporation.
