
# Support Vector Machines (SVM)

Data warehouse tips by Burleson Consulting

This is an excerpt from Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse".

Support Vector Machines (SVM) is a suite of algorithms that are used for classification applications. Like the Adaptive Bayes Network and Decision Tree algorithms discussed in Chapter Two, SVM provides rules that are useful in understanding the relationships and patterns in the dataset.

## Inside Support Vector Machines

Another advantage of SVM is that it can predict outcomes based on text data. If you have descriptive data such as clinical notes for hospital patients, customer satisfaction survey results, or other textual information, it can be used as part of the classification model.

In this chapter we will show an example using SVM to predict how researchers rated web pages based on the content of the web pages themselves. We'll also apply SVM to continuous data to illustrate the use of classification models in regression. We will show how to use sqlldr and ODMr to extract and load CLOB data.

Let's first examine the use of SVM in classifying discrete attributes, where the problem is to predict one or more values such as Yes or No or, as in the forest cover problem described in Chapter Two, one of 7 different types of trees depending on altitude and environmental factors. ODMr provides two types of algorithms, or kernels: the Linear and Gaussian kernels.
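As a rough illustration of what the two kernels compute, the sketch below writes them as plain Python functions. This is not ODMr's implementation; the function names and the `sigma` parameter are assumptions chosen for the example, using the usual textbook definitions of a linear (dot product) kernel and a Gaussian kernel.

```python
import math


def linear_kernel(x, z):
    """Linear kernel: the dot product of two attribute vectors."""
    return sum(a * b for a, b in zip(x, z))


def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian kernel: similarity decays with squared distance.

    sigma is a hypothetical width parameter for this sketch; larger
    values make distant points look more similar.
    """
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Two made-up attribute vectors, only to exercise the functions.
point_a = [1.0, 2.0]
point_b = [3.0, 4.0]
linear_value = linear_kernel(point_a, point_b)
gaussian_same = gaussian_kernel(point_a, point_a)
gaussian_diff = gaussian_kernel(point_a, point_b)
```

The linear kernel keeps the model a weighted sum of the original attributes, while the Gaussian kernel compares each case with other cases by distance, which allows non-linear class boundaries.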

These linear and non-linear equations are similar to statistical and machine learning techniques such as linear regression and neural networks, but are much better in terms of prediction accuracy and speed in building the model. Using the Linear kernel will give us a ranked listing of the attributes used to build the model, showing which attributes were most important in predicting the target class. Let's take a closer look.
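To make the idea of a ranked attribute listing concrete, here is a minimal sketch of a linear SVM trained by batch subgradient descent on the regularized hinge loss, with attributes then ranked by the absolute size of their weights. It is an assumption-laden toy, not the algorithm ODMr runs: the training routine, the hyperparameter names, the attribute names, and the four data points are all invented for illustration.

```python
def train_linear_svm(rows, labels, reg=0.01, learning_rate=0.1, epochs=200):
    """Fit weights and bias by batch subgradient descent on the
    L2-regularized hinge loss. Labels must be +1 or -1."""
    n_rows = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [reg * w for w in weights]  # gradient of the L2 penalty
        grad_b = 0.0
        for x, y in zip(rows, labels):
            score = sum(w * xj for w, xj in zip(weights, x)) + bias
            if y * score < 1:  # case is inside the margin or misclassified
                for j, xj in enumerate(x):
                    grad_w[j] -= y * xj / n_rows
                grad_b -= y / n_rows
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


def rank_attributes(names, weights):
    """Order attributes by absolute weight, largest first."""
    return sorted(zip(names, weights), key=lambda pair: abs(pair[1]), reverse=True)


# Toy data: the first attribute separates the classes, the second does not.
rows = [(2.0, 0.3), (1.5, -0.3), (-2.0, 0.3), (-1.5, -0.3)]
labels = [1, 1, -1, -1]
names = ["x_informative", "x_noise"]  # hypothetical attribute names

weights, bias = train_linear_svm(rows, labels)
ranking = rank_attributes(names, weights)
for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

Ranking by weight size is only meaningful when the attributes are on comparable scales, so real data would normally be normalized before a comparison like this.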

## Inside the SVM Analysis

We'll use as our case data the daily average wind speeds for 1961 through 1978 at 12 synoptic meteorological stations in the Republic of Ireland. Each row corresponds to one day of data, with the following attributes: year, month, day, and average wind speed (in knots) at each of the stations in the following order: RPT, VAL, ROS, KIL, SHA, BIR, DUB, CLA, MUL, CLO, BEL, and MAL.
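The row layout described above can be sketched as a small parser. This assumes a comma-delimited line with a four-digit year; the function name and the sample line are hypothetical, and the wind speeds in the sample are placeholder values, not real measurements.

```python
# Station order as listed in the text.
STATIONS = ["RPT", "VAL", "ROS", "KIL", "SHA", "BIR",
            "DUB", "CLA", "MUL", "CLO", "BEL", "MAL"]


def parse_wind_row(line):
    """Turn one comma-delimited line into a dict keyed by attribute name."""
    fields = [field.strip() for field in line.split(",")]
    expected = 3 + len(STATIONS)  # year, month, day, then 12 wind speeds
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    record = {
        "year": int(fields[0]),
        "month": int(fields[1]),
        "day": int(fields[2]),
    }
    for station, value in zip(STATIONS, fields[3:]):
        record[station] = float(value)  # average wind speed in knots
    return record


# Placeholder line: the speeds 1.0 to 12.0 are made up for the example.
sample_line = "1961,1,1,1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0,11.0,12.0"
record = parse_wind_row(sample_line)
```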

We will create a new target class, season, which we will code using months, with months 12, 1, and 2 designated winter, 3 through 5 as spring, 6 through 8 as summer, and 9 through 11 as fall.

## Importing the SVM Model Data

Using the import feature of ODMr, import the CSV (comma-delimited) file, being sure to first create a dat type file by renaming the file with the ".dat" extension. Enter new column names in Step 3 of the Import Wizard, name the new table wind_ireland, and finish the wizard to complete the data import.

To create the four seasons, on the Main Menu choose "Data", "Transform", and pick "Compute Field". Choose a new name for the view such as "wind_ireland_v" and type "season" for the new column name. Enter the following statements in the "Expression" box, and click on "Validate" to ensure that the expression is valid.

    case
      when "wind_ireland"."month" = 12 or
           "wind_ireland"."month" = 1 or
           "wind_ireland"."month" = 2 then 'winter'
      when "wind_ireland"."month" = 3 or
           "wind_ireland"."month" = 4 or
           "wind_ireland"."month" = 5 then 'spring'
      when "wind_ireland"."month" = 6 or
           "wind_ireland"."month" = 7 or
           "wind_ireland"."month" = 8 then 'summer'
      else
           'fall'
    end

You can view the SQL code and save the script to a file when you preview the transformation.
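One way to sanity-check the expression outside ODMr is to run it against a small stand-in table. The sketch below uses Python's built-in `sqlite3` module with an in-memory database, which is an assumption made purely for convenience: SQLite is not Oracle, the table here holds only the date columns, and the `create view` statement is a guess at the general shape of the transformation rather than the script ODMr generates.

```python
import sqlite3

# The expression from the text, unchanged apart from layout.
CASE_EXPRESSION = """
case
  when "wind_ireland"."month" = 12 or
       "wind_ireland"."month" = 1 or
       "wind_ireland"."month" = 2 then 'winter'
  when "wind_ireland"."month" = 3 or
       "wind_ireland"."month" = 4 or
       "wind_ireland"."month" = 5 then 'spring'
  when "wind_ireland"."month" = 6 or
       "wind_ireland"."month" = 7 or
       "wind_ireland"."month" = 8 then 'summer'
  else
       'fall'
end
"""

conn = sqlite3.connect(":memory:")
# Stand-in table: wind speed columns are omitted for brevity.
conn.execute(
    'create table "wind_ireland" ("year" integer, "month" integer, "day" integer)'
)
# One placeholder row per month so every branch of the expression is hit.
conn.executemany(
    'insert into "wind_ireland" values (?, ?, 1)',
    [(1961, month) for month in range(1, 13)],
)
conn.execute(
    'create view "wind_ireland_v" as select "wind_ireland".*, '
    + CASE_EXPRESSION
    + ' as "season" from "wind_ireland"'
)
rows = conn.execute(
    'select "month", "season" from "wind_ireland_v" order by "month"'
).fetchall()
season_by_month = dict(rows)
conn.close()

for month, season in sorted(season_by_month.items()):
    print(month, season)
```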

1. After you complete the compute wizard, right-click the new view wind_ireland_v and choose Show Summary Single Record to view the new case data details.

2. Click on the new attribute "season" and check the histogram showing the relative distribution of values.

You can see that each season comprises about a quarter of the case data, so there is no need to set priors in the Build as we did in Chapter Two.
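The roughly even split follows from the calendar itself. As a quick check, the sketch below counts the days that fall in each season for 1961 through 1978, assuming the table holds exactly one row for every calendar day in that range (the text does not say whether any days are missing). The function and constant names are made up for the example.

```python
import calendar
from collections import Counter

# Month-to-season coding as described in the text.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def season_day_counts(first_year=1961, last_year=1978):
    """Count calendar days per season over an inclusive range of years."""
    counts = Counter()
    for year in range(first_year, last_year + 1):
        for month in range(1, 13):
            days_in_month = calendar.monthrange(year, month)[1]
            counts[SEASON_BY_MONTH[month]] += days_in_month
    return counts


counts = season_day_counts()
total_days = sum(counts.values())
for season in ("winter", "spring", "summer", "fall"):
    share = counts[season] / total_days
    print(f"{season}: {counts[season]} days ({share:.1%})")
```

Under that assumption every season lands within about half a percentage point of 25%, which is consistent with the histogram described above.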

For more tips and tricks for Oracle data warehouse analysis, see Dr. Ham's premier book "Oracle Data Mining: Mining Gold from your Warehouse". You can buy it direct from the publisher for 30% off: http://www.rampant-books.com/book_2006_1_oracle_data_mining.htm

