2
votes

I am new to Matlab. Is there any sample code for classifying data (with 41 features) with an SVM and then visualizing the result? I want to classify a data set that has five classes using the SVM method. I read the article "A Practical Guide to Support Vector Classification" and looked at some examples. My dataset is kdd99. I wrote the following code:

%% Load Data
[data,colNames] = xlsread('TarainingDataset.xls');
groups = ismember(colNames(:,42),'normal.'); 
TrainInputs = data;
TrainTargets = groups;
%% Design SVM
C = 100;
svmstruct = svmtrain(TrainInputs,TrainTargets,...
    'boxconstraint',C,...
    'kernel_function','rbf',...
    'rbf_sigma',0.5,...
    'showplot','false');
%% Test SVM
[dataTest,colNamesTest] = xlsread('TestDataset.xls');
TestInputs = dataTest;
% true labels of the test set, kept separate from the training labels
TestTargets = ismember(colNamesTest(:,42),'normal.');
TestOutputs = svmclassify(svmstruct,TestInputs,'showplot','false');

But I don't know how to get the accuracy or MSE of my classification. I also use showplot in svmclassify, but when it is set to true I get this warning:

The display option can only plot 2D training data

Could anyone please help me?

2
You need to learn more about machine learning in general; it's not a good or easy tool to use blindly. – Raff.Edward
I agree with @Raff.Edward, but what you should be looking at is cross-validation to measure your error / accuracy. – David Maust

2 Answers

2
votes

I recommend using another SVM toolbox, libsvm. The link is: http://www.csie.ntu.edu.tw/~cjlin/libsvm/

After adding it to the Matlab path, you can train and use your model like this:

model = svmtrain(train_label, train_feature, '-c 1 -g 0.07 -h 0');
% the parameters can be modified
% the third output holds decision values (or probability estimates with '-b 1')
[label, accuracy, probability] = svmpredict(test_label, test_feature, model);

train_label must be a vector. If it contains more than two distinct labels (not just 0/1), libsvm trains a multi-class SVM automatically.

train_feature is an n-by-L matrix for n samples with L features each. You should preprocess (for example, scale) the features before using them. The test features must be preprocessed in the same way.
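As an illustration of preprocessing both sets the same way, here is a minimal sketch of min-max scaling in plain Matlab. The toy matrices and the variable names fmin, fmax and range are made up for this example; the point is only that the scaling parameters are learned from the training set and then reused on the test set.

```matlab
% Hypothetical toy data: 4 training samples and 2 test samples, 3 features.
train_feature = [1 10 100; 2 20 200; 3 30 300; 4 40 400];
test_feature  = [2.5 25 250; 5 50 500];

% Learn the scaling parameters from the training set only.
fmin  = min(train_feature, [], 1);
fmax  = max(train_feature, [], 1);
range = fmax - fmin;
range(range == 0) = 1;   % avoid division by zero for constant features

% Apply the same transform to both sets (bsxfun also works on older Matlab).
train_scaled = bsxfun(@rdivide, bsxfun(@minus, train_feature, fmin), range);
test_scaled  = bsxfun(@rdivide, bsxfun(@minus, test_feature,  fmin), range);
```

Test values can fall outside [0, 1] when they lie beyond the training range. That is expected and should not be "fixed" by rescaling the test set on its own.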

The accuracy you want is shown when testing finishes, but it is only the overall accuracy for the whole test set.

If you need the accuracy for positive and negative samples separately, you have to calculate it yourself from the predicted labels.
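One possible way to do that calculation is to build a confusion matrix from the true and predicted labels. This is a sketch in plain Matlab with made-up label vectors; in practice test_label would be your true labels and label the first output of svmpredict. The names classes, confMat, perClassAcc and overallAcc are chosen for this example.

```matlab
% Hypothetical true and predicted labels (column vectors).
test_label = [1; 1; 1; 0; 0; 2; 2; 2; 2; 0];
label      = [1; 1; 0; 0; 0; 2; 2; 1; 2; 0];

% Map every label to an index 1..k.
classes = unique([test_label; label]);
k = numel(classes);
[~, trueIdx] = ismember(test_label, classes);
[~, predIdx] = ismember(label, classes);

% confMat(i,j) counts samples of true class i that were predicted as class j.
confMat = accumarray([trueIdx predIdx], 1, [k k]);

% Fraction of each true class that was predicted correctly
% (NaN for a class that has no true samples in the test set).
perClassAcc = diag(confMat) ./ sum(confMat, 2);
overallAcc  = sum(diag(confMat)) / numel(test_label);
```

With five classes, as in the question, the same code gives a 5-by-5 confusion matrix and one accuracy value per class.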

Hope this will help you!

0
votes

Your feature space has 41 dimensions, and plotting more than 3 dimensions is impossible. To better understand your data and the way an SVM works, begin with a linear SVM. This type of SVM is interpretable, which means that each of your 41 features has a weight (or 'importance') associated with it after training. You can then use plot3() with your data on 3 of the 'best' features from the linear SVM. Note how well your data is separated with those features and choose a kernel (basis function) and other parameters accordingly.
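A rough sketch of that last step, assuming you already have the weight vector of a trained two-class linear SVM. Everything here is synthetic: X, y and w are stand-ins generated for illustration, not output from a real model.

```matlab
% Hypothetical stand-ins: X is n-by-41 data, w is the weight vector of an
% already trained linear SVM (one weight per feature), y holds the classes.
% With a two-class linear libsvm model ('-t 0'), the libsvm FAQ suggests
% w = model.SVs' * model.sv_coef; treat that as something to verify yourself.
rng(0);
X = randn(200, 41);
w = zeros(41, 1);
w([5 17 30]) = [2.0; -1.5; 1.0];   % pretend only three features matter
y = (X * w > 0);                   % logical class labels

% Rank the features by absolute weight and keep the three largest.
[~, order] = sort(abs(w), 'descend');
top3 = order(1:3);

% 3-D scatter plot of the two classes on those three features.
figure; hold on;
plot3(X(y,  top3(1)), X(y,  top3(2)), X(y,  top3(3)), 'b.');
plot3(X(~y, top3(1)), X(~y, top3(2)), X(~y, top3(3)), 'r.');
grid on; view(3);
xlabel(sprintf('feature %d', top3(1)));
ylabel(sprintf('feature %d', top3(2)));
zlabel(sprintf('feature %d', top3(3)));
legend('class 1', 'class 0');
```

Comparing weights by magnitude only makes sense if the features were scaled to comparable ranges before training.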