0
votes

I am doing a project on image processing in Matlab and wish to use LIBSVM for supervised learning.

I am encountering a problem in data preparation. I have the data in CSV format, and when I try to convert it into libsvm format using the instructions in the LIBSVM FAQ:

    matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
    matlab> labels = SPECTF(:, 1); % labels from the 1st column
    matlab> features = SPECTF(:, 2:end); 
    matlab> features_sparse = sparse(features); % features must be in a sparse matrix
    matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);

I get the data in the following form:

    3.0012 1:2.1122 2:0.9088 ......
    [value 1] [index 1]:[value 2] [index 2]:[value 3]

That is, the first value takes no index, and the value following index 1 is value 2.

From what I have read, the data should be in the following format:

[label] [index 1]:[value 1] [index 2]:[value 2]......

[label] [index 1]:[value 1] [index 2]:[value 2]......

I need help to get this right. It would also be really helpful if anyone could give me a clue about how to assign the labels.

Thanking you in advance, Sidra

For libsvm, you don't need to write the data to a file. Pass them directly to svmtrain and svmpredict. – A. Donda
@A. Donda: I am using LIBSVM (a library for support vector machines), csie.ntu.edu.tw/~cjlin/libsvm. For that I need to write data to a file. Please advise. I will try svmtrain and svmpredict too, thanks. – Sid
I know that library, I use it myself. You don't need to write data to a file if you use the Matlab interface: <csie.ntu.edu.tw/~cjlin/libsvm/#matlab>. The mex files for that should be included in the standard download package. – A. Donda
@A.Donda: The function "libsvmwrite" writes data to a file, right? libsvmwrite('filename', label_vector, instance_matrix), where 'filename' is the name of the file to which the data is written, e.g. 'heart.train', which is the example given in the library itself. – Sid
@A.Donda: If not with libsvmwrite, how else do I do the data preparation? Please guide me. – Sid

2 Answers

1
votes

You don't have to write data to a file; you can instead use the Matlab interface to LIBSVM. This interface consists of two functions, svmtrain and svmpredict. Each function prints a help text if called without arguments:

    Usage: model = svmtrain(training_label_vector, training_instance_matrix, 'libsvm_options');
    libsvm_options:
    -s svm_type : set type of SVM (default 0)
            0 -- C-SVC
            1 -- nu-SVC
            2 -- one-class SVM
            3 -- epsilon-SVR
            4 -- nu-SVR
    -t kernel_type : set type of kernel function (default 2)
            0 -- linear: u'*v
            1 -- polynomial: (gamma*u'*v + coef0)^degree
            2 -- radial basis function: exp(-gamma*|u-v|^2)
            3 -- sigmoid: tanh(gamma*u'*v + coef0)
            4 -- precomputed kernel (kernel values in training_instance_matrix)
    -d degree : set degree in kernel function (default 3)
    -g gamma : set gamma in kernel function (default 1/num_features)
    -r coef0 : set coef0 in kernel function (default 0)
    -c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
    -n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
    -p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
    -m cachesize : set cache memory size in MB (default 100)
    -e epsilon : set tolerance of termination criterion (default 0.001)
    -h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
    -b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
    -wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
    -v n : n-fold cross validation mode
    -q : quiet mode (no outputs)

and

    Usage: [predicted_label, accuracy, decision_values/prob_estimates] = svmpredict(testing_label_vector, testing_instance_matrix, model, 'libsvm_options')
    Parameters:
      model: SVM model structure from svmtrain.
      libsvm_options:
        -b probability_estimates: whether to predict probability estimates, 0 or 1 (default 0); one-class SVM not supported yet
    Returns:
      predicted_label: SVM prediction output vector.
      accuracy: a vector with accuracy, mean squared error, squared correlation coefficient.
      prob_estimates: If selected, probability estimate vector.

Example code for training a linear SVM on a data set of four points with three features:

    training_label_vector = [1 ; 1 ; -1 ; -1];
    training_instance_matrix = [1 2 3 ; 3 4 5 ; 5 6 7; 7 8 9];
    model = svmtrain(training_label_vector, training_instance_matrix, '-t 0');

Applying the resulting model to test data

    testing_instance_matrix = [9 5 1; 2 9 5];
    predicted_label = svmpredict(nan(2, 1), testing_instance_matrix, model)

results in

    predicted_label =

        -1
        -1

You can also pass the true testing_label_vector to svmpredict so that it directly computes the accuracy; in this example I passed NaNs in place of the true labels.
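To illustrate passing true labels, here is a minimal sketch that repeats the toy example with test labels that are assumed purely for illustration (they do not come from any real data set). It requests all three outputs, following the usage text printed by svmpredict, and it needs the LIBSVM MEX files on the Matlab path.

```matlab
% Toy training data from the example in this answer.
training_label_vector = [1; 1; -1; -1];
training_instance_matrix = [1 2 3; 3 4 5; 5 6 7; 7 8 9];
model = svmtrain(training_label_vector, training_instance_matrix, '-t 0 -q');

testing_instance_matrix = [9 5 1; 2 9 5];
% ASSUMPTION: made-up "true" labels for the two test points.
testing_label_vector = [-1; -1];

[predicted_label, accuracy, decision_values] = ...
    svmpredict(testing_label_vector, testing_instance_matrix, model);

% accuracy(1) is the classification accuracy in percent; the other two
% entries (mean squared error, squared correlation coefficient) are
% mainly of interest for regression.
fprintf('Accuracy: %g%%\n', accuracy(1));
```

The accuracy figure is only meaningful when the labels passed in are the real ones; with NaNs it carries no information.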


Please note that there is also a function svmtrain in Matlab's Statistics Toolbox which is incompatible with the one from LIBSVM – make sure you call the correct one.
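One way to check which svmtrain Matlab will actually call is sketched below. It relies on the built-in `which` and `mexext` functions and on the fact that the LIBSVM interface is a compiled MEX file; treating "first match is a MEX file" as "this is LIBSVM" is a heuristic of this sketch, not something LIBSVM guarantees.

```matlab
% List every svmtrain visible on the path; the first entry is the one
% Matlab calls.
candidates = which('svmtrain', '-all');

if isempty(candidates)
    disp('No svmtrain found: add the LIBSVM matlab folder to the path.');
else
    [~, ~, ext] = fileparts(candidates{1});
    % Heuristic: the LIBSVM version is a MEX file, so its extension
    % should match mexext (for example mexw64 or mexa64).
    if strcmp(ext, ['.' mexext])
        disp(['A MEX svmtrain is first on the path: ' candidates{1}]);
    else
        disp(['Another svmtrain takes precedence: ' candidates{1}]);
    end
end
```

If the wrong one takes precedence, moving the LIBSVM folder to the top of the path with `addpath` is the usual remedy.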

0
votes

As @A.Donda answered, you don't have to convert the data to 'libsvm' format if you can do the training and prediction in Matlab.
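As a rough sketch of that route, the snippet below reads a CSV file and passes the matrices straight to svmtrain and svmpredict. The CSV contents are made up so that the example is self-contained, and it assumes that the label sits in the first column; the LIBSVM MEX files must be on the path.

```matlab
% Create a small stand-in CSV file: label in column 1, features after it.
% ASSUMPTION: values are invented; use your own file instead.
csv_file = [tempname '.csv'];
csvwrite(csv_file, [1 0.2 0.7; 1 0.1 0.9; -1 0.8 0.3; -1 0.9 0.1]);

data = csvread(csv_file);
labels = data(:, 1);        % first column holds the class labels
features = data(:, 2:end);  % remaining columns hold the features

% Train directly on the in-memory matrices; no libsvm-format file needed.
model = svmtrain(labels, features, '-t 0 -q');

% Predict on the training data just to show the call.
predicted = svmpredict(labels, features, model);

delete(csv_file);  % clean up the temporary file
```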

If you want to do the training and prediction outside Matlab, directly on Windows or Linux, you have to put the data in 'libsvm' format.

Judging from your output, I think your CSV file has no label column, so the first feature value was written as the label. You should put the label in front of the features on every line of the data.

    matlab> SPECTF = csvread('SPECTF.train'); % read a csv file
    matlab> features = SPECTF(:, :); % all columns are features because there are no labels in your csv file
    matlab> labels = [??]; % supply your own labels here, one per row
    matlab> features_sparse = sparse(features); % features must be in a sparse matrix
    matlab> libsvmwrite('SPECTFlibsvm.train', labels, features_sparse);

Please tell us more about your data so that we can help you with the labels. BTW, the labels are usually assigned by the user up front, and you can pick any integer you like to stand for each class of data.
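As a sketch of how the labels could be assigned, the snippet below uses a made-up feature matrix and assumes that the rows are grouped by class (first three rows class 1, last two rows class 2). Both the data and that grouping are hypothetical. It writes the file with libsvmwrite and reads it back with libsvmread to confirm that no value was shifted into the label position.

```matlab
% ASSUMPTION: invented feature matrix, 5 samples x 3 features, no labels.
features = [2.1 0.9 0.0; 1.8 1.1 0.3; 2.0 1.0 0.1; 0.2 3.1 4.0; 0.1 2.9 4.2];

% ASSUMPTION: rows 1-3 belong to class 1, rows 4-5 to class 2.
labels = [ones(3, 1); 2 * ones(2, 1)];

out_file = [tempname '.train'];
libsvmwrite(out_file, labels, sparse(features));

% Each line should read: label index:value index:value ...
% Zero-valued features are simply left out of the line.
disp(fileread(out_file));

% Round trip: the labels read back should match the ones written.
[labels_back, features_back] = libsvmread(out_file);
delete(out_file);  % clean up the temporary file
```

If the labels come from somewhere else, for example a separate file with one label per row, only the line that builds `labels` needs to change.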