I want to train an SVM to classify samples. I have a csv file with 3 columns (feature 1, feature 2, class label) and 20 rows (= number of samples).
Now I quote from the Scikit-Learn documentation: "As other classifiers, SVC, NuSVC and LinearSVC take as input two arrays: an array X of size [n_samples, n_features] holding the training samples, and an array y of class labels (strings or integers), size [n_samples]:"
I understand that I need to obtain two arrays (one 2D and one 1D) in order to feed data into the SVM. However, I am unable to understand how to obtain the required arrays from the csv file. I have tried the following code:
import numpy as np
data = np.loadtxt('test.csv', delimiter=',')
print(data)
However, it raises an error: "ValueError: could not convert string to float: ��ࡱ�"
There are no column headers in the csv. Am I making a mistake in calling np.loadtxt, or should something else be used?
Update: Here's what my .csv file looks like:
12 122 34
12234 54 23
23 34 23
... delimiter param, so: data = np.loadtxt('test.csv') should work. – EdChum
... delimiter='\t'. – Warren Weckesser
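
A minimal sketch of the approach suggested in the comments, assuming the file is whitespace-separated as in the update, and that the first two columns are the features and the third is the class label. The in-memory `sample` is a stand-in for test.csv (built from the three rows shown above) so the snippet runs on its own; with the real file, pass its path to np.loadtxt instead.

```python
import io

import numpy as np

# Stand-in for test.csv: the three sample rows from the update.
sample = io.StringIO("12 122 34\n12234 54 23\n23 34 23\n")

# With no delimiter argument, loadtxt splits on any whitespace.
# For a tab-separated file, delimiter='\t' can be passed explicitly.
data = np.loadtxt(sample)

# Assumed column layout: feature 1, feature 2, class label.
X = data[:, :2]             # 2D array, shape (n_samples, n_features)
y = data[:, 2].astype(int)  # 1D array, shape (n_samples,)

print(X.shape, y.shape)
```

X and y then have the shapes the quoted documentation asks for, [n_samples, n_features] and [n_samples], and can be passed to a classifier's fit(X, y).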