I have instantiated an SVC object using the sklearn library with the following code:
from sklearn import svm

clf = svm.SVC(kernel='linear', C=1, cache_size=1000, max_iter=-1, verbose=True)
I then fit data to it using:
model = clf.fit(X_train, y_train)
where X_train is a (301, 60) ndarray and y_train is a (301,) ndarray (y_train consists of the class labels "1", "2" and "3").
Now, before I stumbled across the .score() method, I was using the following to determine the accuracy of my model on the training set:
import numpy as np

prediction = np.divide((y_train == model.predict(X_train)).sum(), y_train.size, dtype=float)
which gives a result of approximately 62%.
However, when using the model.score(X_train, y_train) method I get a result of approximately 83%.
Could anyone explain why this is the case? As far as I understand, the two should return the same result.
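For reference, here is a minimal, self-contained way to compute both numbers side by side (a sketch; it assumes X_train and y_train are defined as above, and uses sklearn.metrics.accuracy_score, which is the metric that .score() reports for classifiers):

import numpy as np
from sklearn import svm
from sklearn.metrics import accuracy_score

clf = svm.SVC(kernel='linear', C=1, cache_size=1000, max_iter=-1)
model = clf.fit(X_train, y_train)
y_pred = model.predict(X_train)

# Manual accuracy: fraction of training samples predicted correctly.
manual = np.mean(y_train == y_pred)

# Built-in equivalents: both report mean accuracy for classifiers.
print(manual)
print(model.score(X_train, y_train))
print(accuracy_score(y_train, y_pred))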
ADDENDUM:
The first 10 values of y_true are:
- 2, 3, 1, 3, 2, 3, 2, 2, 3, 1, ...
Whereas for y_pred (when using model.predict(X_train)), they are:
- 2, 3, 3, 2, 2, 3, 2, 3, 3, 3, ...
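(A minimal sketch of how such values can be printed, assuming the fitted model from above:)

print(y_train[:10])                  # first ten true labels
print(model.predict(X_train)[:10])   # first ten predicted labels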