I have a test dataset with shape (100, 49280), i.e. 100 samples with 49280 features each.

I am trying to plot the results of an SVM classification, based on the example at http://scikit-learn.org/0.18/auto_examples/classification/plot_classifier_comparison.html, like so:

x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

So I print x_min, x_max, h, y_min, y_max:

x_min = -0.6541590243577957
x_max = 7.5925517082214355
h = 0.02
y_min = -0.9953588843345642
y_max = 4.58396577835083

So far so good. According to what I have found in the docs, I expect to get a mesh grid spanning these values.

Just to check, I print the shapes of xx and yy and get:

(279, 413)
(279, 413)

which seems fishy. And when I get to the line:

Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()])

I get the error: ValueError: X.shape[1] = 2 should be equal to 49280, the number of features at training time
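
For reference, the shapes involved can be reproduced with a minimal sketch that hard-codes the values printed above. The name grid_points is introduced here purely for illustration. It suggests that (279, 413) is simply what np.arange with step h produces for these ranges, and that the array passed to decision_function has two columns, whereas the error message says the classifier was trained on 49280 features.

```python
import numpy as np

# Values copied from the printed output in the question
x_min, x_max = -0.6541590243577957, 7.5925517082214355
y_min, y_max = -0.9953588843345642, 4.58396577835083
h = 0.02

# One axis per plotted dimension; arange yields ceil((stop - start) / h) points
xs = np.arange(x_min, x_max, h)
ys = np.arange(y_min, y_max, h)

# meshgrid returns arrays of shape (len(ys), len(xs))
xx, yy = np.meshgrid(xs, ys)
print(xx.shape, yy.shape)

# Each row of grid_points (illustrative name) is one (x, y) point of the grid,
# so this array always has exactly 2 columns
grid_points = np.c_[xx.ravel(), yy.ravel()]
print(grid_points.shape)
```

The grid itself is two-dimensional by construction, so a classifier will only accept it if it was fitted on two features.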

I have never used numpy's meshgrid function, but everything I have read in the docs seems normal and well documented, as most of numpy is. I am sure I am missing something silly, but for the life of me I can't find it. Does anyone have an idea of where I went wrong? I will also try to give the question a more informative title once I figure out what the problem is.

1 Answer

The np.arange function is not a good fit for this scenario. You probably want to use np.linspace, which by default returns 50 evenly spaced points including both endpoints, as follows:

xx, yy = np.meshgrid(np.linspace(x_min, x_max), np.linspace(y_min, y_max))

Documentation for np.linspace: https://docs.scipy.org/doc/numpy/reference/generated/numpy.linspace.html
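
As a quick check of what the np.linspace version produces, here is a small sketch using the ranges printed in the question (grid_points is again an illustrative name, not something from the original code). With the default num=50, each axis gets 50 points and both endpoints are included.

```python
import numpy as np

# Ranges copied from the printed output in the question
x_min, x_max = -0.6541590243577957, 7.5925517082214355
y_min, y_max = -0.9953588843345642, 4.58396577835083

# linspace defaults to num=50 and endpoint=True
xx, yy = np.meshgrid(np.linspace(x_min, x_max), np.linspace(y_min, y_max))
print(xx.shape, yy.shape)

# Flattened grid: one (x, y) pair per row (illustrative name)
grid_points = np.c_[xx.ravel(), yy.ravel()]
print(grid_points.shape)
```

Note that this changes only the grid resolution. Each grid point still has two coordinates, so the array passed to decision_function still has two columns.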