Standardized data of SVM - Scikit-learn/ Python

Question

This is for an assignment in which SVM methods have to be used and the model accuracy evaluated.

There were 3 parts; I wrote the code below.

import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

digits = datasets.load_digits()
X = digits.data
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=30, stratify=y)

print(X_train.shape)
print(X_test.shape)

svm_clf = SVC().fit(X_train, y_train)
print(svm_clf.score(X_test, y_test))

After this, the assignment continues as follows:

Perform Standardization of digits.data and store the transformed data in variable digits_standardized.

Hint: Use the required utility from sklearn.preprocessing. Once again, split digits_standardized into two sets named X_train and X_test. Also, split digits.target into two sets Y_train and Y_test.

Hint: Use train_test_split method from sklearn.model_selection; set random_state to 30; and perform stratified sampling. Build another SVM classifier from X_train set and Y_train labels, with default parameters. Name the model as svm_clf2.

Evaluate the model accuracy on the testing data set and print its score.

On top of the above code, I tried writing this, but it seems to be failing. Can anyone help with how the data can be standardized?

std_scale = preprocessing.StandardScaler().fit(X_train)
X_train_std = std_scale.transform(X_train)
X_test_std  = std_scale.transform(X_test)

svm_clf2 = SVC().fit(X_train, y_train)
print(svm_clf.score(X_test,y_test))

Noob coder · Accepted Answer · 2020-09-30T11:14:23

I tried the code below, and it seems to be working.

import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

digits = datasets.load_digits()

# Standardize the full feature matrix before splitting, as the assignment asks
X = digits.data
scaler = StandardScaler()
scaler.fit(X)
digits_standardized = scaler.transform(X)

y = digits.target

X_train, X_test, y_train, y_test = train_test_split(digits_standardized, y, random_state=30, stratify=y)

#print(X_train.shape)
#print(X_test.shape)

# Train a second SVM with default parameters on the standardized data
svm_clf2 = SVC().fit(X_train, y_train)
print("Accuracy ", svm_clf2.score(X_test, y_test))
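
For reference, the snippet in the question appears to fail for three reasons: preprocessing is never imported, svm_clf2 is fitted on the unscaled X_train instead of X_train_std, and the printed score comes from the first model, svm_clf. Below is a minimal sketch of that same approach with those points addressed. Note that it fits the scaler on the training split only, which differs from the assignment's wording (standardize all of digits.data, then split), so treat it as an alternative rather than the expected submission. The make_pipeline variant at the end is my own addition, not something the assignment asks for.

```python
from sklearn import preprocessing
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, random_state=30, stratify=digits.target
)

# Fit the scaler on the training split only, then apply it to both splits.
std_scale = preprocessing.StandardScaler().fit(X_train)
X_train_std = std_scale.transform(X_train)
X_test_std = std_scale.transform(X_test)

# Train on the standardized features and score the new model (svm_clf2),
# not the earlier svm_clf.
svm_clf2 = SVC().fit(X_train_std, y_train)
accuracy = svm_clf2.score(X_test_std, y_test)
print("Accuracy", accuracy)

# Optional variant (an assumption, not part of the assignment):
# a pipeline bundles scaling and the classifier into one estimator.
pipe = make_pipeline(preprocessing.StandardScaler(), SVC()).fit(X_train, y_train)
pipe_accuracy = pipe.score(X_test, y_test)
print("Pipeline accuracy", pipe_accuracy)
```

Fitting the scaler only on the training rows keeps test-set statistics out of the preprocessing step; with the pipeline, the same scaling is applied automatically whenever score or predict is called.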
