I'm training a linear model on the MNIST dataset, but I want to train it to recognize only one digit, 4. How do I choose my X_train, X_test, y_train, and y_test?
2 Answers
Your classifier needs to learn to discriminate between classes, so it has to see examples of each of them. If you only care about the digit 4, relabel both your training and your test set into two classes:
- Class 4 instances
- Not class 4 instances: the union of all other digits
Apart from that, the train/test split is the usual one: the two sets must not overlap.
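As a rough sketch of that split with NumPy: the helper name `binary_split`, the 80/20 ratio, and the random stand-in data are assumptions made for illustration, not something from the question. If you use MNIST's predefined train/test split, only the relabeling line is needed.

```python
import numpy as np

def binary_split(X, y, digit=4, test_fraction=0.2, seed=0):
    """Relabel y as 1 for `digit` and 0 otherwise, then split without overlap."""
    y_bin = (y == digit).astype(int)
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))        # shuffle the sample indices once
    n_test = int(len(y) * test_fraction)
    # Each index lands in exactly one of the two sets, so they cannot overlap.
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], X[test_idx], y_bin[train_idx], y_bin[test_idx]

# Stand-in data with MNIST's layout (784 features, labels 0-9), not real images.
rng = np.random.default_rng(0)
X = rng.random((1000, 784))
y = np.arange(1000) % 10

X_train, X_test, y_train, y_test = binary_split(X, y)
print(X_train.shape, X_test.shape)
```

Both y_train and y_test contain only 0s and 1s, so any binary linear classifier can be fitted on them directly.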
If you only need to recognize 4s, it's a binary classification problem, so you just need to create a new target variable: Y = 1 if the class is 4, Y = 0 otherwise.
The data will be somewhat imbalanced (roughly one sample in ten is a 4), but that should not be an issue!
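A minimal sketch of building that target and checking how imbalanced it is. The labels here are a made-up stand-in (each digit appearing equally often), not the real MNIST labels, and the variable names are only illustrative.

```python
import numpy as np

# Stand-in labels: digits 0-9 in equal proportion (an assumption for illustration).
y = np.arange(1000) % 10

# New binary target: 1 where the digit is 4, 0 everywhere else.
y_bin = (y == 4).astype(int)

# Share of positive samples, and the accuracy of always answering "not 4".
positive_fraction = y_bin.mean()
baseline_accuracy = 1.0 - positive_fraction

print(f"fraction of 4s: {positive_fraction:.2f}")
print(f"accuracy of always predicting 0: {baseline_accuracy:.2f}")
```

It can be worth comparing your model's accuracy against that baseline, since a classifier that never predicts a 4 would already be right on every non-4 sample.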