I want to get an estimate on how well the classifiers would work on an imbalance dataset of mine. When I try to fit KNN classifier from sklearn it learns nothing for the minority class. So what I did was I fit the classifier with k = R (where r is the imbalance ratio 1: R) and I predict probabilities for each test point and assign a point to minority class if the probability output of the classifier for the minority class is great than R (where r is the imbalance ratio 1: R). I do this to get an estimate of how the classifier performs(F1-score). I don't need the classifier in production. Is what I'm doing right?
2
votes
1 Answers
0
votes
Since you have mentioned in the comments that you dont want to use resampling, the one way out is batching. Create multiple dataset from your majority class so that they will be 1:1 ratio with minority class. Train multiple models with each model getting one part of the majority set and all of the minority. Make a prediction with all the models and take a vote from them and decide your final outcome.
But I would suggest using SMOTE over this method.
SMOTE
and generated minority class examples synthetically so that the ratio of majority and minority class data becomes1:1
. you can check SMOTE here imbalanced-learn.readthedocs.io/en/stable/generated/… – Vikas Gautam1:1
with minority class. – Vikas Gautam