2
votes

I want to estimate how well classifiers would work on an imbalanced dataset of mine. When I fit the KNN classifier from sklearn, it learns nothing for the minority class. So I fit the classifier with k = R (where the imbalance ratio is 1:R), predict probabilities for each test point, and assign a point to the minority class if the classifier's predicted probability for the minority class is greater than R. I do this to get an estimate of how the classifier performs (F1-score). I don't need the classifier in production. Is what I'm doing right?
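For reference, the procedure described above could be sketched roughly as follows. This is only an illustration on synthetic data: the value `R = 9`, the `make_classification` settings, and the train/test split are all assumptions, and the cutoff is read here as `1 / R`, since a probability cannot exceed 1.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the real data: roughly 1:R imbalance, R = 9 (assumed).
R = 9
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[R / (R + 1), 1 / (R + 1)], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# k = R neighbours, as in the question.
knn = KNeighborsClassifier(n_neighbors=R).fit(X_train, y_train)

# Column 1 of predict_proba corresponds to label 1, the minority class here.
minority_proba = knn.predict_proba(X_test)[:, 1]

# Assumed cutoff: 1 / R instead of the default 0.5 used by predict().
threshold = 1.0 / R
y_pred_thresholded = (minority_proba > threshold).astype(int)
y_pred_default = knn.predict(X_test)

f1_thresholded = f1_score(y_test, y_pred_thresholded)
f1_default = f1_score(y_test, y_pred_default)
print(f"F1 at default 0.5 cutoff: {f1_default:.3f}")
print(f"F1 at 1/R cutoff:         {f1_thresholded:.3f}")
```

Lowering the cutoff can only add minority predictions, so recall goes up and precision usually goes down. If the cutoff is tuned at all, it should be tuned on a validation split rather than on the test set the F1-score is reported on.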

1
Welcome to SO. The way around this is frequency-based resampling. Possible duplicate of this question. - Sıddık Açıl
I also worked with imbalanced data once. That time I used SMOTE to generate minority class examples synthetically so that the ratio of majority to minority class data became 1:1. You can check SMOTE here: imbalanced-learn.readthedocs.io/en/stable/generated/… - Vikas Gautam
Is there any way without resampling? - Nitin Shravan
Honestly, I don't know, but as an alternative you can randomly sample majority class data from the dataset so that its ratio with the minority class is always 1:1. - Vikas Gautam
If you are not constrained in which classifier to use, you could try one such as a decision tree or random forest, where you can specify the class weights yourself. That way your model will start picking up the minority class as well. Please refer to stackoverflow.com/questions/37522191/… for the implementation details. - Parthasarathy Subburaj
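As a rough illustration of the class-weight suggestion in the comments, here is a minimal sketch with scikit-learn's `RandomForestClassifier`. The synthetic 9:1 data and all parameter values are assumptions made for the example. Note that `KNeighborsClassifier` has no `class_weight` parameter (its `weights` option controls distance weighting of neighbours), which is why a different classifier is used here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic 9:1 imbalanced data (assumed; stands in for the real dataset).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# "balanced" reweights each class inversely to its frequency in y_train.
# An explicit dict such as {0: 1, 1: 9} is also accepted.
clf = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)
clf.fit(X_train, y_train)

y_pred = clf.predict(X_test)
minority_f1 = f1_score(y_test, y_pred)  # F1 for label 1, the minority class
print(f"Minority-class F1 with class weights: {minority_f1:.3f}")
```

No rows are added or removed with this approach; only the contribution of each class to the training criterion changes.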

1 Answer

0
votes

Since you mentioned in the comments that you don't want to use resampling, one way out is batching. Split your majority class into multiple subsets so that each is in a 1:1 ratio with the minority class. Train multiple models, each one getting one part of the majority set and all of the minority set. Then make a prediction with all the models and take a majority vote to decide the final outcome.
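The batching idea above could look something like the following sketch. The synthetic data, the choice of `KNeighborsClassifier` with `n_neighbors=5`, and the tie-breaking rule are all assumptions for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic 9:1 imbalanced data (assumed; stands in for the real dataset).
X, y = make_classification(
    n_samples=2000, n_features=10, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

rng = np.random.default_rng(0)
minority_idx = np.flatnonzero(y_train == 1)
majority_idx = rng.permutation(np.flatnonzero(y_train == 0))

# Split the shuffled majority class into chunks of roughly minority size.
n_batches = max(1, len(majority_idx) // len(minority_idx))
majority_batches = np.array_split(majority_idx, n_batches)

# One model per chunk: that chunk of the majority plus all of the minority.
models = []
for batch in majority_batches:
    idx = np.concatenate([batch, minority_idx])
    model = KNeighborsClassifier(n_neighbors=5).fit(X_train[idx], y_train[idx])
    models.append(model)

# Majority vote across models; an exact tie falls back to the majority class.
votes = np.stack([m.predict(X_test) for m in models])  # (n_models, n_test)
y_pred = (votes.mean(axis=0) > 0.5).astype(int)

ensemble_f1 = f1_score(y_test, y_pred)
print(f"{len(models)} models, ensemble minority-class F1: {ensemble_f1:.3f}")
```

Only the training data is split into balanced batches; the test set keeps its original class ratio so the F1-score reflects the real imbalance.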

But I would suggest using SMOTE over this method.
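To give an idea of what SMOTE does, here is a simplified, hand-rolled sketch of its interpolation step using only NumPy. It is not the imbalanced-learn implementation mentioned in the comments; the function name `smote_like_oversample`, its parameters, and the toy data are hypothetical and exist only for this example.

```python
import numpy as np


def smote_like_oversample(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    randomly chosen minority point and one of its k nearest minority
    neighbours. Needs at least two minority points."""
    rng = np.random.default_rng(seed)
    n = len(X_min)
    k = min(k, n - 1)

    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]

    base = rng.integers(0, n, size=n_new)                      # starting points
    chosen = neighbours[base, rng.integers(0, k, size=n_new)]  # their neighbours
    gap = rng.random((n_new, 1))                               # position on the segment
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Toy data (assumed): 90 majority points and 10 minority points in 2-D.
rng = np.random.default_rng(1)
X_majority = rng.normal(0.0, 1.0, size=(90, 2))
X_minority = rng.normal(2.0, 1.0, size=(10, 2))

# Generate enough synthetic points to reach a 1:1 ratio.
X_synth = smote_like_oversample(
    X_minority, n_new=len(X_majority) - len(X_minority)
)
X_minority_balanced = np.vstack([X_minority, X_synth])
print(X_majority.shape, X_minority_balanced.shape)
```

As with the other approaches, oversampling should be applied to the training split only, so that the test set used for the F1-score contains no synthetic points.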