I have a heavily imbalanced binary classification dataset (5,000,000 positive examples and only 8,000 negative), so I know accuracy is not a useful model evaluation metric. I know how to calculate precision and recall mathematically, but I am unsure how to implement them in Python code.
When I train the model on all the data, I get 99% accuracy overall but 0% accuracy on the negative examples (i.e. it classifies everything as positive).
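To make the problem concrete, here is a minimal plain-Python sketch (not my actual training code) of how I understand precision and recall for the negative class. It assumes labels are encoded as 1 = positive and 0 = negative; that encoding, the helper name precision_recall, and the choice to report 0.0 when a denominator is zero are all just my own illustration. The data is scaled down to 5,000 positives and 8 negatives, which keeps the same 625:1 ratio as my real dataset.

```python
def precision_recall(y_true, y_pred, target=0):
    """Precision and recall for `target`, treating it as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == target and p == target)  # correctly flagged
    fp = sum(1 for t, p in pairs if t != target and p == target)  # wrongly flagged
    fn = sum(1 for t, p in pairs if t == target and p != target)  # missed
    # Assumed convention: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Scaled-down stand-in for my data: 5,000 positives (1) and 8 negatives (0).
y_true = [1] * 5000 + [0] * 8
# A model that predicts "positive" for every example.
y_pred = [1] * len(y_true)

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
neg_precision, neg_recall = precision_recall(y_true, y_pred, target=0)

print(f"accuracy={accuracy:.4f}")
print(f"negative-class precision={neg_precision}, recall={neg_recall}")
```

With these inputs the accuracy comes out just above 99.8% while both negative-class metrics are zero, which matches what I am seeing: the overall number looks excellent even though no negative example is ever identified.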
I have built my current model in PyTorch with criterion = nn.CrossEntropyLoss() and optimiser = optim.Adam().
So my question is: how do I incorporate precision and recall into my training to produce the best model possible?
Thanks in advance.