
When using sklearn and getting probabilities with the predict_proba(X) function for a binary classification [1, 0], the function returns the probability that the sample falls into each class, e.g. [.8, .34].

Is there a community-adopted standard way to reduce this to a single classification confidence that takes all factors into consideration?

Option 1) Just take the probability of the predicted class (.8 in this example)

Option 2) Some mathematical formula or function call that takes all of the different probabilities into consideration and returns a single number. Such a confidence measure could account for how close the probabilities of the different classes are, returning a lower confidence when there is little separation between them (see the sketch below).
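For illustration, here is a minimal sketch of both options; the margin and entropy formulas under option 2 are just two common heuristics, not an sklearn standard:

```python
import numpy as np

# probabilities as returned by predict_proba -- one row per sample
proba = np.array([[0.80, 0.20],
                  [0.55, 0.45]])

# Option 1: confidence = probability of the predicted class
conf_max = proba.max(axis=1)                      # [0.80, 0.55]

# Option 2a: margin between the two most probable classes
sorted_p = np.sort(proba, axis=1)
conf_margin = sorted_p[:, -1] - sorted_p[:, -2]   # [0.60, 0.10]

# Option 2b: 1 - normalized entropy (1 = certain, 0 = uniform)
entropy = -(proba * np.log(proba)).sum(axis=1)
conf_entropy = 1 - entropy / np.log(proba.shape[1])
```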

In your example - shouldn't the class probabilities sum to 1? - Marcin Możejko

1 Answer


There's no standard way of doing it. But what you can do is vary the threshold. What I mean is: if you use predict instead, it outputs a binary label for your dataset, and what it is doing is taking 0.5 as the threshold for predicting. If the probability of class 1 is > 0.5 it classifies the sample as 1, and as 0 if <= 0.5. But this can lead to a bad f1-score in some cases.
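As a sketch of that equivalence, with a hypothetical toy model (any fitted binary classifier works the same way):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# toy data and model, just for illustration
X, y = make_classification(n_samples=200, random_state=0)
clf = LogisticRegression().fit(X, y)

proba = clf.predict_proba(X)[:, 1]        # probability of class 1
pred_default = (proba > 0.5).astype(int)  # what clf.predict(X) effectively does

# the same idea with a custom threshold
pred_custom = (proba > 0.3).astype(int)
```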

So the approach should be to vary the threshold and choose the one that yields the maximum f1-score, or any other metric you want to use as a score function. ROC (receiver operating characteristic) curves are meant for exactly this purpose. And in fact, the motive behind sklearn giving out the class probabilities is precisely to let you choose the best threshold.
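A minimal sketch of that threshold sweep, reusing y and proba from the snippet above:

```python
import numpy as np
from sklearn.metrics import f1_score

# sweep candidate thresholds and keep the one with the best f1-score
thresholds = np.linspace(0.05, 0.95, 19)
scores = [f1_score(y, (proba > t).astype(int)) for t in thresholds]
best = thresholds[int(np.argmax(scores))]
print(f"best threshold: {best:.2f}, f1-score: {max(scores):.3f}")
```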

A very nice example is predicting whether a patient has cancer or not. You have to choose your threshold wisely: if you choose it high you might get a lot of false negatives, and if you choose it low you might get a lot of false positives. So you choose the threshold according to your needs (here it is better to get more false positives than to miss a cancer).
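Continuing the snippet above, a quick way to see that trade-off:

```python
from sklearn.metrics import confusion_matrix

# a low threshold trades false negatives for false positives, and vice versa
for t in (0.2, 0.8):
    tn, fp, fn, tp = confusion_matrix(y, (proba > t).astype(int)).ravel()
    print(f"threshold={t}: false positives={fp}, false negatives={fn}")
```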

Hope it helps!