1
votes

In Logistic Regression for binary classification, how does predict() decide which class (1 or 0) to assign?

Is it based on a probability threshold (1 if the probability is > 0.5, else 0)? If so, can this threshold be changed manually?

I know we get probabilities from predict_proba(), but I was curious about the predict() function!

1
It picks the class with the highest probability. - pault

1 Answer

1
votes

Logistic Regression, like many other classification models, can return a probability for each class. In the binary case, there are only two classes.
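As a rough illustration (a sketch on made-up toy data, assuming scikit-learn and NumPy are installed), predict_proba() gives one column per class, in the order listed in classes_, and each row sums to 1:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy, made-up data: one feature, two classes (0 and 1).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

proba = clf.predict_proba(X)   # shape (n_samples, 2): one column per class
print(clf.classes_)            # column order of proba
print(proba.sum(axis=1))       # each row sums to 1 (up to rounding)
```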

From the source code, predict() returns the class with the highest class probability.

def predict(self, X):
    """Predict class labels for samples in X.
    Parameters
    ----------
    X : {array-like, sparse matrix}, shape = [n_samples, n_features]
        Samples.
    Returns
    -------
    C : array, shape = [n_samples]
        Predicted class label per sample.
    """
    scores = self.decision_function(X)
    if len(scores.shape) == 1:
        indices = (scores > 0).astype(int)  # np.int was removed in NumPy 1.24; use int
    else:
        indices = scores.argmax(axis=1)
    return self.classes_[indices]

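So yes, in the binary case predict() returns class 1 when the decision score is positive, which is the same as a predicted probability greater than 50% for that class (the sigmoid of 0 is 0.5, and the two class probabilities sum to 1).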
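As for changing the threshold: predict() itself takes no threshold argument, so the usual approach is to threshold the positive-class probability yourself. The sketch below is plain Python, not scikit-learn code; `sigmoid` and `predict_with_threshold` are hypothetical helper names and the scores are made up. With a fitted scikit-learn model, the probabilities would typically come from `clf.predict_proba(X)[:, 1]`.

```python
import math

def sigmoid(score):
    """Map a decision-function score to a positive-class probability."""
    return 1.0 / (1.0 + math.exp(-score))

def predict_with_threshold(pos_probs, classes=(0, 1), threshold=0.5):
    """Return classes[1] where the positive-class probability exceeds
    the threshold, and classes[0] otherwise."""
    return [classes[1] if p > threshold else classes[0] for p in pos_probs]

# Made-up decision scores for four samples.
scores = [-2.0, -0.1, 0.3, 1.5]
probs = [sigmoid(s) for s in scores]

# score > 0 is the same test as probability > 0.5, since sigmoid(0) == 0.5.
default_labels = predict_with_threshold(probs)
sign_labels = [1 if s > 0 else 0 for s in scores]

# A stricter threshold moves borderline samples to class 0.
strict_labels = predict_with_threshold(probs, threshold=0.7)
print(default_labels, sign_labels, strict_labels)
```

Raising the threshold trades recall on the positive class for precision; lowering it does the opposite.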