
When we run a logistic regression in scikit-learn, we don't see any p-values (even though there are some ways of getting them). What I want to know is how p-values work in this regression when using this library. Are all the variables kept in the model even if their p-value is above some threshold? If not, what is the threshold?

For instance, suppose we have two variables, x1 and x2. We run the following logistic regression:

from sklearn.linear_model import LogisticRegression
clf = LogisticRegression().fit(df[['x1','x2']], df['y'])

After running this regression, we get the coefficients:

clf.coef_

If the p-value of x1 is 0.8, will the coefficient of x1 still appear in the output? If not, what threshold does the library use: 0.01, 0.05 or 0.1?

1 Answer


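scikit-learn's LogisticRegression does not compute or report p-values at all; the functionality is simply not implemented. As a result, no variable is dropped on the basis of a p-value and there is no threshold: clf.coef_ always contains one coefficient per input column, so the coefficient of x1 appears regardless of what its p-value would be. p-values are generally not used in machine learning; they belong more to the (frequentist) statistics view.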

There are other questions on this site that answer how to compute p-values, for example this one, which adds to the evidence that scikit-learn does not do this in current versions.
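If you do want p-values next to the scikit-learn coefficients, here is a minimal sketch of one common approach (Wald tests based on the Fisher information), not something the library provides itself. The synthetic data, the helper name wald_p_values and the choice C=1e9 are all my own assumptions for illustration: the data are built so that y depends on x1 only, and the very large C is there to make the default L2 penalty negligible, since the Wald formulas assume an unpenalized fit.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

# Synthetic data (an assumption for illustration): y depends on x1 only,
# while x2 is pure noise.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))  # columns play the role of x1 and x2
prob = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * X[:, 0])))
y = (rng.random(n) < prob).astype(int)

# LogisticRegression applies an L2 penalty by default (C=1.0).
# A very large C makes that penalty negligible.
clf = LogisticRegression(C=1e9, max_iter=1000).fit(X, y)


def wald_p_values(model, X):
    """Hypothetical helper: two-sided Wald p-values for the intercept
    and each coefficient of a fitted binary logistic regression."""
    design = np.column_stack([np.ones(len(X)), X])   # add intercept column
    p = model.predict_proba(X)[:, 1]                 # fitted P(y = 1)
    weights = p * (1.0 - p)
    fisher = design.T @ (design * weights[:, None])  # X^T W X
    std_err = np.sqrt(np.diag(np.linalg.inv(fisher)))
    beta = np.concatenate([model.intercept_, model.coef_.ravel()])
    z = beta / std_err
    return 2.0 * norm.sf(np.abs(z))


p_values = wald_p_values(clf, X)

# Both coefficients are always present in coef_, whatever their p-values.
print("coefficients (x1, x2):", clf.coef_.ravel())
print("p-values (intercept, x1, x2):", p_values)
```

Whatever p-value x2 ends up with, its coefficient is still present in clf.coef_; scikit-learn never removes it. If you mainly need inference rather than prediction, a statistics-oriented package that reports p-values directly may be a better fit than computing them by hand.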