How to get p-values and confidence intervals for LogisticRegression in sklearn?

Question

I am building a multinomial logistic regression with sklearn (LogisticRegression). After fitting, how can I get p-values and confidence intervals for my model? It appears that sklearn only provides the coefficients and the intercept.

Thanks a lot.

score 15 · Answer 1 · edited May 23 '17 at 12:38

The short answer is that sklearn's LogisticRegression has no built-in method for calculating p-values. However, these posts discuss workarounds:

https://stackoverflow.com/questions/27928275/find-p-value-significance-in-scikit-learn-linearregression

https://stackoverflow.com/questions/22306341/python-sklearn-how-to-calculate-p-values

answered Nov 28 '16 at 17:23

Hobbes

score 14 · Answer 2 · answered Nov 28 '16 at 19:00

One way to get confidence intervals is to bootstrap your data: draw $B$ resampled datasets $B_i$ (sampling rows with replacement) and fit a logistic regression model $m_i$ to each $B_i$ for $i = 1, 2, \dots, B$. This gives you an empirical distribution for each parameter you are estimating, from which you can read off confidence intervals, for example as percentiles.
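A minimal sketch of that bootstrap, under some assumptions: the data is synthetic (from `make_classification`), the problem is binary rather than multinomial, `B = 200` is an arbitrary choice, and a very large `C` is used so that sklearn's default L2 penalty is negligible. The interval shown is a simple percentile interval, which is only one of several bootstrap interval types.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data stands in for the real dataset (an assumption).
X, y = make_classification(n_samples=300, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

rng = np.random.default_rng(0)
B = 200                      # number of bootstrap resamples (arbitrary)
n = len(y)
boot_coefs = np.empty((B, X.shape[1]))

for i in range(B):
    idx = rng.integers(0, n, size=n)   # sample row indices with replacement
    # Very large C makes the default L2 penalty negligible.
    m_i = LogisticRegression(C=1e10, max_iter=1000).fit(X[idx], y[idx])
    boot_coefs[i] = m_i.coef_[0]

# 95% percentile interval for each coefficient.
ci_lower, ci_upper = np.percentile(boot_coefs, [2.5, 97.5], axis=0)

for j, (lo, hi) in enumerate(zip(ci_lower, ci_upper)):
    print(f"coef[{j}]: 95% CI = [{lo:.3f}, {hi:.3f}]")
```

For a multinomial model, `coef_` has one row per class, so `boot_coefs` would need an extra dimension.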

Lucas Morin · Answer 3 · 2020-03-07T21:24:15.147

This is still not implemented and not planned, as it seems to be out of scope for sklearn, as per GitHub discussions #6773 and #13048.

However, the documentation on linear models now mentions (P-value estimation note):

It is theoretically possible to get p-values and confidence intervals for coefficients in cases of regression without penalization.
The statsmodels package natively supports this.
Within sklearn, one could use bootstrapping.

It also appears possible to modify the LinearRegression class to calculate p-values from linear algebra, as per this GitHub code.
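The same linear-algebra idea can be applied to a fitted binary LogisticRegression. The sketch below is not the linked GitHub code; it is a standard Wald approximation, under the assumptions that the fit is effectively unpenalized (here approximated with a very large `C`) and that the data, which is synthetic here, is not perfectly separable. The standard errors come from the inverse of the Fisher information $X^\top W X$, with $W = \mathrm{diag}(\hat p_i (1 - \hat p_i))$.

```python
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

# Synthetic data with known coefficients (an assumption for illustration).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
true_logit = 0.5 + 1.0 * X[:, 0] - 2.0 * X[:, 1]
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

# Very large C makes the default L2 penalty negligible.
clf = LogisticRegression(C=1e10, max_iter=1000).fit(X, y)

# Design matrix and parameter vector, intercept first.
X_design = np.column_stack([np.ones(len(X)), X])
params = np.concatenate([clf.intercept_, clf.coef_[0]])

# Fisher information X^T W X with W = diag(p * (1 - p)).
p_hat = clf.predict_proba(X)[:, 1]
weights = p_hat * (1.0 - p_hat)
fisher_info = X_design.T @ (X_design * weights[:, None])
std_err = np.sqrt(np.diag(np.linalg.inv(fisher_info)))

# Two-sided Wald p-values and 95% confidence intervals.
z = params / std_err
p_values = 2.0 * norm.sf(np.abs(z))
z_crit = norm.ppf(0.975)
ci_lower = params - z_crit * std_err
ci_upper = params + z_crit * std_err
```

With a non-negligible penalty, these standard errors no longer have their usual interpretation, which matches the documentation's caveat about regression without penalization.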
