1
votes

I need to get a p-value from a chi-square test. My program is:

from scipy.stats import chisquare

c = chisquare([10,4,7,5],ddof=[0,1,2,3])

print(c)

The result is:

Power_divergenceResult(statistic=3.2307692307692308, pvalue=array([ 0.35739509,  0.19881419,  0.07226674,         nan]))

When I try to get the p-value from a table of chi-squared values (for example https://www.medcalc.org/manual/chi-square-table.php), the results are different. In this example, Python gives a p-value of 0.35739509 for degrees of freedom = 1 (ddof=0), but the table gives 0.01. Could you please explain why the results are different?

1 Answer

2
votes

The function chisquare performs a chi-squared hypothesis test on observed frequencies, whereas the table lists critical values of the chi-squared distribution itself.
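To make the difference concrete, here is a minimal sketch of what chisquare appears to compute for the data in the question, assuming the default of equal expected frequencies. The variable names are illustrative. Note that the degrees of freedom are k - 1 - ddof, where k is the number of categories, so ddof=0 with four categories means 3 degrees of freedom, not 1.

```python
import numpy as np
from scipy.stats import chi2, chisquare

observed = np.array([10, 4, 7, 5])

# With no f_exp given, chisquare assumes all categories are equally likely.
expected = np.full(observed.shape, observed.sum() / observed.size)

# Pearson statistic: sum((O - E)^2 / E)
statistic = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: k - 1 - ddof, with k the number of categories.
ddof = 0
df = observed.size - 1 - ddof  # 3 for four categories

# p-value is the upper tail of the chi-squared distribution at the statistic.
p_value = chi2.sf(statistic, df)
print(statistic, df, p_value)  # about 3.2308, 3, about 0.3574

# Compare with the library result.
result = chisquare(observed, ddof=ddof)
print(result.statistic, result.pvalue)
```

So the statistic of about 3.23 belongs in the df = 3 row of the table, where it lies below the 0.10 critical value of 6.251, which is consistent with a p-value of about 0.357.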

If you want to work with the distribution you need to use scipy.stats.chi2. In particular, to replicate values from the table:

import scipy as sp
import scipy.stats  # ensures the stats submodule is loaded on older SciPy versions

p = 0.1
df = 5

x = sp.stats.chi2.ppf(1-p, df=df)
print(x)  # 9.23635689978

And to get the p-value for a given x and degrees of freedom:

p = 1 - sp.stats.chi2.cdf([10,4,7,5], df=[0,1,2,3])
print(p)  
# [        nan  0.04550026  0.03019738  0.17179714]

Note that the table defines p as the integral of the probability density function from x to infinity. The cumulative distribution function (cdf) in scipy is the integral from 0 to x. Therefore, p = 1 - cdf.
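As a small sketch of that relationship: scipy.stats.chi2 also provides sf (the survival function, equal to 1 - cdf) and isf (its inverse), which compute the upper tail directly. The values of x and df below are taken from the table example above.

```python
from scipy.stats import chi2

x, df = 9.23635689978, 5

# Upper-tail probability: integral of the pdf from x to infinity.
upper_tail = chi2.sf(x, df)
via_cdf = 1 - chi2.cdf(x, df)
print(upper_tail, via_cdf)  # both close to 0.1

# isf inverts sf, so it maps the table's p back to the critical value.
critical = chi2.isf(0.1, df)
print(critical)  # close to 9.23635689978
```

For very small p-values, sf is usually the safer choice, since 1 - cdf can lose precision when cdf is close to 1.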