I have two arrays on which I would like to run Pearson's chi-square test (goodness of fit). I want to test whether there is a significant difference between the expected and observed results.
observed = [11294, 11830, 10820, 12875]
expected = [10749, 10940, 10271, 11937]
I want to compare 11294 with 10749, 11830 with 10940, 10820 with 10271, etc.
Here's what I have:
>>> from scipy.stats import chisquare
>>> chisquare(f_obs=[11294, 11830, 10820, 12875], f_exp=[10749, 10940, 10271, 11937])
(203.08897607453906, 9.0718379533890424e-44)
Here 203 is the chi-square test statistic and 9.07e-44 is the p-value. I'm confused by the result: since the p-value of 9.07e-44 is below 0.05, we reject the null hypothesis and conclude that there is a significant difference between the observed and expected results. That doesn't seem right to me, because the numbers look so close. How do I fix this?
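To check that the statistic itself is computed as expected, here is a minimal sketch that recomputes it by hand as the sum of (observed - expected)^2 / expected over the four categories. It uses plain Python only; the names `terms` and `stat` are introduced just for this illustration.

```python
observed = [11294, 11830, 10820, 12875]
expected = [10749, 10940, 10271, 11937]

# Contribution of each category to the statistic: (O - E)^2 / E
terms = [(o - e) ** 2 / e for o, e in zip(observed, expected)]

# Pearson's chi-square statistic is the sum of those contributions
stat = sum(terms)
print(stat)  # approximately 203.089, matching the value reported above

# Totals of each array, for comparison
print(sum(observed), sum(expected))  # 46819 43897
```

The hand calculation agrees with the value returned by `chisquare`, so the statistic is not a computation error. It also shows that the two arrays do not sum to the same total (46819 versus 43897), which may be relevant to how the test should be set up.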
chi2_contingency instead. – user707650