Getting p-value=1 on a goodness-of-fit chi-squared test

Question

I'm trying to do a goodness-of-fit test against a Poisson distribution on a series of observations using R. I'm counting how many people did a certain thing per minute, over 57 minutes. I never got any observations greater than 13, and I got the following data (for the cases 0 to 13 people, plus a final bin for 14 or more):

observed = c(3/57, 4/57, 9/57, 7/57, 9/57, 8/57, 2/57, 3/57, 7/57, 2/57, 1/57, 0, 1/57, 1/57, 0)

meaning that 3 times I observed 0 people, 4 times 1 person, 9 times 2 people, and so on (the last 0 means I never saw 14 or more people).

mn = 4.578947 
cases = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
estimated = c()
for (i in cases)(estimated <- c(estimated, dpois(i, lambda = mn)))
estimated <- c(estimated, (1-ppois(13, lambda=mn)))

where mn is the mean obtained from the data. Finally, I run

 chisq.test(observed, p=estimated)

and i get:

 Chi-squared test for given probabilities

data:  observed
X-squared = 1.0182, df = 14, p-value = 1

Warning message:
In chisq.test(observed, p = estimated) :
  Chi-squared approximation may be incorrect

I'm not well-versed in this area (neither statistics nor programming in R), but my understanding is that I'm not supposed to get a p-value of exactly 1.0. What am I doing wrong? (By the way: my code is most likely not optimal for what I'm trying to do, but I barely use R and it's not the focus of my work right now.)

In addition to using count data for the observed frequencies, you need expected frequencies >= 5 for each bin/category of occurrence. I explain how to achieve that in my answer below. — Mankind_008
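
The pooling that this comment describes could look roughly like the sketch below. It is not taken from the referenced answer. It assumes the counts from the question, a hand-picked grouping of bins ("0 to 2", "3", "4", "5", "6", "7 or more"), and illustrative variable names (`counts`, `probs`, `group`, `df_adj`) that do not appear in the original post. The last two lines also assume that one extra degree of freedom should be subtracted because the Poisson mean was estimated from the same data, which chisq.test does not do on its own.

```r
# Observed counts for 0, 1, ..., 13 and "14 or more" people per minute (from the question)
counts <- c(3, 4, 9, 7, 9, 8, 2, 3, 7, 2, 1, 0, 1, 1, 0)
n <- sum(counts)                       # 57 minutes
# Sample mean; the "14 or more" bin is empty, so scoring it as 14 changes nothing
lambda_hat <- sum(0:14 * counts) / n

# Poisson probabilities for the same 15 bins (the last one is the upper tail)
probs <- c(dpois(0:13, lambda_hat), ppois(13, lambda_hat, lower.tail = FALSE))

# Pool bins by hand: 0-2, 3, 4, 5, 6, 7 or more (an assumption, other groupings are possible)
group <- cut(0:14, breaks = c(-Inf, 2, 3, 4, 5, 6, Inf))
pooled_counts <- as.vector(tapply(counts, group, sum))
pooled_probs  <- as.vector(tapply(probs, group, sum))

expected <- n * pooled_probs           # expected counts per pooled bin
res <- chisq.test(pooled_counts, p = pooled_probs)

# chisq.test() uses df = bins - 1; subtract one more for the estimated mean
df_adj <- length(pooled_counts) - 1 - 1
p_adj  <- pchisq(unname(res$statistic), df = df_adj, lower.tail = FALSE)
```

Printing `expected` shows whether the chosen grouping actually reaches the threshold of 5 in every bin; if it does not, neighbouring bins can be merged further.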

Scransom · Accepted Answer · 2018-06-18T04:41:52

Your observed values should be counts, not proportions:

> chisq.test(observed*57, p=estimated)

    Chi-squared test for given probabilities

data:  observed * 57
X-squared = 58.036, df = 14, p-value = 2.585e-07

Per the R help file for chisq.test:

If x is a matrix with one row or column, or if x is a vector and y is not given, then a goodness-of-fit test is performed (x is treated as a one-dimensional contingency table). The entries of x must be non-negative integers.

(Emphasis mine, on the last sentence.)
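
Since chisq.test does not stop when it is handed proportions, one option is a small guard around it. The sketch below is only an illustration: `gof_test` is a hypothetical helper, not part of base R, and the tolerance of 1e-8 is an arbitrary choice meant to let through values such as `observed*57` that are whole numbers up to floating-point error.

```r
# Hypothetical helper (not part of base R): refuse input that is not whole-number counts
gof_test <- function(x, p) {
  is_count <- all(is.finite(x)) && all(x >= 0) && all(abs(x - round(x)) < 1e-8)
  if (!is_count) {
    stop("x must contain non-negative integer counts, not proportions")
  }
  chisq.test(round(x), p = p)
}

# Example data from the chisq.test manual page
x <- c(89, 37, 30, 28, 2)
p <- c(0.40, 0.20, 0.20, 0.19, 0.01)

# Counts: the test runs as usual (warning about small expected counts suppressed)
ok <- suppressWarnings(gof_test(x, p))

# Proportions: the guard raises an error, captured here as a message string
bad <- tryCatch(gof_test(x / sum(x), p), error = function(e) conditionMessage(e))
```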

You can test this with some of the example code in the manual.

How it should be done:

> x <- c(89,37,30,28,2)
> p <- c(0.40,0.20,0.20,0.19,0.01)
> chisq.test(x, p = p)

    Chi-squared test for given probabilities

data:  x
X-squared = 5.7947, df = 4, p-value = 0.215

Warning message:
In chisq.test(x, p = p) : Chi-squared approximation may be incorrect

And making the same mistake as in your code (passing proportions instead of counts):

> chisq.test(x/186, p = p)

    Chi-squared test for given probabilities

data:  x/186
X-squared = 0.031154, df = 4, p-value = 0.9999

Warning message:
In chisq.test(x/186, p = p) : Chi-squared approximation may be incorrect
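
Why the statistic collapses can be checked directly. When x holds proportions, chisq.test treats their sum (1) as the sample size, so the statistic comes out smaller by a factor of n while the degrees of freedom stay the same. A minimal sketch using the manual's example; the names `stat_counts`, `stat_props`, `by_hand` and `ratio` are just for illustration:

```r
x <- c(89, 37, 30, 28, 2)
p <- c(0.40, 0.20, 0.20, 0.19, 0.01)
n <- sum(x)   # 186

# Statistic from counts and from proportions (warnings suppressed for brevity)
stat_counts <- unname(suppressWarnings(chisq.test(x, p = p))$statistic)
stat_props  <- unname(suppressWarnings(chisq.test(x / n, p = p))$statistic)

# The same statistic by hand: sum((observed - expected)^2 / expected)
by_hand <- sum((x - n * p)^2 / (n * p))

# The two statistics differ by exactly the sample size
ratio <- stat_counts / stat_props
```

The same relation holds for the numbers in the question: 1.0182 times 57 is about 58.04, the statistic obtained from the counts.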
