I'm trying to run a goodness-of-fit test against a Poisson distribution on a series of observations using R. I'm counting how many people did a certain thing per minute, over 57 minutes. I never got any observations greater than 13, and I got the following data (for the cases 0 to 13 people, plus a final bin for 14 or more):
observed = c(3/57, 4/57, 9/57, 7/57, 9/57, 8/57, 2/57, 3/57, 7/57, 2/57, 1/57, 0, 1/57, 1/57, 0)
meaning that 3 times I observed 0 people, 4 times 1 person, 9 times 2 people and so on (the last 0 means I never saw 14 or more people).
mn = 4.578947
cases = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13)
estimated = c()
for (i in cases)(estimated <- c(estimated, dpois(i, lambda = mn)))
estimated <- c(estimated, (1-ppois(13, lambda=mn)))
where mn is the mean obtained from the data.
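For reference, here is a minimal sketch of how the mean and the probability vector could be reproduced from the raw tallies. The name `counts` is introduced here for illustration (it holds the numerators of `observed`); everything else reuses the names above.

```r
# Raw tallies for 0, 1, ..., 13 people (`counts` is a name introduced here,
# not part of the original code)
counts <- c(3, 4, 9, 7, 9, 8, 2, 3, 7, 2, 1, 0, 1, 1)
cases  <- 0:13

# Sample mean: total people seen divided by total minutes observed
mn <- sum(cases * counts) / sum(counts)

# dpois() is vectorized over its first argument, so no loop is needed;
# the last entry is the tail probability P(X >= 14)
estimated <- c(dpois(cases, lambda = mn), 1 - ppois(13, lambda = mn))

print(mn)              # should match the 4.578947 used above
print(sum(estimated))  # the 15 probabilities should add up to 1
```

Building the vector in one call avoids growing `estimated` inside a loop, and checking that the probabilities sum to 1 is a cheap guard before handing them to a test.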
Finally, I run
chisq.test(observed, p=estimated)
and I get:
Chi-squared test for given probabilities
data: observed
X-squared = 1.0182, df = 14, p-value = 1
Warning message:
In chisq.test(observed, p = estimated) :
Chi-squared approximation may be incorrect
I'm not well-versed in this area (neither statistics nor programming in R), but I have the idea that I'm not supposed to get a p-value of exactly 1.0. What am I doing wrong? (By the way: my code is most likely not optimal for what I'm trying to do, but I barely use R and it's not the focus of my work right now.)
expected frequencies >= 5 for each bin/category of occurrence. I explain how to achieve that in my answer below. - Mankind_008
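One possible reading of the comment above, sketched under stated assumptions rather than taken from the referenced answer: pass raw counts instead of proportions (`chisq.test()` treats its first argument as counts), and pool sparse categories until every expected count reaches 5. The grouping into "0-2" and "7+" and the names `counts`, `group`, `obs_binned`, `p_binned`, `df_adj` and `p_adj` are illustrative choices, not part of the original post.

```r
# Raw tallies for 0, 1, ..., 13 and "14 or more" people (counts, not proportions)
counts <- c(3, 4, 9, 7, 9, 8, 2, 3, 7, 2, 1, 0, 1, 1, 0)
mn <- 4.578947

# Poisson probabilities for the same 15 categories
p <- c(dpois(0:13, lambda = mn), ppois(13, lambda = mn, lower.tail = FALSE))

# Hypothetical grouping: pool 0-2 and 7+ so each expected count is at least 5
group <- cut(0:14, breaks = c(-Inf, 2, 3, 4, 5, 6, Inf),
             labels = c("0-2", "3", "4", "5", "6", "7+"))
obs_binned <- sapply(split(counts, group), sum)
p_binned   <- sapply(split(p, group), sum)

# Expected counts per pooled bin; all should be >= 5
expected <- sum(counts) * p_binned
print(round(expected, 2))

res <- chisq.test(obs_binned, p = p_binned)

# Assumption: because lambda was estimated from the same data, a common
# adjustment is to use (bins - 1 - 1) degrees of freedom rather than the
# (bins - 1) that chisq.test() reports.
df_adj <- length(obs_binned) - 2
p_adj  <- pchisq(res$statistic, df = df_adj, lower.tail = FALSE)
print(res$statistic)
print(p_adj)
```

With proportions as input the total "sample size" seen by the test is 1, so the statistic is shrunk by a factor of 57 relative to the count-based version, which is consistent with the p-value of 1 reported above. Other poolings are possible; the only constraint illustrated here is the expected-count threshold.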