7
votes

I just performed a factorial ANOVA, followed by the TukeyHSD post-test. Some of my adjusted P values from the TukeyHSD output are 0.0000000. Can these P values really be zero? Or is this a rounding situation, and my true P value might be something like 1e-17, that is rounded to 0.0000000.

Are there any options for the TukeyHSD() function in R that will give output P-values that contain exponents?

Here is a snippet of my output:

TukeyHSD(fit)

  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = lum ~ cells * treatment)

$`cells:treatment`
                    diff         lwr          upr     p adj
NULL:a-KR:a     -266.5833333 -337.887800 -195.2788663 0.0000000
WT:a-KR:a       -196.3333333 -267.637800 -125.0288663 0.0000000
KR:ar-KR:a        83.4166667   12.112200  154.7211337 0.0053485
NULL:ar-KR:a    -283.5000000 -354.804467 -212.1955330 0.0000000
WT:ar-KR:a      -196.7500000 -268.054467 -125.4455330 0.0000000
KR:e-KR:a       -219.0833333 -290.387800 -147.7788663 0.0000000
NULL:e-KR:a     -185.0833333 -256.387800 -113.7788663 0.0000000
WT:e-KR:a        -96.1666667 -167.471134  -24.8621996 0.0003216
1
In your particular case it's impossible for a p-value to be 0. But the p-value can be so small that the computer can't tell it apart from 0. Or it might be that the print method chooses to report the tiny p-value as 0 instead of doing something like '<.000001'. It is theoretically possible for p-values to be 0 in certain situations but this is not one of those cases.Dason
Use options(digits=22) to show all decimal places. But be aware that numbers smaller than 2e-16 cannot be reliably distinguished from each other or from 0.bdemarest
a reproducible example ( tinyurl.com/reproducible-000 ), or the results of dput(fit), would sure help ...Ben Bolker
Thanks everyone for your help!! @BenBolker provided a nice reproducible example in his answer below that clearly shows what's going on (thus I won't provide one).Todd

1 Answers

10
votes

EDIT: see warning below about resolution of Tukey p-values!!

dd <- data.frame(y=c(1:10,1001:1010),f=rep(c("A","B"),each=10))
fit <- aov(y~f,data=dd)

The printed p-value is zero:

(tt <- TukeyHSD(fit))
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = y ~ f, data = dd)
## 
## $f
##     diff      lwr      upr p adj
## B-A 1000 997.1553 1002.845     0

But looking at the (abbreviated) output of str() shows there's more information there:

str(tt)

## List of 1
##  $ f: num [1, 1:4] 1.00e+03 9.97e+02 1.00e+03 2.62e-14
##   ..- attr(*, "dimnames")=List of 2
## 

You can extract the value yourself:

tt$f[,"p adj"]
## [1] 2.620126e-14

Or as noted in the comments, print(tt,digits=15) will work ...

WARNING

I decided to dig a little deeper and noticed in digging through the code of TukeyHSD.aov() that it relies on ptukey(), which in its "Examples" section warns that "the precision may not be more than about 8 digits". In particular, once the t-statistic is above about 30, the p-value maxes out (mins out?) at 2.62e-14 ...

zval <- 10^seq(1,6,length=100)
pval <- ptukey(zval,2,18,lower.
par(las=1,bty="l")
plot(zval,pval,log="xy",type="l")

enter image description here

The bottom line is that you can't distinguish among p-values this small at all. You may need to rethink your strategy ...