Fisher's exact test is often used for over-representation analysis of gene lists in a pathway. Consider the following example of a contingency table:
                       in pathway
                       Y      N
    in gene list  Y   90    110 |  200
                  N   10    790 |  800
                  ----------------------
                     100    900 | 1000
There are essentially two ways to do a Fisher-test-based over-representation analysis in R. The first is to use fisher.test (which takes the contingency matrix as input):
fisher.test(matrix(c(90,10,110,790), nrow = 2), alternative = 'greater')$p.value
[1] 1.486473e-59
The second is to use phyper (Meng's notes give an excellent explanation of how to use phyper, including why the "-1" is needed and what q, m, n, and k exactly mean):
phyper(q=90-1, m=100, n=900, k=200, lower.tail = FALSE)
[1] 1.486473e-59
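To make the correspondence between the two calls explicit, here is a minimal sketch that derives the phyper arguments from the contingency table above and compares the two p-values. The variable names (`tab`, `p_fisher`, `p_phyper`, `rel_diff`) are my own, and the comparison uses a relative difference because both values are tiny.

```r
# Contingency table from above: rows = in gene list (Y, N), columns = in pathway (Y, N)
tab <- matrix(c(90, 10, 110, 790), nrow = 2,
              dimnames = list(gene_list = c("Y", "N"), pathway = c("Y", "N")))

# One-sided Fisher exact test for enrichment
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Same quantity via the hypergeometric distribution
q <- tab["Y", "Y"]      # genes in both the list and the pathway (90)
m <- sum(tab[, "Y"])    # genes in the pathway (100)
n <- sum(tab[, "N"])    # genes not in the pathway (900)
k <- sum(tab["Y", ])    # size of the gene list (200)

# P(X >= q) equals P(X > q - 1), hence the "-1"
p_phyper <- phyper(q - 1, m, n, k, lower.tail = FALSE)

# Relative difference between the two p-values
rel_diff <- abs(p_fisher - p_phyper) / p_phyper
print(c(fisher = p_fisher, phyper = p_phyper))
print(rel_diff < 1e-8)
```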
My question: why does this differ from:
1 - phyper(q=90-1, m=100, n=900, k=200, lower.tail = TRUE)
[1] 0
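A likely explanation, offered as a sketch rather than a definitive answer: the lower tail here is 1 minus roughly 1.5e-59, which cannot be distinguished from 1 in double precision, so the subtraction returns exactly 0. The snippet below checks this directly; the variable names are my own.

```r
lower <- phyper(q = 90 - 1, m = 100, n = 900, k = 200, lower.tail = TRUE)
upper <- phyper(q = 90 - 1, m = 100, n = 900, k = 200, lower.tail = FALSE)

print(.Machine$double.eps)  # gap between 1 and the next larger double
print(lower == 1)           # the lower tail has been rounded to exactly 1
print(1 - lower)            # so the subtraction gives 0
print(upper)                # computed directly, the tiny value survives

# On the log scale the upper tail stays representable even for far smaller values
log_upper <- phyper(q = 90 - 1, m = 100, n = 900, k = 200,
                    lower.tail = FALSE, log.p = TRUE)
print(log_upper)
```

If this is indeed the cause, lower.tail = FALSE (optionally with log.p = TRUE) is the safer way to get small upper-tail p-values.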