The p-value returned by scipy.stats.wilcoxon is not computed from the distribution of x or y, nor from the distribution of the differences between them. It is determined by the Wilcoxon test statistic (W as in http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test, or T as in scipy), which is assumed to follow a normal distribution under the null hypothesis. If you check the source (in ~python_directory\site-packages\scipy\stats\morestats.py), you will find the last few lines of def wilcoxon():
se = sqrt(se / 24)
z = (T - mn) / se
prob = 2. * distributions.norm.sf(abs(z))
return T, prob
and, a few lines earlier:
mn = count*(count + 1.) * 0.25
se = count*(count + 1.) * (2. * count + 1.)
Where count is the number of non-zero differences between x and y.
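So, to get a one-sided p-value, you just need prob/2 when the observed differences point in the direction of your alternative hypothesis, or 1 - prob/2 when they point the other way.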
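The choice between the two expressions can be sketched as a small helper. This is only an illustration: one_sided_p and its arguments are hypothetical names, not part of scipy, and it assumes you already know the sums of the positive and negative ranks. The numbers in the usage lines are the ones from the example in this answer (T = 18 from scipy, V = 27 from R).

```python
def one_sided_p(prob, r_plus, r_minus, alternative):
    """Convert a two-sided normal-approximation p-value to a one-sided one.

    prob        : two-sided p-value (second value returned by wilcoxon)
    r_plus      : sum of the ranks of the positive differences (x - y > 0)
    r_minus     : sum of the ranks of the negative differences
    alternative : 'greater' (x tends to exceed y) or 'less'
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    # The data agree with 'greater' when the positive ranks dominate,
    # and with 'less' when the negative ranks dominate.
    if alternative == "greater":
        agrees = r_plus >= r_minus
    else:
        agrees = r_minus >= r_plus
    return prob / 2.0 if agrees else 1.0 - prob / 2.0


# Hypothetical usage with the example values from this answer.
p_greater = one_sided_p(0.5936305914425295, 27, 18, "greater")
p_less = one_sided_p(0.5936305914425295, 27, 18, "less")
print(round(p_greater, 4), round(p_less, 4))  # 0.2968 0.7032
```

Note that scipy's T is the smaller of the two rank sums, so T alone does not tell you the direction; you need to know which sign it belongs to.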
Examples:
In Python:
>>> import scipy.stats as ss
>>> y1=[125,115,130,140,140,115,140,125,140,135]
>>> y2=[110,122,125,120,140,124,123,137,135,145]
>>> ss.wilcoxon(y1, y2)
(18.0, 0.5936305914425295)
In R:
> wilcox.test(y1, y2, paired=TRUE, exact=FALSE, correct=FALSE)
Wilcoxon signed rank test
data: y1 and y2
V = 27, p-value = 0.5936
alternative hypothesis: true location shift is not equal to 0
> wilcox.test(y1, y2, paired=TRUE, exact=FALSE, correct=FALSE, alt='greater')
Wilcoxon signed rank test
data: y1 and y2
V = 27, p-value = 0.2968
alternative hypothesis: true location shift is greater than 0
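To see where these numbers come from, the normal approximation can be reproduced by hand using only the standard library. This is a rough sketch rather than scipy's or R's actual code: signed_rank_normal_approx is a hypothetical name, and the tie correction applied to se is an assumption on my part (it is not shown in the lines quoted from scipy above, but it is needed to match 0.5936 on this data, which contains tied differences). No continuity correction is applied, matching correct=FALSE in the R call.

```python
import math


def signed_rank_normal_approx(x, y):
    """Wilcoxon signed-rank test via the normal approximation (a sketch)."""
    # Drop zero differences; n plays the role of `count` in the scipy code.
    d = [a - b for a, b in zip(x, y) if a != b]
    n = len(d)

    # Rank |d|, giving tied values the average of the positions they occupy.
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1                 # size of this tie group
        tie_term += t * (t * t - 1)   # zero when t == 1
        i = j + 1

    r_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    r_minus = sum(r for r, v in zip(ranks, d) if v < 0)

    mn = n * (n + 1.0) * 0.25
    se = n * (n + 1.0) * (2.0 * n + 1.0)
    se -= 0.5 * tie_term              # assumed tie correction
    se = math.sqrt(se / 24.0)

    def norm_sf(v):
        # Survival function of the standard normal distribution.
        return 0.5 * math.erfc(v / math.sqrt(2.0))

    z = (r_plus - mn) / se            # signed, based on the positive ranks
    return {
        "T": min(r_plus, r_minus),    # statistic reported by scipy
        "V": r_plus,                  # statistic reported by R
        "two_sided": 2.0 * norm_sf(abs(z)),
        "greater": norm_sf(z),
        "less": norm_sf(-z),
    }


y1 = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
y2 = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]
res = signed_rank_normal_approx(y1, y2)
print(res["T"], res["V"])             # 18.0 27.0
print(round(res["two_sided"], 4))     # 0.5936
print(round(res["greater"], 4))       # 0.2968
```

The two statistics differ only in convention: T is the smaller rank sum and V is the positive rank sum, and together they add up to count*(count + 1)/2 = 45 here.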
alternative to 'greater' or 'less'. Therefore, you do not have to compute it yourself. – So S