I am using this PHP routine to calculate the Pearson correlation:
    function correlation($x, $y) {
        $length = count($x);
        $mean1 = array_sum($x) / $length;
        $mean2 = array_sum($y) / $length;
        $a = $b = 0;
        $a2 = $b2 = 0;
        $axb = 0;
        for ($i = 0; $i < $length; $i++) {
            $a = $x[$i] - $mean1;
            $b = $y[$i] - $mean2;
            $axb += $a * $b;
            $a2 += pow($a, 2);
            $b2 += pow($b, 2);
        }
        if ($sqrt = sqrt($a2 * $b2))
            return $axb / $sqrt;
        return 0;
    }
When I test it under several conditions, it returns 0 on exact matches:
    echo correlation([0,0,0,0,0],[0,0,0,0,0]); // Returns 0!!
    echo correlation([0,0,0,0,0],[1,1,1,1,1]); // Returns 0!!
    echo correlation([1,1,1,1,1],[1,1,1,1,1]); // Returns 0!!
    echo correlation([0,0,0,0,0],[9,9,9,9,9]); // Returns 0!!
    echo correlation([0,0,0,0,0],[0,1,2,3,4]); // Returns 0 OK
    echo correlation([9,9,9,9,9],[0,1,2,3,4]); // Returns 0 OK
    echo correlation([0,1,2,3,4],[0,1,2,3,4]); // Returns 1 OK
Why does this happen, and how can I get the expected result for identical inputs? Thank you!
For info:
A Pearson correlation is a number between -1 and 1 that indicates the extent to which two variables are linearly related. The Pearson correlation is also known as the “product moment correlation coefficient” (PMCC) or simply “correlation”.
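For reference, here is the standard sample formula that the routine above appears to implement, written with my own symbols (x̄ and ȳ are the means of the two series). It may help show why a constant series is a special case: every deviation from the mean is zero, so both the numerator and the denominator vanish.

```latex
\[
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}}
\]
% If every x_i equals the same constant c, then \bar{x} = c and x_i - \bar{x} = 0 for all i,
% so the expression becomes 0/0 and r is undefined rather than 0 or 1.
```

In the code, this is the case where sqrt($a2*$b2) evaluates to 0, the if test fails, and the function falls through to return 0.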
... return 0; with return 1; but I doubt it's mathematically correct. I think a better result would be null. - Álvaro González
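The suggestion in the comment above could be sketched roughly like this. It is only an illustration, not a definitive fix: the name correlationOrNull is hypothetical, the nullable return type assumes PHP 7.1 or later, and treating empty or mismatched inputs as undefined is my own assumption rather than something stated in the question.

```php
<?php
// Hypothetical variant of correlation(): returns null when r is undefined.
function correlationOrNull(array $x, array $y): ?float
{
    $length = count($x);
    // Assumption: empty or mismatched inputs are also treated as undefined.
    if ($length === 0 || $length !== count($y)) {
        return null;
    }

    $meanX = array_sum($x) / $length;
    $meanY = array_sum($y) / $length;

    // Accumulate the cross product and the squared deviations.
    $sumXY = $sumXX = $sumYY = 0.0;
    for ($i = 0; $i < $length; $i++) {
        $dx = $x[$i] - $meanX;
        $dy = $y[$i] - $meanY;
        $sumXY += $dx * $dy;
        $sumXX += $dx * $dx;
        $sumYY += $dy * $dy;
    }

    // A zero denominator means at least one series has no variance,
    // so the coefficient is undefined; signal that with null instead of 0.
    $denominator = sqrt($sumXX * $sumYY);
    if ($denominator == 0.0) {
        return null;
    }
    return $sumXY / $denominator;
}

var_dump(correlationOrNull([1, 1, 1, 1, 1], [1, 1, 1, 1, 1])); // NULL
var_dump(correlationOrNull([0, 1, 2, 3, 4], [0, 1, 2, 3, 4])); // float(1)
```

Callers then have to decide what null means for them, for example by checking with === null before printing. One caveat: with non-integer data, rounding in the mean can leave a tiny non-zero variance for a constant series, so an exact comparison with zero may not catch every such case.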