I am trying to calculate the correlation matrix of a set of histogram vectors, but the result is a truncated version of what I (think I) want. I have 200 histograms with 32 bins each. The result of

correlation_matrix = corrcoef(set_of_histograms) 

is a 32 by 32 matrix.

I want to use this to measure how well my original histograms match each other (later using eigs and other stuff).

But which correlation function is right for this? I have tried corrcoef, but there are corr and cov as well. I can't work out their differences from reading the MATLAB help.

2 Answers

2
votes
correlation_matrix = corrcoef(set_of_histograms')

(Note the ', which transposes the matrix so that each histogram becomes a column.)

1
votes

1) corrcoef treats every column as a variable and every row as an observation, and calculates the correlation between each pair of columns. I'm assuming your histogram matrix is 200x32, so each histogram is a row and corrcoef ends up comparing the 32 bins instead. If you transpose the matrix before calling corrcoef, each histogram becomes a column and you should get the 200x200 result you're looking for:

[rho, p] = corrcoef( set_of_histograms' );

(' transposes the matrix)
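
To make the effect of the transpose concrete, here is a minimal sketch. The random matrix is only stand-in data (an assumption), shaped like the 200x32 matrix described in the question:

```matlab
% Stand-in data (assumption): 200 histograms (rows) with 32 bins (columns).
set_of_histograms = rand(200, 32);

% Columns are the variables, so this compares bins with bins.
bin_corr = corrcoef(set_of_histograms);      % 32-by-32

% After transposing, each histogram is a column, so this compares histograms.
hist_corr = corrcoef(set_of_histograms');    % 200-by-200

disp(size(bin_corr));
disp(size(hist_corr));
```

The second matrix has one row and one column per histogram, which is the shape you would then pass on to eigs.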

2) cov returns the covariance matrix, not the correlation matrix. Correlation is the covariance normalized by the standard deviations of the two variables, so cov is a building block of the correlation rather than the measure you're looking for.
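
As a rough illustration of how the two are related, the sketch below rebuilds the correlation matrix from cov by dividing each entry by the product of the two standard deviations. The data and variable names are assumptions made up for the example:

```matlab
% Stand-in data (assumption): 32 observations (bins) of 200 variables (histograms),
% i.e. the already-transposed layout.
X = rand(32, 200);

C = cov(X);               % 200-by-200 covariance matrix
s = sqrt(diag(C));        % standard deviation of each variable (column vector)

% Normalize: R(i,j) = C(i,j) / (s(i) * s(j))
R_from_cov = C ./ (s * s');

% Compare against the built-in correlation matrix.
R = corrcoef(X);
max_diff = max(abs(R_from_cov(:) - R(:)));
disp(max_diff);           % expected to be at floating-point round-off level
```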

3) corr and corrcoef differ in a few details: corrcoef is part of base MATLAB, while corr ships with the Statistics Toolbox. As long as you are only interested in Pearson's correlation, they are equivalent for your purposes. corr can also compute Spearman's or Kendall's rank correlations, which corrcoef cannot.
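
A small sketch of that comparison, assuming the Statistics Toolbox is installed (corr is not available without it) and using made-up stand-in data:

```matlab
% Assumes the Statistics Toolbox is available, since corr lives there.
X = rand(32, 5);             % stand-in data: 32 observations of 5 variables

R_corrcoef = corrcoef(X);    % Pearson correlation, base MATLAB
R_corr     = corr(X);        % Pearson is the default type for corr

% The two Pearson results should agree up to round-off.
max_diff = max(abs(R_corrcoef(:) - R_corr(:)));
disp(max_diff);

% Rank-based alternative that corrcoef does not offer.
R_spearman = corr(X, 'Type', 'Spearman');
```

Both functions treat columns as variables, so the same transpose applies to the histogram matrix whichever one you pick.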