I have a dataset stored in a similar manner to the following example:

clear all
Year = cell(1,4);
Year{1} = {'Y2007','Y2008','Y2009','Y2010','Y2011'};
Year{2} = {'Y2005','Y2006','Y2007','Y2008','Y2009'};
Year{3} = {'Y2009','Y2010','Y2011'};
Year{4} = {'Y2007','Y2008','Y2009','Y2010','Y2011'};

data = cell(1,4);
data{1} = {rand(26,1),rand(26,1),rand(26,1),rand(26,1),rand(26,1)};
data{2} = {rand(26,1),rand(26,1),rand(26,1),rand(26,1),rand(26,1)};
data{3} = {rand(26,1),rand(26,1),rand(26,1)};
data{4} = {rand(26,1),rand(26,1),rand(26,1),rand(26,1),rand(26,1)};

Each cell in 'Year' gives the years in which the measurements in the matching cell of 'data' were collected. For example, the first cell in Year ('Year{1}') contains the year in which each measurement in 'data{1}' was collected, so data{1}{1} was collected in 'Y2007', data{1}{2} in 'Y2008', and so on.

I am now trying to find the correlation between each measurement and the corresponding (same-year) measurements from the other locations. For example, for the year 'Y2007' I would like to find the correlation between data{1}{1} and data{2}{3}, then data{1}{1} and data{4}{1}, and then data{2}{3} and data{4}{1}, and so on for the remaining years.

I know that the corrcoef command should be used to calculate the correlation, but I cannot seem to get to the stage where this is possible. Any advice would be much appreciated.

1 Answer

I assume each year appears at most once per cell. Here is the code I ended up with (see the comments for explanations):

yu = unique([Year{:}]); %# cell array of unique years across all cells
cc = cell(size(yu)); %# one correlation matrix per year
for y = 1:numel(yu)
    %# position of the y-th year within each cell (empty if absent)
    yuidx = cellfun(@(x) find(ismember(x,yu{y})), Year, 'UniformOutput',false);
    %# which cells contain the y-th year
    yidx = find(~cellfun(@isempty, yuidx));
    if numel(yidx) <= 1
        continue %# need at least two locations to correlate
    end
    %# indices of the y-th year within each of those cells
    yidx2 = cell2mat(yuidx(yidx));
    %# fill matrix to calculate correlation, one column per location
    ydata = zeros(26,numel(yidx));
    for k = 1:numel(yidx)
        ydata(:,k) = data{yidx(k)}{yidx2(k)};
    end
    %# calculate correlation coefficients (corr needs the Statistics Toolbox;
    %# corrcoef(ydata) returns the same matrix in base MATLAB)
    cc{y} = corr(ydata);
end

yu holds the list of all years, and cc contains the correlation matrix for each year (it stays empty for years recorded at fewer than two locations). If you want, you can also keep yidx by making it a cell array and changing the code accordingly.
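As a rough sketch of that last suggestion, the variant below stores the contributing locations for each year alongside the correlation matrix, so that a given entry of cc can be traced back to a pair of locations. The name locs is made up for this illustration, and corrcoef is swapped in for corr on the assumption that only base MATLAB is available. It still assumes each year appears at most once per cell.

```matlab
% Example data in the same layout as the question (values are random).
Year = {{'Y2007','Y2008','Y2009','Y2010','Y2011'}, ...
        {'Y2005','Y2006','Y2007','Y2008','Y2009'}, ...
        {'Y2009','Y2010','Y2011'}, ...
        {'Y2007','Y2008','Y2009','Y2010','Y2011'}};
data = cell(size(Year));
for i = 1:numel(Year)
    data{i} = arrayfun(@(j) rand(26,1), 1:numel(Year{i}), ...
                       'UniformOutput', false);
end

yu   = unique([Year{:}]);  % distinct year labels across all locations
cc   = cell(size(yu));     % correlation matrix per year
locs = cell(size(yu));     % hypothetical name: locations present in each year

for y = 1:numel(yu)
    cols = [];             % one column per location that has this year
    loc  = [];             % indices of those locations
    for i = 1:numel(Year)
        j = find(strcmp(Year{i}, yu{y}), 1);  % position of the year at location i
        if ~isempty(j)
            cols = [cols, data{i}{j}];
            loc(end+1) = i;
        end
    end
    locs{y} = loc;
    if numel(loc) > 1      % a correlation needs at least two locations
        cc{y} = corrcoef(cols);
    end
end

% Look up one pair: in 'Y2007' the locations are 1, 2 and 4, so
% entry (1,2) correlates data{1}{1} with data{2}{3}.
y = find(strcmp(yu, 'Y2007'));
r = cc{y}(1,2);
```

Row and column k of cc{y} belong to location locs{y}(k), so cc{y}(1,3) for 'Y2007' would be the correlation between data{1}{1} and data{4}{1}.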