I have a simple example dataset below:

a = 

 1 1 1 NaN NaN
 1 1 1 NaN NaN
 1 1 1 1 NaN
 1 1 1 1 1
 1 1 1 1 1 

I want to work out the average cumulative value per row. However, cumsum gives the following output:

cumsum(a)

1 1 1 NaN NaN
2 2 2 NaN NaN
3 3 3 1 NaN
4 4 4 2 1
5 5 5 3 2

Then calculating the row mean of this cumulative matrix gives:

nanmean(a,2)

1
2
2.5
3
4

I want to account for the fact that different columns start later, i.e. the row mean values for rows (3:5) are lower than their true values because of the low numbers in columns (4:5).

I want to achieve this by replacing the last NaN above the first numeric element in each column of the matrix (a) with the mean of the other columns in that row of the cumulative matrix. This would need to be done iteratively to reflect the changing values in the cumulative matrix. So the new matrix would first look as follows:

(a)

 1 1 1 NaN NaN
 1 1 1 *2* NaN
 1 1 1 1 NaN
 1 1 1 1 1
 1 1 1 1 1 

which would lead to:

cumsum(a)

1 1 1 NaN NaN
2 2 2 2 NaN
3 3 3 3 NaN
4 4 4 4 1
5 5 5 5 2   

and then iteratively, (a) would equal:

(a)

 1 1 1 NaN NaN
 1 1 1 2 NaN
 1 1 1 1 *3*
 1 1 1 1 1
 1 1 1 1 1     

which would lead to:

cumsum(a)

1 1 1 NaN NaN
2 2 2 2 NaN
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5   

which would give the desired row mean values:

nanmean(a,2)

1
2
3
4
5

1 Answer

There may be a way to further vectorise this. However, I think that because each row depends on the previous values, you have to update the matrix row-by-row as follows:

% Cycle through each row in the matrix.
% size(a,1) is the number of rows; length(a) would return the largest
% dimension, which is wrong when there are more columns than rows.
for i = 1:size(a,1)

    if i > 1

        % Add the previous (already filled) row to the current row.
        % This gives the same result as cumsum, one row at a time.
        a(i,:) = a(i,:) + a(i-1,:);

    end

    % Replace all NaN values in the row with the average of the non-NaN values
    a(i,isnan(a(i,:))) = mean(a(i,~isnan(a(i,:))));

end

This reproduces the desired row means from your example. It does not replicate your iterative steps one by one: it fills every NaN in a row at once, so it needs far fewer steps, only 5 (the number of rows) for the entire operation.
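As a quick check, here is a minimal sketch that runs the row-by-row fill on the matrix from the question. The names c, gaps and rowMeans are only illustrative, and the loop uses size(c,1) for the row count on the assumption that the matrix may not be square.

```matlab
% Example matrix from the question (NaN marks columns that start later)
a = [1 1 1 NaN NaN;
     1 1 1 NaN NaN;
     1 1 1 1   NaN;
     1 1 1 1   1;
     1 1 1 1   1];

c = a;                                % c will hold the filled cumulative matrix
for i = 1:size(c, 1)                  % size(c,1) is the number of rows
    if i > 1
        c(i,:) = c(i,:) + c(i-1,:);   % running column sum
    end
    gaps = isnan(c(i,:));             % columns that have not started yet
    c(i,gaps) = mean(c(i,~gaps));     % fill them with the mean of the others
end

rowMeans = mean(c, 2);                % no NaNs remain, so plain mean is enough
disp(rowMeans')                       % 1 2 3 4 5
```

Note that every NaN in a row is filled, including those in rows 1 and 2, so c differs from the intermediate matrices in the question even though the row means match.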

Edit: equivalently for your example, you can fill the NaNs in each row first and then apply cumsum once:

for i = 1:size(a,1)

    % Replace all NaN values in the row with the average of the non-NaN values
    a(i,isnan(a(i,:))) = mean(a(i,~isnan(a(i,:))));

end

a = cumsum(a);
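To see whether the two variants really agree, the sketch below runs both on the question's matrix and on a second, made-up matrix with unequal values (that second matrix is purely an assumption for illustration). In both, NaNs appear only at the top of a column, as in the question.

```matlab
% Test matrices: the question's example, plus a made-up one with unequal values
tests = {[1 1 1 NaN NaN; 1 1 1 NaN NaN; 1 1 1 1 NaN; 1 1 1 1 1; 1 1 1 1 1], ...
         [2 4 NaN; 1 3 NaN; 5 1 6]};

maxDiff = zeros(1, numel(tests));
for t = 1:numel(tests)
    a = tests{t};

    % Variant 1: accumulate row by row, filling NaNs as we go
    c1 = a;
    for i = 1:size(c1, 1)
        if i > 1
            c1(i,:) = c1(i,:) + c1(i-1,:);
        end
        c1(i, isnan(c1(i,:))) = mean(c1(i, ~isnan(c1(i,:))));
    end

    % Variant 2: fill NaNs in the raw rows, then take cumsum
    c2 = a;
    for i = 1:size(c2, 1)
        c2(i, isnan(c2(i,:))) = mean(c2(i, ~isnan(c2(i,:))));
    end
    c2 = cumsum(c2);

    % Largest element-wise difference between the two results
    maxDiff(t) = max(abs(c1(:) - c2(:)));
end
disp(maxDiff)   % 0 0
```

The agreement relies on the NaNs being leading ones. If a NaN sits below a number in the same column, e.g. [1 3; 1 NaN], the first variant gives [1 3; 2 2] and the second gives [1 3; 2 4], so it is worth checking which behaviour you want for such data.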