I am looking for an easy way to compute the column-wise average of a subset of the values in a matrix, where the subset is selected by a logical matrix, preferably without using a loop. The problem is that the number of selected values differs from column to column, so MATLAB collapses the indexed result into a single column vector, and its column-wise mean becomes the mean of all selected values rather than one mean per column. Is there a specific function or an easy workaround for this problem? See below for an example.

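    %% define value matrix and logical indexing matrix

    values  = [1 2 3 4; 5 6 7 8; 9 10 11 12];
    indices = logical([1 0 0 1; 0 1 0 1; 1 1 0 0]); % convert to logical

    %% calculate column-wise average

    % values(indices) is a 6x1 column vector, so this returns a single
    % scalar (6.3333) instead of the desired 1x4 row of column means
    mean(values(indices), 1)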
Why without loops? I'm sure that for array sizes where the implementation matters (i.e. very large arrays that take a non-trivial time to process), the loop approach will be the fastest. Loops are no longer slow in MATLAB, and they haven't been in at least 15 years. It's about time people dropped this "avoid loops at all costs" mentality. (comment by Cris Luengo)
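For reference, a plain loop over the columns might look like the sketch below. It reuses the question's example data; the names `result` and `selected` are only illustrative, and no timing claim is made here.

```matlab
values  = [1 2 3 4; 5 6 7 8; 9 10 11 12];
indices = logical([1 0 0 1; 0 1 0 1; 1 1 0 0]);

result = NaN(1, size(values, 2));          % preallocate; NaN marks empty columns
for k = 1:size(values, 2)
    selected = values(indices(:, k), k);   % selected entries of column k
    if ~isempty(selected)
        result(k) = mean(selected);        % mean of the selected entries only
    end
end
% result is [5 8 NaN 6]
```

Column 3 has no selected entries, so it keeps the preallocated NaN.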

1 Answer

accumarray-based approach

Use the column index as a grouping variable for accumarray:

    [~, col] = find(indices);
    result = accumarray(col(:), values(indices), [size(values,2) 1], @mean, NaN).';

Note that:

  • The second line uses (:) to force the first input to be a column vector. This is needed because the first line may produce col as a row or column vector, depending on the size of indices.

  • NaN is used as the fill value, so columns with no selected entries give NaN. This is specified as the fifth input to accumarray.

  • The third input to accumarray defines the output size. It is necessary to specify it explicitly (as opposed to letting accumarray figure it out) in case the last columns of indices contain only false; otherwise the result would be shorter than the number of columns.
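To make the grouping concrete, here is a small walk-through on the question's example data. The intermediate variable `vals` is introduced only for illustration; the values shown in the comments follow from MATLAB's column-major ordering.

```matlab
values  = [1 2 3 4; 5 6 7 8; 9 10 11 12];
indices = logical([1 0 0 1; 0 1 0 1; 1 1 0 0]);

[~, col] = find(indices);      % column of each true entry: [1;1;2;2;4;4]
vals     = values(indices);    % selected values, same order: [1;9;6;10;4;8]

% Group vals by column index, average each group, fill empty groups with NaN
result = accumarray(col(:), vals, [size(values,2) 1], @mean, NaN).';
% result is [5 8 NaN 6]
```

Both find and logical indexing traverse the matrix in the same column-major order, which is why col and vals line up entry by entry.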

Hacky approach

Multiply and divide element-wise by indices. Selected entries are unchanged (multiplied and divided by 1), while unwanted entries become 0/0, which is NaN, and can then be ignored by mean using the 'omitnan' option:

    result = mean(values.*indices./indices, 1, 'omitnan');
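As an illustration on the question's example data, the intermediate matrix can be inspected before taking the mean. The name `masked` is only an assumption for this sketch.

```matlab
values  = [1 2 3 4; 5 6 7 8; 9 10 11 12];
indices = logical([1 0 0 1; 0 1 0 1; 1 1 0 0]);

% Unselected entries become 0/0 = NaN; selected entries are unchanged
masked = values.*indices./indices;
% masked is [1 NaN NaN 4; NaN 6 NaN 8; 9 10 NaN NaN]

result = mean(masked, 1, 'omitnan');
% result is [5 8 NaN 6]
```

A column that is entirely NaN, such as column 3 here, still yields NaN with 'omitnan'.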

Manual approach

Multiply element-wise by indices, sum each column, and divide element-wise by the sum of each column of indices:

    result = sum(values.*indices, 1) ./ sum(indices, 1);
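One caveat worth checking: because NaN*0 is NaN, a NaN (or Inf) at an unselected position still contaminates its column sum. The sketch below shows this on the question's example data, along with one possible workaround that zeroes unselected entries by assignment. The names `contaminated`, `masked` and `robust` are illustrative and not part of the original answer.

```matlab
values  = [1 2 3 4; 5 6 7 8; 9 10 11 12];
indices = logical([1 0 0 1; 0 1 0 1; 1 1 0 0]);

result = sum(values.*indices, 1) ./ sum(indices, 1);
% result is [5 8 NaN 6]; the empty column gives 0/0 = NaN

% An unselected NaN still propagates through the multiplication
values(2,1) = NaN;                         % position (2,1) is not selected
contaminated = sum(values.*indices, 1) ./ sum(indices, 1);
% contaminated is [NaN 8 NaN 6]

% Possible workaround: zero the unselected entries by assignment instead
masked = values;
masked(~indices) = 0;
robust = sum(masked, 1) ./ sum(indices, 1);
% robust is [5 8 NaN 6]
```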