Fancy Binning Operation - How to vectorize a relative intra-bin-wise operation?

Question

I've decided to get a little wild this evening and party with histogram bins to operate on some financial data I'm analyzing.

It appears the party has been pooped on, though: how to apply my 'intra-bin' operation is not readily apparent, either from research or from playing around, and it is proving bothersome.

The Desire: I would like to use the 'binning' index in one column to perform a row-wise 'intra-bin' operation, where the operation makes a relative reference to the first element of its own bin. Please consider the following single-bin example, where the operation is to take a difference:

A=

1   10.4
1   10.6
1   10.3
1   10.2

The relative operation takes the difference between each element of column 2 and the first element of column 2, such that

bin_differencing_function(A)=

1   10.4    0.0
1   10.6    0.2
1   10.3    -0.1
1   10.2    -0.2
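For a single bin, the difference against the first element can be written as one vectorized expression. Here is a minimal sketch, assuming A is the 4-by-2 matrix implied by the output above (the variable names rel and result are illustrative, not from the original post):

```matlab
% Single-bin input implied by the output above: bin index, then value.
A = [1 10.4
     1 10.6
     1 10.3
     1 10.2];

% Subtract the bin's first value from every value in column 2.
rel = A(:,2) - A(1,2);

% Append the differences as a third column.
result = [A, rel];
```

Because the values are floating point, the differences may be off from the printed ones by rounding error on the order of 1e-16.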

Still more convenient would be to feed bin_differencing_function(A) a two-column matrix with an arbitrary number of bins, such that if

A=

1   10.4
1   10.6
1   10.3
1   10.2
2   10.2
2   10.6
2   10.8
2   10.8
3   11.0
3   10.8
3   10.8
3   10.8

better_bin_differencing_function(A)=

1   10.4    0.0
1   10.6    0.2
1   10.3    -0.1
1   10.2    -0.2
2   10.2    0.0
2   10.6    0.4
2   10.8    0.6
2   10.8    0.6
3   11.0    0.0
3   10.8    -0.2
3   10.8    -0.2
3   10.8    -0.2

Most convenient would be to feed better_bin_differencing_function(A) a two-column matrix with an arbitrary number of bins, where the bin length may vary, such that if

A=

1   10.4
1   10.6
1   10.3
2   10.2
2   10.6
2   10.8
2   10.8
2   10.7
3   11.0
3   10.8

best_bin_differencing_function(A)=

1   10.4    0.0
1   10.6    0.2
1   10.3    -0.1
2   10.2    0.0
2   10.6    0.4
2   10.8    0.6
2   10.8    0.6
2   10.7    0.5
3   11.0    0.0
3   10.8    -0.2

The big desire is to create a piece of code that takes advantage of vectorization (if possible) to operate on many bins whose lengths will vary between 1 and 200. I'm thinking a play on accumarray may do the trick, such that

accumarray(A(:,1),A(:,2),[],@(x) fun(x))

where fun(x) is a function containing a for loop.

I'm running MATLAB 7.10.0.499 (R2010a) on Windows 7. Sorry the examples made this query so long.

my gut feeling is that a for loop combined with bsxfun will be a fast solution here... — bla

zelusp · Accepted Answer · 2014-09-01T04:37:38

Alright Stack Overflow, I figured it out! Turns out I was right about using accumarray.

Matrices B, C, and A are defined inside the function only for validation convenience. In real use, matrix A would be passed as an argument: best_bin_differencing_function(A)

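function differenced_bins=best_bin_differencing_function()
% B holds the bin index of each row; C holds the values to difference.
B=[1 1 1 2 2 2 2 2 3 3]';
C=[10.4 10.6 10.3 10.2 10.6 10.8 10.8 10.7 11.0 10.8]';
A=[B,C];
% Group column 2 by bin, apply fun to each bin, and wrap each result in a
% cell so bins of different lengths can be returned; cell2mat then stacks
% the per-bin results into a single column.
differenced_bins=cell2mat(accumarray(A(:,1),A(:,2),[],@(x) {fun(x)}));
end

function y=fun(var)
    % Difference between each element of a bin and that bin's first element.
    y=zeros(length(var),1);
    for i=1:length(var)
        y(i)=var(i)-var(1);
    end
end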

I'm going to run a stress test between this and @Divakar's response and will up-vote accordingly. Thank you all for taking a look!
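The per-bin loop is not strictly needed. The sketch below is a loop-free alternative, not the approach used in the answer above: it looks up the first row of each bin with unique and subtracts that row's value from every row in the same bin. It assumes that "first element" means first in row order, and the variable names (first_row, bin_of_row, first_val, result) are illustrative. The accumarray documentation notes that within-bin order may not be preserved when the subscripts are unsorted; this variant does not rely on the rows being sorted by bin.

```matlab
% Example input from the question: column 1 = bin index, column 2 = value.
A = [1 10.4; 1 10.6; 1 10.3; ...
     2 10.2; 2 10.6; 2 10.8; 2 10.8; 2 10.7; ...
     3 11.0; 3 10.8];

% first_row(k) is the row of the first occurrence of the k-th distinct bin;
% bin_of_row maps every row of A to its position in the list of distinct bins.
[~, first_row, bin_of_row] = unique(A(:,1), 'first');

% Value at the start of each row's own bin, expanded to one entry per row.
first_val = A(first_row(bin_of_row), 2);

% Difference relative to the first element of each bin.
differenced_bins = A(:,2) - first_val;
result = [A, differenced_bins];
```

The 'first' flag is passed explicitly because older releases such as R2010a return the last occurrence by default.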
