
I have a feature matrix of size ~1M x 3, where the columns are doc#, wordID#, and wordcount.

What's a fast way in MATLAB to rearrange this feature matrix so that it instead has size #docs x #unique words, i.e.

length(unique(featurematrix(:,1))) x length(unique(featurematrix(:,2)))

so that each row represents an entire document, each column represents a different word, and the values are the word counts from the third column of the original matrix?

I started writing a bunch of loops, but I had the feeling there's probably a short, idiomatic way to do this that is already built into MATLAB.


1 Answer


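You can use accumarray to accomplish this. Pass the first two columns as the (row, column) subscripts and the third column as the values to accumulate: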

data = [1, 1, 1;
        1, 2, 2;
        1, 5, 3;
        2, 1, 4;
        2, 3, 5];

result = accumarray(data(:,1:2), data(:,3))
%   1     2     0     0     3
%   4     0     5     0     0

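Alternatively, you could use sparse, which takes the row indices, column indices, and values as separate arguments. With ~1M entries, most of the result will be zeros, so you may want to drop the call to full and keep the matrix sparse: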

result = full(sparse(data(:,1), data(:,2), data(:,3)))
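
Note that both calls above size the result by the largest ID, not by the number of unique IDs, so an unused word ID (4 in the sample data) produces an all-zero column. If the IDs have gaps and you want exactly #unique docs x #unique words as asked in the question, one possible sketch is to remap the IDs to consecutive indices with the third output of unique first. The variable names below (docIDs, wordIDs, docIdx, wordIdx, counts) are made up for illustration:

```matlab
% Same sample data as above: columns are doc#, wordID#, wordcount
data = [1, 1, 1;
        1, 2, 2;
        1, 5, 3;
        2, 1, 4;
        2, 3, 5];

% The third output of unique maps each raw ID to its position in the
% sorted list of unique IDs, giving consecutive indices 1..K
[docIDs,  ~, docIdx]  = unique(data(:,1));
[wordIDs, ~, wordIdx] = unique(data(:,2));

% Build a (#unique docs) x (#unique words) sparse matrix of word counts;
% repeated (doc, word) pairs are summed
counts = sparse(docIdx, wordIdx, data(:,3), numel(docIDs), numel(wordIDs));

full(counts)
%   1     2     0     3
%   4     0     5     0
```

Column j of counts then corresponds to the original word ID wordIDs(j), and row i to docIDs(i), so the mapping back to the raw IDs is not lost.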