0
votes

I am posing an interesting and useful question that needs to be carried out in MATLAB. It is about efficiency of programming by avoiding using Loops"

Assume a matrix URm whose columns are products and rows are people. The matrix entries are rating of people to these products, and this matrix is sparse as each person normally rates only few products.

 URm  [n_u, n_i]

Another matrix of interest is F, which contains attribute for each of the products and the attribute is of fixed length:

 F  [n_f,n_i]

We divide the URm into two sub-matrices randomly: URmTrain and URmTest where the former is used for training the system and the latter for test. These two matrices have similar rows (users) but they could have different number of columns (products).

We can find the similarity between items very fast using pdist() or Matrix transpose:

 S = F * F' ; 

For each row (user) in URmTest:

URmTestp = zeros(size(URmTest));
u = 1 ;   %% Example user 1

for i = 1 : size(URmTest,2)
indTrain = find(URmTrain(u,:)) ;   % For each user, search for items  in URmTrain that have been rated by the the user (i.e. the have a rating greater than zero)
  for j = 1 : length(indTrain)
    URmTestp(u,i) = URmTestp(u,i) + S(i,indTrain(j))*URmTrain(u,indTrain(j))
  end
end

where URmp is the predicted version of URm and we can compute an error on how good our prediction has been.

Example

Lets's make a simple example. Let's assume the items user 1 has rated items 3 , 5 and 17:

indTrain = [3 5 17]

For each item j in URmTest, I want to predict the rating using the following formula:

URmTestp(u,j) = S(j,3)*URmTrain(u,3) + S(j,5)*URmTrain(u,5) + S(j,17)*URmTrain(u,17)

Once completed this process needs to be repeated for all users.

As URm is typically very big, I prefer options which use least amount of 'loops'. We may be able to take advantage of bsxfun but I am not sure if we can.

Please suggest me ides that can help on accelerating this process as rapid as possible. Thank you

1
I suggest that you add a working loop-based example, that would explain it much more clearly what you wish to achieve. - Andras Deak
Andras, thanks for your comment. If we can answer this question more than 50% of the problem is solved. Just consider the example I gave, where indTrain is variable length but indTest is of length 1. Can we write the prediction formula avoiding loop? - Yas
I might have been a bit passive-agressive with my previous comment. Here's the direct version: I have no idea what you're exactly trying to do. I'd gladly vectorize it for you, but for that you need to specify your problem in clear terms. As far as programming questions go, code is as clear as it gets. - Andras Deak
Getting better:) If s_ij is in fact dependent on i and j, then put it in the inner loop as s_ij = F(:,indTest(i)) .* F(:,indTrain(j)). - Andras Deak
Also, I suspect that every URmp(u,i) should in fact be URmp(u,indTest(i)). - Andras Deak

1 Answers

1
votes

I'm still not sure I completely understand your problem. But it seems to me that if you pre-compute s_ij as

s_ij = F.' * F   %'// [ni x ni] matrix

then what you're after is simply

URmTestp(u,indTest) = URmTrain(u,indTrain) * s_ij(indTrain,indTest);
% or
%URmTestp(u,:) = URmTrain(u,indTrain) * s_ij(indTrain,:);

or if you only compute a smaller s_ij block only for the necessary arrays,

s_ij = F(:,indTrain).' * F(:,indTest);

then

URmTestp(u,indTest) = URmTrain(u,indTrain) * s_ij;

Alternatively, you can always compute the necessary subblock of s_ij on the fly:

URmTestp(u,indTest) = URmTrainp(u,indTrain) * F(:,indTrain).'*F(:,indTest);

If I understand correctly that indTest and indTrain are functions of u, such as

URmTestp = zeros(n_u,n_i);  %// pre-allocate here!
for u=1:n_u
    indTest = testCell{u};
    indTrain = trainCell{u};
    URmTestp(u,indTest) = URmTrainp(u,indTrain) * F(:,indTrain).'*F(:,indTest); %'
    ...
end

then probably not much can be vectorized on this loop, unless there's a very tricky indexing scheme that allows you to use linear indices. I'd stick with this setup.