Here is a test template you can use; it includes mikkola's answer:
vec = rand(1,400);
A = rand(3600,400);
n_expts = 1000;
format long
disp(version);
% Warm up
for ii = 1:n_expts
vec0 = repmat(vec, 1, 9);
res1 = repmat(vec0.', 1, 400) - A;
res3 = bsxfun(@minus, vec0.', A);
end
tic
for ii = 1:n_expts
vec0 = repmat(vec, 1, 9);
res1 = repmat(vec0.', 1, 400) - A;
end
fprintf('Time taken with 2 repmats: ')
disp(toc/n_expts)
tic
for ii = 1:n_expts
res2 = repmat(vec.', 9, 400) - A;
end
fprintf('Time taken with 1 repmat and transpose: ')
disp(toc/n_expts)
tic
% Reuses vec0 (vec tiled 9 times) left over from the first timed loop
for ii = 1:n_expts
res3 = bsxfun(@minus, vec0.', A);
end
fprintf('Time taken with bsxfun: ')
disp(toc/n_expts)
% Check that all the results are the same
dres1 = max(max(abs(res1 - res2)));
dres2 = max(max(abs(res1 - res3)));
tol = eps;
if (dres1 > tol) || (dres2 > tol)
fprintf('Difference in output matrices\n');
end
With results:
8.3.0.532 (R2014a)
Time taken with 2 repmats: 0.004027661867427
Time taken with 1 repmat and transpose: 0.004034170491803
Time taken with bsxfun: 0.003970521454027