3 votes

I need to perform a matrix/vector multiplication in Matlab at very large sizes: "A" is a 655360-by-5 real-valued matrix that is not necessarily sparse, and "B" is a 655360-by-1 real-valued vector. My question is how to compute B'*A efficiently.

I have noticed a slight time improvement by computing A'*B instead, which gives a column vector. But it is still quite slow (I need to perform this operation several times in the program).

With a little searching I found an interesting Matlab toolbox, MTIMESX by James Tursa, which I hoped would improve the performance of the above matrix multiplication. After several trials, I could only get very marginal gains over Matlab's native matrix multiplication.

Any suggestions about how I should rewrite A'*B so that the operation is more efficient? Thanks.
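For reference, the two forms in the question are mathematically equivalent, so whichever is faster on a given machine can be used throughout. A minimal sketch (the 655360-by-5 sizes are taken from the question; the random data is just for illustration):

```matlab
% Sketch: B'*A and (A'*B)' produce the same 1-by-5 result,
% so the faster column-first form can be substituted freely.
A = rand(655360, 5);
B = rand(655360, 1);
r1 = B' * A;        % 1-by-5 row vector
r2 = (A' * B)';     % same values, computed via the column form
assert(norm(r1 - r2) <= 1e-9 * norm(r1));  % equal up to rounding
```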

5
I think for matrix operations, Matlab's performance is already close to the best you can get, since matrix ops are already optimized and parallelized. - jpjacobs
As many here have mentioned, Matlab should have no problem handling such a matrix multiplication. However, your question suggests there is something very wrong with your code or your system: multiplying vectors of this size on my i7 machine takes around 0.003 seconds. Even if we assume older machines are 300 times slower, the computation should take under a second. There shouldn't be a memory issue either, since matrix "A" requires only 26 MB of memory. - Yanir Kleiman

5 Answers

10 votes

Matlab's raison d'être is doing matrix computations, so I would be fairly surprised if you could significantly outperform its built-in matrix multiplication with hand-crafted tools. First of all, you should make sure your multiplication can actually be performed significantly faster. You could do this by implementing a similar multiplication in C++ with Eigen.
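Before reaching for external tools, it is worth timing the built-in multiplication to see whether it is really the bottleneck. A minimal sketch using `timeit` (available in newer Matlab releases; `tic`/`toc` works otherwise), with the sizes from the question:

```matlab
% Time the built-in multiplication; timeit runs the function several
% times and returns a robust median measurement in seconds.
A = rand(655360, 5);
B = rand(655360, 1);
t = timeit(@() A' * B);
fprintf('A''*B takes %.6f s\n', t);
```

If this already runs in milliseconds, the cost is elsewhere in the program and rewriting the multiplication will not help.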

3 votes

I have had good results with Matlab matrix multiplication using the GPU.
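A minimal sketch of the GPU approach, assuming the Parallel Computing Toolbox and a supported GPU are available (the key is to move `A` and `B` to the device once and reuse them across the repeated multiplications, since host-to-device transfers dominate otherwise):

```matlab
% Move the operands to the GPU once, then reuse them.
A = rand(655360, 5);
B = rand(655360, 1);
Ag = gpuArray(A);
Bg = gpuArray(B);

x = Ag' * Bg;        % multiplication runs on the GPU
result = gather(x);  % copy the 5-by-1 result back to host memory
```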

1 vote

In order to avoid the transpose operation, you could try:

sum(bsxfun(@times, A, B), 1)  % sum down the columns: 1-by-5, equal to B'*A

But I would be astonished if it were faster than the direct version. See @thiton's answer.

Also look at http://www.mathworks.co.uk/company/newsletters/news_notes/june07/patterns.html to see why the column-vector-based version is faster than the row-vector-based one.

1 vote

Matlab is built on fairly optimized libraries (BLAS, etc.), so you can't easily improve upon it from within Matlab. Where you can improve is by using a better BLAS, such as one optimized for your processor - this enables better use of the caches by fetching appropriately sized blocks of data from main memory. Take a look into creating your own compiled versions of ATLAS, ACML, MKL, and Goto BLAS.
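Before swapping libraries, you can check which BLAS and LAPACK builds your Matlab is currently linked against. A minimal sketch (the `'-blas'`/`'-lapack'` options exist in recent releases; older versions of Matlab could also be pointed at an alternative library via the `BLAS_VERSION` environment variable at startup):

```matlab
% Report the BLAS and LAPACK versions Matlab is linked against.
version('-blas')
version('-lapack')
```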

I wouldn't try to optimize this one particular multiplication unless it's really killing you. Swapping in a better BLAS is likely to lead to a happier solution, especially if your current one isn't making use of multicore processors.

0 votes

Your #1 option, if this is your bottleneck, is to re-examine your algorithm. See the question "Optimizing MATLAB code" for a great example of how choosing a different algorithm reduced runtime by three orders of magnitude.