I'm benchmarking a plain product of two large dense matrices using the Eigen library. The Eigen product is noticeably slower than the same product in both Matlab and Python (numpy) for matrices of the same size.
Is there anything to be done to make the Eigen operation faster?
Problem Details
X : random 1000 x 50000 matrix
Y : random 50000 x 300 matrix
Timing experiments (on my late 2011 Macbook Pro)
Using Matlab: X*Y takes ~1.3 sec
Using Enthought Python: numpy.dot(X, Y) takes ~2.2 sec
Using Eigen: X*Y takes ~2.7 sec
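As a rough sanity check, the timings above can be converted into sustained throughput. The sketch below assumes the usual operation count for a dense product, about 2*n*k*m floating-point operations for an (n x k) times (k x m) product. The helper names `product_flops`, `gflops`, and `print_throughput` are made up for illustration.

```cpp
#include <cstdio>

// Approximate floating-point operations in an (n x k) * (k x m) dense product:
// each of the n*m output entries needs about k multiplies and k adds.
double product_flops(double n, double k, double m) {
    return 2.0 * n * k * m;
}

// Sustained throughput in GFLOP/s for a product that took `seconds`.
double gflops(double n, double k, double m, double seconds) {
    return product_flops(n, k, m) / seconds / 1e9;
}

// Prints the throughput implied by the three timings reported above.
void print_throughput() {
    const double n = 1000, k = 50000, m = 300;
    std::printf("Matlab: %.1f GFLOP/s\n", gflops(n, k, m, 1.3));
    std::printf("numpy:  %.1f GFLOP/s\n", gflops(n, k, m, 2.2));
    std::printf("Eigen:  %.1f GFLOP/s\n", gflops(n, k, m, 2.7));
}
```

Calling `print_throughput()` from a `main` gives roughly 23, 14, and 11 GFLOP/s for the three timings, so under this operation count the gap between Matlab and Eigen is about a factor of two.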
Eigen Details
You can get my Eigen code (as a MEX function): https://gist.github.com/michaelchughes/4742878
This MEX function reads in two matrices from Matlab, and returns their product.
Running this MEX function without the matrix product (i.e. doing only the IO) adds negligible overhead, so the IO between the function and Matlab doesn't explain the performance gap. The time is clearly spent in the matrix product itself.
I'm compiling with g++, with these optimization flags: "-O3 -DNDEBUG"
I'm using the latest stable Eigen header files (3.1.2).
Any suggestions on how to improve Eigen's performance? Can anybody replicate the gap I'm seeing?
UPDATE: The compiler seems to matter. The original Eigen timing was done with the version of g++ that ships with Apple's Xcode: llvm-g++-4.2.
When I use g++-4.7 downloaded via MacPorts (same CXXOPTIMFLAGS), I get 2.4 sec instead of 2.7.
Any other suggestions for compiler choice or flags would be much appreciated.
You can also get raw C++ code for this experiment: https://gist.github.com/michaelchughes/4747789
./MatProdEigen 1000 50000 300
reports 2.4 seconds under g++-4.7