I have access to a number of matrix libraries, but for this project I am using Eigen because of its compile-time fixed-size matrices and its built-in SVD.
Now, I am doing the following operation:
Eigen::Matrix<double,M,N> A; // populated in the code
Eigen::Matrix<double,N,N> B = A.transpose() * A;
As I understand it, this makes a copy of A to form the transpose, which is then multiplied by A. The operation is performed on relatively small matrices (M = 20 to 30, N = 3), but many millions of times per second, so it must be as fast as possible.
I read that using the following is faster:
B.noalias() = A.transpose() * A;
I could write my own subroutine that takes A as input and fills B, but I was wondering whether there is an efficient existing implementation that uses the fewest cycles.
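For comparison, a hand-written subroutine of the kind mentioned above could look roughly like the following. This is only a sketch under stated assumptions: it uses plain row-major std::array storage instead of Eigen types, and the name gram_matrix is made up for illustration (it is not an Eigen function). It relies on the fact that A^T * A is symmetric, so only the upper triangle is accumulated and the lower triangle is filled by mirroring. Whether this actually beats Eigen's product for M = 20 to 30 and N = 3 would have to be measured.

```cpp
#include <array>
#include <cstddef>

// Hypothetical helper (not part of Eigen): computes B = A^T * A for an
// M x N matrix A stored row-major as nested std::array.
template <std::size_t M, std::size_t N>
std::array<std::array<double, N>, N>
gram_matrix(const std::array<std::array<double, N>, M>& A) {
    std::array<std::array<double, N>, N> B{};  // value-initialized to zero

    // Accumulate the contribution of each row of A.
    // Only entries with j >= i are computed, since B is symmetric.
    for (std::size_t k = 0; k < M; ++k) {
        for (std::size_t i = 0; i < N; ++i) {
            for (std::size_t j = i; j < N; ++j) {
                B[i][j] += A[k][i] * A[k][j];
            }
        }
    }

    // Mirror the upper triangle into the lower triangle.
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < i; ++j) {
            B[i][j] = B[j][i];
        }
    }
    return B;
}
```

Because M and N are template parameters, the loop bounds are known at compile time, which gives the compiler the chance to unroll the small inner loops, much as fixed-size Eigen matrices do.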