I'm working to implement the following equation:
X = (Y.T * Y + Y.T * C * Y)^-1
Y is an (n x f) matrix and C is an (n x n) diagonal matrix; n is about 300k and f will vary between 100 and 200. As part of an optimization process this equation will be evaluated almost 100 million times, so it has to be computed really fast.
Y is initialized randomly, and C is a very sparse matrix: only a few of the 300k entries on its diagonal are different from 0. Since NumPy's diagonal functions create dense matrices, I created C as a sparse CSR matrix. But when trying to compute the first part of the equation:
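For reference, this is roughly how I'm building the matrices (the number of nonzeros on the diagonal is just a placeholder, and names like `nonzero_idx` are only illustrative):

    import numpy as np
    from scipy import sparse

    n, f = 300_000, 150            # real sizes; f varies between 100 and 200
    Y = np.random.rand(n, f)       # dense, randomly initialized

    # C: diagonal with only a handful of nonzero entries
    diag = np.zeros(n)
    nonzero_idx = np.random.choice(n, size=10, replace=False)
    diag[nonzero_idx] = np.random.rand(10)
    C = sparse.diags(diag).tocsr() # sparse diagonal matrix stored as CSR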
r = dot(C, Y)
the computer crashes due to memory limits. I then decided to try converting Y to a csr_matrix and perform the same operation:
r = dot(C, Ysparse)
and this approach took 1.38 ms. But this solution is somewhat "tricky", since I'm using a sparse matrix to store a dense one, and I wonder how efficient it really is.
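Put together, the workaround I'm currently timing looks roughly like this:

    from scipy.sparse import csr_matrix

    Ysparse = csr_matrix(Y)   # wrap the fully dense Y in a sparse container
    r = C.dot(Ysparse)        # sparse * sparse product; fast, but feels like a hack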
So my question is: is there a way to multiply the sparse C and the dense Y without having to turn Y into a sparse matrix, while still getting good performance? If C could somehow be represented as a dense diagonal without consuming tons of memory, that might be very efficient, but I don't know if this is possible.
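What I vaguely have in mind is something like the following, keeping only the diagonal values in a plain 1-D array and scaling the rows of Y directly via broadcasting, but I haven't verified whether this is actually faster or appropriate for my case:

    d = C.diagonal()             # 1-D array holding the n diagonal entries of C
    r = d[:, np.newaxis] * Y     # scales row i of Y by d[i], equivalent to C @ Y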
I appreciate your help!