Suppose I have two dense matrices U (10000x50) and V(50x10000), and one sparse matrix A(10000x10000). Each element in A is either 1 or 0. I hope to find A*(UV), noting that '*' is element-wise multiplication. To solve the problem, Scipy/numpy will calculate a dense matrix UV first. But UV is dense and large (10000x10000) so it's very slow.
Because I only need a few elements of UV indicated by A, it should save a lot of time if only necessary elements are calculated instead of calculating all elements then filtering using A. Is there a way to instruct scipy to do this?
BTW, I used Matlab to solve this problem and Matlab is smart enough to find what I'm trying to do and works efficiently.
Update: I found Matlab calculated UV fully as scipy does. My scipy installation is simply too slow...
einsum
notation, you wantnp.einsum('ij,ik,kj->ij',A,U,V)
? (not that I"m suggesting using einsum). – hpauljA.*U*V
? – hpauljA*(U.dot(V))
with the sizes above, and a dense A array takes 1.04 second to compute. Are you sure you have Numpy compiled with a fast BLAS library? I cant see a way around explicitly calculating all the elements, even though A is sparse. On the other hand, how do you know that Matlab is not doing that too? – rthA
to generateU1
andV1
such thatU1.dot(V1)
is your answer. It might be easiest is each row and column ofA
has at most one1
, but there should be a way even if there are multiple values. – hpauljtimeit(@() A.*(U*V))
andtimeit(@() U*V)
run about equally long. So the sparsity ofA
is only taken into account afterU*V
is calculated. – user2379410