I am multiplying two matrices via A.dot(B), where:
A: 1 x n matrix, dtype float
B: n x n matrix, dtype bool
I am performing this calculation for large n, and run out of memory very quickly (around n=14000 fails). A and B are dense.
It appears the reason is that numpy converts B to dtype float before performing the matrix multiplication, hence incurring a huge memory cost. In fact, %timeit suggests it spends more time converting B to float than performing the multiplication.
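To put rough numbers on that spike (assuming NumPy promotes B to float64, its default float dtype): a bool array uses 1 byte per element, so the temporary float copy is 8x the size of B itself:

```python
import numpy as np

n = 14000
bool_bytes = n * n * np.dtype(bool).itemsize         # B as bool: ~0.2 GB
float_bytes = n * n * np.dtype(np.float64).itemsize  # float copy: ~1.6 GB
```

So at n=14000 the conversion alone needs roughly an extra 1.6 GB on top of B, and the cost grows quadratically with n.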
Is there a way around this? The emphasis is on reducing the memory spike from the float conversion, while still allowing common matrix functionality (matrix addition / multiplication).
Here's reproducible data for benchmarking solutions:
import numpy as np

np.random.seed(999)
n = 30000
A = np.random.random(n)
B = np.where(np.random.random((n, n)) > 0.5, True, False)
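One possible workaround (a sketch, not a definitive answer; `dot_bool_chunked` and the `chunk` parameter are names I've made up): convert B to float in row blocks, so the temporary float buffer is only chunk x n instead of n x n. Since A.dot(B) is a sum over rows of B weighted by the entries of A, the blocks can be accumulated independently:

```python
import numpy as np

def dot_bool_chunked(A, B, chunk=1000):
    """Compute A.dot(B) for a float vector A (shape (n,)) and a bool
    matrix B (n x n), converting only `chunk` rows of B to float at a time."""
    out = np.zeros(B.shape[1])
    for i in range(0, B.shape[0], chunk):
        # only this chunk x n slice of B is ever held as float
        out += A[i:i + chunk].dot(B[i:i + chunk].astype(float))
    return out
```

The peak extra memory drops from n*n*8 bytes to chunk*n*8 bytes, at the cost of a Python-level loop over n/chunk blocks; tuning `chunk` trades the spike against loop overhead.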
Comments:

Have you tried converting B to float yourself prior to using dot, i.e. does A.dot(B.astype(float)) show the same behavior? - Cleb

@Cleb I timed both A.dot(B.astype(float)) and A.dot(B): if I convert B to float beforehand, calculation time drops significantly. I'm aiming to perform this calculation for much larger n (at least n=30,000), so the memory improvement is critical, even on a machine with more memory. - jpp