As you have been told, there is a memory problem. You want to do this:
    numpy.dot(A, A.T)
which requires a lot of memory for the result (not the operands). However, the operation is easy to perform in pieces. A loop can produce one output row at a time:
    import numpy

    def trans_multi(A):
        rows = A.shape[0]
        result = numpy.empty((rows, rows), dtype=A.dtype)
        for r in range(rows):
            # Row r of numpy.dot(A, A.T) is A multiplied by row r of A
            result[r, :] = numpy.dot(A, A[r, :])
        return result
As written, this is just a slower and equally memory-hungry version (numpy.dot is well optimized). What you most probably want is to write the result to a file, since you do not have the memory to hold it:
    def trans_multi(A, filename):
        with open(filename, "wb") as f:
            rows = A.shape[0]
            for r in range(rows):
                # tobytes() replaces the deprecated tostring()
                f.write(numpy.dot(A, A[r, :]).tobytes())
Yes, it is not exactly lightning-fast. However, it is most probably the fastest you can hope for. Sequential writes are usually well-optimized. I tried:
    a = numpy.random.random((50000, 100)).astype('float32')
    trans_multi(a, "/tmp/large.dat")
This took approximately 60 seconds, but it really depends on your HDD performance.
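If one row per numpy.dot call turns out to be the bottleneck, the same idea works with a block of rows per call. The following is only a sketch, not something I timed: the name trans_multi_blocks and the block_rows parameter are made up for illustration, and the best block size depends on your RAM (for the (50000, 100) float32 example, 1024 rows is a block of about 200 MB).

```python
import os
import tempfile

import numpy


def trans_multi_blocks(A, filename, block_rows=1024):
    """Write numpy.dot(A, A.T) to filename, block_rows output rows at a time.

    Hypothetical variant of trans_multi; block_rows is an assumed tuning knob.
    """
    rows = A.shape[0]
    with open(filename, "wb") as f:
        for start in range(0, rows, block_rows):
            # Output rows start .. start+block_rows-1, shape (block, rows).
            block = numpy.dot(A[start:start + block_rows], A.T)
            # tobytes() emits C (row-major) order, so consecutive blocks
            # concatenate into one row-major (rows, rows) table.
            f.write(block.tobytes())


# Small check against the in-memory product.
small = numpy.random.random((50, 7)).astype("float32")
path = os.path.join(tempfile.mkdtemp(), "small.dat")
trans_multi_blocks(small, path, block_rows=16)
from_file = numpy.fromfile(path, dtype="float32").reshape(50, 50)
matches = numpy.allclose(from_file, numpy.dot(small, small.T), rtol=1e-4)
```

The file layout is the same as with the row-by-row version, so anything that reads one can read the other.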
Why not memmap?
I like mmap, and numpy.memmap is a great thing. However, numpy.memmap is optimized for having large tables and calculating small results from them. There is, for example, memmap.dot, which is optimized for taking dot products of memmapped arrays. The scenario there is that the operands are memmapped but the result is in RAM, which is exactly the other way round from this case.
Memmapping is very useful when you have random access. Here the access is not random but a sequential write. Also, if you try to use numpy.memmap to create a (50000,50000) array of float32s, it will take some time (for a reason I do not understand; maybe it initializes the data even though there is no need).
However, after the file has been created, it is a very good idea to use numpy.memmap to analyze the huge table, as it gives the best random read performance and a very convenient interface.
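Reading the table back could look roughly like this. It is a minimal sketch on a small stand-in file, assuming the file holds raw row-major float32 data as written above; the path and the sizes are placeholders. The raw file has no header, so the dtype and shape have to be supplied by hand.

```python
import os
import tempfile

import numpy

# Small stand-in for the large file: a (rows, rows) float32 table on disk.
rows = 200
a = numpy.random.random((rows, 10)).astype("float32")
path = os.path.join(tempfile.mkdtemp(), "table.dat")
numpy.dot(a, a.T).tofile(path)

# Map the file read-only. dtype and shape must match what was written,
# because the raw file carries no metadata.
table = numpy.memmap(path, dtype="float32", mode="r", shape=(rows, rows))

# Indexing reads only the parts of the file that are actually touched.
row_42 = numpy.array(table[42])            # copy one row into RAM
largest_on_diagonal = float(table.diagonal().max())
```

For the real table the only changes would be the path and shape=(50000, 50000).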
MemoryError on my box (NumPy 1.8.1). Be aware that the resulting matrix would take 9.3GB in float32 format, twice that in float64. - Fred Foo
np.memmap arrays - Saullo G. P. Castro