performing sum of outer products on sparse matrices

Question

I am trying to implement the following equation using scipy's sparse package:

W = x[:,1] * y[:,1].T + x[:,2] * y[:,2].T + ...

where x & y are a nxm csc_matrix. Basically I'm trying to multiply each col of x by each col of y and sum the resulting nxn matrices together. I then want to make all non-zero elements 1.

This is my current implementation:

    c = sparse.csc_matrix((n, n))
    for i in xrange(0,m):
        tmp = bam.id2sym_thal[:,i] * bam.id2sym_cort[:,i].T
        minimum(tmp.data,ones_like(tmp.data),tmp.data)
        maximum(tmp.data,ones_like(tmp.data),tmp.data)

        c = c + tmp

This implementation has the following problems:

Memory usage seems to explode. As I understand it, memory should only increase as c becomes less sparse, but I am seeing that the loop starts eating up >20GB of memory with a n=10,000, m=100,000 (each row of x & y only has around 60 non-zero elements).
I'm using a python loop which is not very efficient.

My question: Is there a better way to do this? Controlling memory usage is my first concern, but it would be great to make it faster!

Thank you!

x[:,i] is going to give you the ith column of x, not the row — JoshAdel
@JoshAdel: You are right, I misspoke, I meant to say multiply the columns of x by columns of y. I have updated the question. Thanks! — RussellM
Your equation is a sum of inner products, not outer products. You must transpose the columns of y, not x. (Either that, or the title is wrong.) — Steve Tjoa
Please edit your question to be unambiguous respect to transpose. Are you aiming to count how many times each nonzero element is summed in outer product? Thanks — eat
@Steve: You are right Steve- I have made the correction. Thanks — RussellM

Steve Tjoa Steve Tjoa · Accepted Answer · 2011-08-05T08:59:41

Note that a sum of outer products in the manner you describe is simply the same as multiplying two matrices together. In other words,

sum_i X[:,i]*Y[:,i].T == X*Y.T

So just multiply the matrices together.

Z = X*Y.T

For n=10000 and m=100000 and where each column has one nonzero element in both X and Y, it computes almost instantly on my laptop.

performing sum of outer products on sparse matrices

2 Answers