Creating a COO matrix from a 2D numpy array

Question

I have a 2D NumPy array that looks like this:

[[3, 4, 5, 6], [4, 5, 6, 7], [9, 10, 3, 5]]

I converted this into a COO matrix using the following code:

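import numpy as np
from scipy.sparse import coo_matrix

twod_array = [[3, 4, 5, 6], [4, 5, 6, 7], [9, 10, 3, 5]]

# Flatten 2D array
data = np.asarray(twod_array).flatten()
# Use the same running index for both the row and the column of each value
row = np.arange(0, len(data))
col = np.arange(0, len(row))
# Make COO matrix
mat = coo_matrix((data, (row, col)), shape=(len(row), len(row)))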

Is this the correct way of converting a 2D numpy array into a COO matrix?

EDIT

What I am trying to do is this: I have parts in one column and the item in the other.

parts                                 item
processor, display, sensor            temp. monitoring system
fan blades, motor, sensor             motion detecting fan
        .                                       .
        .                                       .

I have converted the data above to numbers so that they can be further processed.

parts         items
1, 2, 3       1
4, 5, 3       2

I now want to feed the above data into LightFM, so I created a 2D array like this:

[[1, 2, 3, 1], [4, 5, 3, 2]]

However, LightFM's fit method only takes an np.float32 coo_matrix of shape [n_users, n_items], which is a matrix containing user-item interactions, so I converted the 2D array using the method stated above.

Why do you feel this is not correct? Does this cause errors? — cs95
@cᴏʟᴅsᴘᴇᴇᴅ No errors. I am using it to train a LightFM model and the recommendations generated by the model are very weird. — Nikhil Raghavendra
I have a feeling you are jumping into LightFM without understanding what it expects, or even what a sparse matrix is. In your previous question you tried to make a 6d sparse matrix from a dictionary. — hpaulj
@hpaulj Yes, you are right but when I check the documentation of LightFM all it said was it needed a sparse matrix as an argument for the fit method. I am new to this, so pardon me — Nikhil Raghavendra

hpaulj · Accepted Answer · 2017-12-18T05:23:50

In [301]: A = np.array([[3, 4, 5, 6], [4, 5, 6, 7], [9, 10, 3, 5]])
In [302]: A
Out[302]: 
array([[ 3,  4,  5,  6],
       [ 4,  5,  6,  7],
       [ 9, 10,  3,  5]])

Your way of creating a matrix:

In [305]: data = A.flatten()
In [306]: M = sparse.coo_matrix((data,(np.arange(len(data)),np.arange(len(data))
     ...: )))
In [307]: M
Out[307]: 
<12x12 sparse matrix of type '<class 'numpy.int32'>'
    with 12 stored elements in COOrdinate format>

print(M) will show those 12 values with their coordinates.

If the matrix isn't too large, I like to display it as an array. M.A is a shortcut for M.toarray():

In [308]: M.A
Out[308]: 
array([[ 3,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  4,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  6,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  4,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  5,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  6,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  7,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  9,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0, 10,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  3,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  5]])

Look at the diagonal: those are the 12 values of the original array. Is that what you want? The original 3x4 layout of A is completely lost. It might as well have been a 1d list of those 12 numbers.
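A quick way to confirm this without reading the 12x12 printout is to compare the sparse matrix against a plain diagonal matrix. This is only a small sketch; the names idx, same_indices and is_diagonal are mine, not from the question.

```python
import numpy as np
from scipy import sparse

A = np.array([[3, 4, 5, 6], [4, 5, 6, 7], [9, 10, 3, 5]])
data = A.flatten()
idx = np.arange(len(data))
M = sparse.coo_matrix((data, (idx, idx)))

# Row and column indices are identical, so every value lands on the diagonal.
same_indices = np.array_equal(M.row, M.col)
# The dense form equals a diagonal matrix built from the flattened values.
is_diagonal = np.array_equal(M.toarray(), np.diag(A.ravel()))
print(M.shape, same_indices, is_diagonal)
```

Both checks come out True, which is another way of saying that the row/column structure of A plays no part in the result.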

Alternatively, you could just pass the array to the sparse constructor, producing a sparse replica of the original:

In [309]: M1 = sparse.coo_matrix(A)
In [310]: M1
Out[310]: 
<3x4 sparse matrix of type '<class 'numpy.int32'>'
    with 12 stored elements in COOrdinate format>
In [311]: M1.A
Out[311]: 
array([[ 3,  4,  5,  6],
       [ 4,  5,  6,  7],
       [ 9, 10,  3,  5]])

Instead of a 12x12 diagonal, this is a 3x4 matrix that stores all 12 values, because A contains no 0s. A sparse format makes more sense when A already has lots of 0s.
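To illustrate the point about zeros, here is a small sketch with a made-up array B (not from the question) that is mostly zeros. Passing it to the constructor keeps the 3x4 shape but stores only the nonzero entries.

```python
import numpy as np
from scipy import sparse

# Hypothetical 3x4 array that is mostly zeros.
B = np.array([[0, 0, 5, 0],
              [0, 0, 0, 0],
              [9, 0, 0, 3]])
S = sparse.coo_matrix(B)

print(S.shape)  # (3, 4)
print(S.nnz)    # 3 stored values instead of 12

# Each stored value keeps its original (row, column) position.
for r, c, v in zip(S.row, S.col, S.data):
    print(r, c, v)
```

Converting back with S.toarray() reproduces B, so nothing about the layout is lost.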

Do you really know what kind of sparse matrix you need?
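If what you need is an interaction-style matrix, one possible reading of your encoded table is an incidence matrix with one row per item and one column per part. That reading is an assumption on my part: the thread does not settle which axis should play the role of users and which the role of items for LightFM, and the variable names below are hypothetical. A minimal sketch of the construction:

```python
import numpy as np
from scipy import sparse

# Encoded table from the question: part ids followed by the item id.
encoded = np.array([[1, 2, 3, 1],
                    [4, 5, 3, 2]])
part_ids = encoded[:, :-1]   # parts listed for each row
item_ids = encoded[:, -1]    # item id of each row

# One (item, part) pair per listed part; ids start at 1, so shift to 0-based.
rows = np.repeat(item_ids, part_ids.shape[1]) - 1
cols = part_ids.ravel() - 1
vals = np.ones(len(rows), dtype=np.float32)

# Assumed layout: rows are items, columns are parts.
interactions = sparse.coo_matrix(
    (vals, (rows, cols)),
    shape=(int(item_ids.max()), int(part_ids.max())),
)
print(interactions.toarray())
```

Here the ids are used as coordinates and the stored value is just 1.0, rather than the ids being stored as data on a diagonal. Part 3 appears in both rows, which is exactly the kind of shared structure a diagonal matrix cannot express.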
