1
votes

When a COO matrix is converted to the CSR format, the duplicates get summed. Earlier versions of scipy had a flag, sum_duplicates=False, but it is now gone. Is there a way to convert a COO matrix to CSR without summing the duplicates?

Below is an example that generates a COO matrix:

import numpy as np
from scipy.sparse import csr_matrix, coo_matrix

i_tar = np.array([3,3,3,3,3,6,7,8,8,2,2,2])
j_src = np.array([8,8,8,8,8,5,7,18,18,4,4,4])
data = np.array([3,4,6,8,40,6,12,5,6,2,3,9])

coo_mat = coo_matrix((data, (i_tar,j_src)),shape=(20,20))

print(coo_mat)
  (3, 8)  3
  (3, 8)  4
  (3, 8)  6
  (3, 8)  8
  (3, 8)  40
  (6, 5)  6
  (7, 7)  12
  (8, 18) 5
  (8, 18) 6
  (2, 4)  2
  (2, 4)  3
  (2, 4)  9

Converting it to the CSR format gives:

print(coo_mat.tocsr())
  (2, 4)  14
  (3, 8)  61
  (6, 5)  6
  (7, 7)  12
  (8, 18) 11

which has the duplicates summed. I imagine there are good reasons for summing the duplicates, but it is not what I need.

1
What is the expected output? The sum is the standard way to show it; Hyuck Kang gives another solution (choose final row). What do you expect? - kabanus

1 Answer

0
votes

Try this code:

coo_mat.todia().tocsr()
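
If the goal is to keep every duplicate as its own stored entry, one possible approach is to build the CSR arrays by hand. This is only a sketch: it relies on the csr_matrix constructor accepting a (data, indices, indptr) tuple and, as far as I know, not merging duplicates on that path. The name coo_to_csr_keep_duplicates is a hypothetical helper, not a SciPy function.

```python
import numpy as np
from scipy.sparse import coo_matrix, csr_matrix


def coo_to_csr_keep_duplicates(coo):
    """Hypothetical helper: build a CSR matrix that keeps duplicate entries."""
    n_rows = coo.shape[0]
    # Stable sort by row so duplicates keep their original relative order.
    order = np.argsort(coo.row, kind="stable")
    indices = coo.col[order]
    data = coo.data[order]
    # indptr[r]:indptr[r + 1] delimits the stored entries of row r.
    counts = np.bincount(coo.row, minlength=n_rows)
    indptr = np.concatenate(([0], np.cumsum(counts)))
    return csr_matrix((data, indices, indptr), shape=coo.shape)


# Same example data as in the question.
i_tar = np.array([3, 3, 3, 3, 3, 6, 7, 8, 8, 2, 2, 2])
j_src = np.array([8, 8, 8, 8, 8, 5, 7, 18, 18, 4, 4, 4])
data = np.array([3, 4, 6, 8, 40, 6, 12, 5, 6, 2, 3, 9])
coo_mat = coo_matrix((data, (i_tar, j_src)), shape=(20, 20))

csr_dup = coo_to_csr_keep_duplicates(coo_mat)
print(csr_dup.nnz)  # number of stored entries, duplicates included
```

Keep in mind that the result is not in canonical CSR format, so later operations (arithmetic, or an explicit sum_duplicates() call) may still merge the duplicates.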