
I need to refer to specific scipy sparse matrix columns

In pandas I'd write for example:

data_sims.columns[1]

data_sims is a SciPy CSR matrix. If I write data_sims[:,j], I get all the rows for that column, but I can't refer to the specific column itself. How can I do this nicely?

from tqdm import tqdm

for i in tqdm(range(0, data_sims.shape[0])):
    for j in range(1, data_sims.shape[1]):
        user = data_sims[i].data         # stored values of row i
        product = data_sims[:, j].data   # stored values of column j

data_sims has only user IDs as rows, plus column names. data_sims is <1257x286 sparse matrix of type '<class 'numpy.float64'>' with 1257 stored elements in Compressed Sparse Row format>: array([ 1.00000000e+00, 3.30000000e+01, 4.20000000e+01, ..., 1.96620000e+04, 1.96720000e+04, 1.96950000e+04])

I would like to refer just to the column. For example, getcol(2) gives me an array of all the values in column 2, but is it possible to refer to column 2 itself instead of getting its values, as with data_sims.columns[2]?

Can you show a small sample matrix? It's not clear what you need and why data_sims[:,j] is not okay. – user2314737
data_sims has only user IDs as rows, plus column names. data_sims is <1257x286 sparse matrix of type '<class 'numpy.float64'>' with 1257 stored elements in Compressed Sparse Row format>: array([ 1.00000000e+00, 3.30000000e+01, 4.20000000e+01, ..., 1.96620000e+04, 1.96720000e+04, 1.96950000e+04]) – Ivan Shelonik
Please edit your question and add the relevant information. – Nils Werner

1 Answer


Demo:

In [20]: from scipy import sparse as sp

In [21]: M = sp.random(20, 5, .2, 'csr')

In [22]: M
Out[22]:
<20x5 sparse matrix of type '<class 'numpy.float64'>'
        with 20 stored elements in Compressed Sparse Row format>

In [23]: M.A
Out[23]:
array([[ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.28107916,  0.        ,  0.        ],
       [ 0.87837137,  0.13842525,  0.        ,  0.        ,  0.23325649],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.52736337],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.04542009,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.63677513,  0.        ,  0.        ],
       [ 0.63231093,  0.62618467,  0.        ,  0.06950421,  0.        ],
       [ 0.        ,  0.43227768,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.5629196 ,  0.        ,  0.        ,  0.89888461,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.72068086,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.39975165,  0.47361848,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  0.        ,  0.        ],
       [ 0.        ,  0.        ,  0.62760683,  0.        ,  0.59258286],
       [ 0.        ,  0.91076085,  0.        ,  0.        ,  0.47079545]])

In [24]: for i in M[:, 2]:
    ...:     if i > 0:
    ...:         print(i)
    ...:
  (0, 0)        0.281079161053
  (0, 0)        0.636775129263
  (0, 0)        0.720680860082
  (0, 0)        0.399751651175
  (0, 0)        0.627606833131
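
The loop above walks every row of the column and tests each 1x1 slice. As a rough alternative sketch (not part of the original demo), the stored entries of one column can be read directly from a CSC copy of the matrix, because CSC keeps each column contiguous. The small matrix and the names `M_csc`, `rows` and `vals` are made up for illustration; note that "stored" entries may include explicit zeros if the matrix contains any.

```python
import numpy as np
from scipy import sparse as sp

# Small hand-built matrix so the result is easy to check by eye.
dense = np.array([
    [0.0, 0.0, 0.5],
    [1.5, 0.0, 0.0],
    [0.0, 0.0, 2.5],
    [0.0, 3.0, 0.0],
])
M = sp.csr_matrix(dense)

j = 2  # column of interest

# CSC stores each column contiguously, so per-column access is cheap there.
M_csc = M.tocsc()
start, stop = M_csc.indptr[j], M_csc.indptr[j + 1]
rows = M_csc.indices[start:stop]  # row indices of the stored entries in column j
vals = M_csc.data[start:stop]     # the corresponding stored values

for r, v in zip(rows, vals):
    print(r, v)
```

If many columns are visited in a loop, converting once with `tocsc()` outside the loop is likely cheaper than slicing the CSR matrix by column on every iteration.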

You can also do this:

In [37]: M[:, 2].A
Out[37]:
array([[ 0.        ],
       [ 0.28107916],
       [ 0.        ],
       [ 0.        ],
       [ 0.        ],
       [ 0.        ],
       [ 0.        ],
       [ 0.        ],
       [ 0.63677513],
       [ 0.        ],
       [ 0.        ],
       [ 0.        ],
       [ 0.        ],
       [ 0.        ],
       [ 0.72068086],
       [ 0.39975165],
       [ 0.        ],
       [ 0.        ],
       [ 0.62760683],
       [ 0.        ]])

In [38]: M[:, 2].A.ravel()
Out[38]:
array([ 0.        ,  0.28107916,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.63677513,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        ,  0.72068086,
        0.39975165,  0.        ,  0.        ,  0.62760683,  0.        ])
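
On the pandas-style `data_sims.columns[2]` part of the question: SciPy sparse matrices carry no row or column labels, only positions, so any names have to be kept alongside the matrix. A minimal sketch of one way to do that follows; the names `column_names` and `col_index` and the labels themselves are hypothetical, not taken from the question's data.

```python
import numpy as np
from scipy import sparse as sp

# Hypothetical column names; SciPy itself does not store any labels.
column_names = ["prod_a", "prod_b", "prod_c"]
col_index = {name: j for j, name in enumerate(column_names)}

M = sp.csr_matrix(np.array([
    [0.0, 2.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 4.0, 3.0],
]))

# Positional index -> label, similar in spirit to df.columns[1].
label = column_names[1]

# Label -> column values as a flat dense array.
col_b = M[:, col_index["prod_b"]].toarray().ravel()
print(label, col_b)
```

If pandas is already in use, `pd.DataFrame.sparse.from_spmatrix(M, columns=column_names)` may be another option, since the resulting DataFrame keeps the labels and exposes `.columns` directly.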