The format I have used is the csr sparse matrix, which is recommended to be the fastest sparse structure for add and dot opertor. I compared its performance with the add and dot operator of np.array. However, it seems very weird that the computing for sparse matrix is much slower than the case under dense format. Why is it? And is there any more efficient way to implement sparse computing?
import numpy as np
import scipy.sparse as sp
import random
#%% generate dense vector
vector_length = 10000
nonzero_term = 200
x = np.zeros((vector_length, ))
y = np.zeros((vector_length, ))
index = random.sample(range(vector_length), nonzero_term)
x[index] = np.random.rand(nonzero_term)
index = random.sample(range(vector_length), nonzero_term)
y[index] = np.random.rand(nonzero_term)
#%% transform to sparse vector
x_sp = sp.csr_matrix(x)
y_sp = sp.csr_matrix(y)
#%% test
# dense add
%timeit [x + y]
# sparse add
%timeit [x_sp + y_sp]
# dense dot
%timeit [x.dot(y)]
# sparse dot
%timeit [x_sp.dot(y_sp.T)]
and the result shows
100000 loops, best of 3: 6.06 µs per loop
10000 loops, best of 3: 97.8 µs per loop
100000 loops, best of 3: 3.45 µs per loop
1000 loops, best of 3: 225 µs per loop