I have a large sparse matrix in the scipy lil_matrix format (https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.lil_matrix.html). Its size is 281903x281903, and it is an adjacency matrix.
I need a reliable way to get N random indexes whose entries are zero. I can't just collect all the zero indexes and then choose random ones, since that makes my computer run out of memory. Is there a way of identifying N random zero indexes without having to trawl through the entire data structure?
I currently get 10% of the non-zero indices the following way (Y is my sparse matrix):
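import math
import random
import numpy as np
percent = 0.1
# Row and column indices of every non-zero entry of Y
oneIdx = Y.nonzero()
numberOfOnes = len(oneIdx[0])
maskLength = int(math.floor(numberOfOnes * percent))
# Pick maskLength distinct positions among the non-zero entries
idxOne = np.array(random.sample(range(numberOfOnes), maskLength))
maskOne = tuple(np.asarray(oneIdx)[:, idxOne])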
I am looking for a way to get a mask with the same length as the non-zero mask, but made up of indices whose entries are zero.
nonzero() method that returns all non-zero indices. Maybe you can sample the complement? – hilberts_drinking_problem
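One way to read "sample the complement" without ever listing every zero index is rejection sampling: draw random positions, discard those that are non-zero, and repeat until N distinct positions remain. The following is only a minimal sketch under that assumption. The function name `sample_zero_indices` and its parameters are hypothetical, and it assumes the non-zero indices themselves fit in memory (which the question's own call to Y.nonzero() already requires).

```python
import numpy as np
import scipy.sparse as sp


def sample_zero_indices(Y, n, rng=None):
    """Hypothetical helper: draw n distinct (row, col) positions where Y is zero."""
    rng = np.random.default_rng() if rng is None else rng
    n_rows, n_cols = Y.shape
    total = n_rows * n_cols

    # Encode every non-zero position as one flat int64 index (row * n_cols + col).
    rows, cols = Y.nonzero()
    occupied = np.unique(rows.astype(np.int64) * n_cols + cols)
    if n > total - occupied.size:
        raise ValueError("Y has fewer than n zero entries")

    chosen = np.empty(0, dtype=np.int64)
    while chosen.size < n:
        # Draw only as many candidates as are still missing.
        batch = rng.integers(0, total, size=n - chosen.size, dtype=np.int64)
        # Reject candidates that land on a non-zero entry.
        batch = batch[~np.isin(batch, occupied)]
        # np.unique removes duplicates, so chosen never exceeds n elements.
        chosen = np.unique(np.concatenate([chosen, batch]))

    # np.unique returns sorted values; shuffle so the order carries no structure.
    chosen = rng.permutation(chosen)
    return chosen // n_cols, chosen % n_cols


# Small usage example on a toy matrix (not the 281903x281903 one).
Y = sp.lil_matrix((1000, 1000))
Y[0, 1] = 1
Y[5, 7] = 1
maskZero = sample_zero_indices(Y, 10, rng=np.random.default_rng(0))
```

The returned tuple has the same (rows, cols) layout as maskOne, so passing maskLength as n should give a zero mask of matching length. Because an adjacency matrix of this size is almost entirely zeros, very few draws are rejected and the loop typically finishes after one or two passes; on a dense matrix this approach would be a poor fit.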