2
votes

I have a relatively large array, e.g. 200 x 1000. The matrix is sparse, and its elements can be considered weights ranging from 0 to 500. I would like to generate a new array of the same size, 200 x 1000, whose elements are integers in {0, 1}, with N of them set to 1 at random. The probability of an element in the new matrix being 1 rather than 0 is proportional to the weight in the original array: the higher the weight, the higher the probability of a 1.

Stated another way: I would like to generate a zero matrix of size 200x1000 and then randomly choose N elements to flip to 1 based on a 200x1000 matrix of weights.

3 Answers

4
votes

I'll throw my proposed solution in here as well:

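import numpy as np

# Example weights: random integers in [0, 500]
a = np.random.randint(0, 501, size=(200, 1000))
N = 200

result = np.zeros(a.shape)
ia = np.arange(result.size)  # flat indexes of all elements
tw = float(np.sum(a))        # total weight, used to normalize
# Draw N distinct flat indexes, weighted by a, and set them to 1
result.ravel()[np.random.choice(ia, p=a.ravel() / tw,
                                size=N, replace=False)] = 1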

Here a is the array of weights. The code picks N distinct flat indexes from the array ia, weighted by a, and sets the corresponding elements of result to 1.
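The same idea can be wrapped in a function. This is only a minimal sketch, assuming a NumPy version that provides `np.random.default_rng` (1.17 or later); the name `flip_weighted` and its `seed` parameter are illustrative and not part of the answer above or of NumPy. It also checks that at least `n` weights are nonzero, because sampling without replacement cannot select more distinct cells than have positive probability.

```python
import numpy as np

def flip_weighted(weights, n, seed=None):
    """Return a 0/1 array shaped like `weights` with exactly `n` ones.

    Cells are drawn without replacement, weighted by `weights`.
    (Hypothetical helper name; not part of NumPy.)
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0):
        raise ValueError("weights must be non-negative")
    if n > np.count_nonzero(weights):
        raise ValueError("n exceeds the number of nonzero weights")

    rng = np.random.default_rng(seed)
    flat = weights.ravel()
    # Draw n distinct flat indexes; p must sum to 1
    chosen = rng.choice(flat.size, size=n, replace=False, p=flat / flat.sum())

    result = np.zeros(weights.shape, dtype=np.uint8)
    result.flat[chosen] = 1
    return result

# Example with random integer weights in [0, 500]
rng = np.random.default_rng(0)
a = rng.integers(0, 501, size=(200, 1000))
out = flip_weighted(a, 200, seed=1)
```

Cells with a weight of 0 are never selected, and `out.sum()` equals the requested count.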

2
votes

This can be done with NumPy as follows:

import numpy

# Compute probabilities as a 1-D array (numpy.array copies, so weights is not modified)
probs = numpy.array(weights, dtype=numpy.float64).ravel()
probs /= numpy.sum(probs)

# Pick N distinct winner indexes, weighted by probs
winners = numpy.random.choice(len(probs), size=N, replace=False, p=probs)

# Build result
result = numpy.zeros(weights.shape, numpy.uint8)
result.ravel()[winners] = 1

0
votes

Something like this should work; there is no reason to get too complicated:

>>> import random
>>> weights = [[1,5],[500,0]]
>>> output = []
>>> for row in weights:
...     outRow = []
...     for entry in row:
...         outRow.append(random.choice([0]+[1 for _ in range(entry)]))
...     output.append(outRow)
...
>>> output
[[1, 1], [1, 0]]

This picks a random entry from a list that always contains a single 0 followed by n 1s, where n is the corresponding entry in your weight matrix. In this implementation a weight of 1 gives a 50/50 chance of a 1 or a 0, and in general a weight of n gives a probability of n/(n+1) of a 1. If you want the 50/50 point to fall at a weight of 250 instead, use outRow.append(random.choice([0 for _ in range(500-entry)] + [1 for _ in range(entry)]))
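The per-element draw described above can also be vectorized with NumPy. This is a sketch, assuming the weights fit in an array and that `np.random.default_rng` is available; the variable names are illustrative. Like the loop, it decides each element independently, so the number of ones in the output is itself random rather than fixed at N.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the run is repeatable
weights = np.array([[1, 5], [500, 0]])

# Same odds as choosing from one 0 and `w` ones: P(1) = w / (w + 1)
p_one = weights / (weights + 1)
output = (rng.random(weights.shape) < p_one).astype(int)

# Variant where a weight of 250 gives a 50/50 chance: P(1) = w / 500
p_scaled = weights / 500
output_scaled = (rng.random(weights.shape) < p_scaled).astype(int)
```

In both variants a weight of 0 always yields 0, and in the scaled variant a weight of 500 always yields 1.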