I have a statistics homework problem that I want to solve with Python and NumPy. It starts by drawing 1000 random samples from a normal distribution: random_sample = np.random.randn(1000)
Next, the numbers have to be divided into subgroups. For example, with five subgroups, the first subgroup holds the values in the range (-5, -3), and so on up to the last subgroup, (3, 5). Is there any way to do this with NumPy (or anything else)? If possible, I want it to keep working when the number of subgroups changes.

2 Answers

0
votes

You can get subgroup indices using numpy.digitize:

import numpy as np

random_sample = 5 * np.random.randn(10)
random_sample
# -> array([-3.99645573,  0.44242061,  8.65191515, -1.62643622,  1.40187879,
#            5.31503683, -4.73614766,  2.00544974, -6.35537813, -7.2970433 ])
indices = np.digitize(random_sample, (-3,-1,1,3))
indices
# -> array([0, 2, 4, 1, 3, 4, 0, 3, 0, 0])
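
Since the question also asks for a variable number of subgroups, here is a minimal sketch of one way to extend the digitize idea. The helper name split_into_subgroups and its parameters are made up for illustration, and it assumes the subgroups are equal-width ranges between a lower and an upper limit:

```python
import numpy as np

def split_into_subgroups(sample, low, high, n_groups):
    """Hypothetical helper: split sample into n_groups equal-width ranges over [low, high)."""
    # n_groups ranges need n_groups + 1 edges, e.g. [-5, -3, -1, 1, 3, 5] for 5 groups
    edges = np.linspace(low, high, n_groups + 1)
    # digitize gives 0 for values below edges[0] and n_groups + 1 for values >= edges[-1];
    # values inside range k (1-based) get index k
    indices = np.digitize(sample, edges)
    groups = [sample[indices == k] for k in range(1, n_groups + 1)]
    return edges, groups

rng = np.random.default_rng(0)
random_sample = rng.standard_normal(1000)
edges, groups = split_into_subgroups(random_sample, -5, 5, 5)
```

Changing the last argument changes the number of subgroups. Values outside [low, high) are not placed in any group.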
0
votes

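If you sort your random_sample, you can split the sorted array at the positions where each breakpoint (-5, -3, and so on) would fall. Taking the index of the value closest to each breakpoint looks simpler, but that value can lie on either side of the breakpoint, so np.searchsorted is the safer choice. The code would be something like:

import numpy as np
my_range = [-5,-3,-1,1,3,5] # example of ranges
random_sample = np.random.randn(1000)
hist = np.sort(random_sample)
# searchsorted() returns, for each breakpoint, the index where it would be
# inserted to keep hist sorted, i.e. the first index with hist[index] >= breakpoint
idx = np.searchsorted(hist, my_range)
# consecutive index pairs slice out the values inside each range
groups=[hist[idx[i]:idx[i+1]] for i in range(len(idx)-1)]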

Now groups is a list where each element is an array with all random values within your defined ranges.
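
If the homework only needs how many values fall in each range (rather than the values themselves), np.histogram may be enough. This is a small sketch using the same example ranges; the seeded generator is only there to make the run repeatable:

```python
import numpy as np

my_range = [-5, -3, -1, 1, 3, 5]  # same example ranges as above
rng = np.random.default_rng(0)    # seeded only for repeatability
random_sample = rng.standard_normal(1000)

# counts[i] is the number of values between edges[i] and edges[i + 1];
# values outside [-5, 5] are ignored
counts, edges = np.histogram(random_sample, bins=my_range)
```

Note that np.histogram treats the last range as closed on both ends, so a value exactly equal to 5 would be counted in it.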