
Suppose I draw randomly from a normal distribution with mean zero and standard deviations given by a vector of, say, dimension 3:

import numpy as np

scale_rng = np.array([1, 2, 3])
eps = np.random.normal(0, scale_rng)

I need to compute a weighted average based on some simulations for which I draw the above-mentioned eps. The weights of this average are "the probability of eps" (hence I will have a vector with 3 weights). By weighted average I simply mean an arithmetic sum where each component is multiplied by a weight, i.e. a number between 0 and 1, with all the weights summing to one. Such a weighted average shall be calculated as follows: I have a time series of observations for one variable, x. I calculate an expanding rolling standard deviation of x (say these are the values in scale). Then, for each time observation in x, I draw a random variable eps from a normal distribution as explained above and add it to the observation, obtaining y = x + eps. Finally, I need to compute the weighted average of y where each value of y is weighted by "the probability of drawing each value of eps from a normal distribution with mean zero and standard deviation equal to scale".
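For concreteness, here is a minimal sketch of that setup as I picture it (the series x and its length are made up for illustration):

import numpy as np
import pandas as pd

# made-up series of observations for x
x = pd.Series(np.random.randn(100).cumsum())

# expanding (growing-window) standard deviation of x; the first value is
# NaN because a single observation has no sample standard deviation
scale = x.expanding().std().bfill().to_numpy()

# one draw of eps per time observation, with sd taken from scale
eps = np.random.normal(0, scale)
y = x.to_numpy() + eps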

Now, I know that I cannot think of the weights as the points on the pdf corresponding to the values randomly drawn, because a normal random variable is continuous and as such the probability of it taking any single value is zero. Hence, the only solution I found is to discretize the normal distribution into a certain number of bins and then find the probability that a value extracted with the code above falls into each bin. How could I do this in Python?
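To illustrate what I mean by discretizing (the choice of 24 equal-width bins covering ±6 standard deviations is entirely arbitrary):

import numpy as np
from scipy.stats import norm

scale = 2.0
edges = np.linspace(-6 * scale, 6 * scale, 25)        # 24 equal-width bins
probs = np.diff(norm.cdf(edges, loc=0, scale=scale))  # probability mass per bin

eps = np.random.normal(0, scale)
i = np.searchsorted(edges, eps) - 1  # index of the bin containing eps
p_eps = probs[i]                     # probability of drawing that bin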

EDIT: the solution I found is to use

from scipy.stats import norm

norm.cdf(eps_it + 0.5, loc=0, scale=scale_rng) - norm.cdf(eps_it - 0.5, loc=0, scale=scale_rng)

which is not really based on discretization, but at least it seems sound to me "probability-wise".
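Turned into normalised weights, that would look something like this (the half-width of 0.5, i.e. a bin of width 1 centred on each draw, is an arbitrary choice):

import numpy as np
from scipy.stats import norm

scale_rng = np.array([1, 2, 3])
eps = np.random.normal(0, scale_rng)

# probability mass of a width-1 interval centred on each draw
p = norm.cdf(eps + 0.5, loc=0, scale=scale_rng) - norm.cdf(eps - 0.5, loc=0, scale=scale_rng)

# normalise so the weights sum to one
w = p / p.sum()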

the normal distribution has a well-defined probability density function, and appears in scipy as stats.norm. why not use that? – Sam Mason
@SamMason the pdf does not give me the probability of a continuous random variable being equal to a specific value, but rather of it falling within an interval. Moreover, the pdf can well be above 1 on a small interval, and this makes it unsuitable for producing weights for a weighted average – Matteo
that's not quite right, but I am not sure how best to explain. I can post an answer showing how to use the PDF to get a weighted average. note that drawing eps from a normal and then further weighting by a normal PDF might be doing unexpected things to your estimates – Sam Mason
unfortunately the idea of using the pdf of eps to build a weighted average is not mine but someone else's, and I have to reproduce their results. Unfortunately I could not find any further help online – Matteo
Do you know the limits of your bins? You might be looking at unequal bin widths to achieve equal probabilities, or fixed bin widths (except for the lowest and highest, since the range of a normal is infinite) yielding unequal probabilities, or maybe something else entirely. I think you need to supply some more information in your question. – pjs

1 Answer


here's an example leaving everything continuous: it draws eps from a wide uniform distribution and uses the normal density only as an importance weight.

import numpy as np
from scipy import stats

# some function we want a monte carlo estimate of
def fn(eps):
  return np.sum(np.abs(eps), axis=1)

# define distribution of eps
sd = np.array([1,2,3])
d_eps = stats.norm(0, sd)

# draw from a wide uniform proposal instead (importance sampling), so we
# don't double-apply the normal density; +/-6 sd covers almost all the mass
eps = np.random.uniform(-6*sd, 6*sd, size=(10000, 3))

# calculate weights (working with log-likelihood is better for numerical stability)
w = np.prod(d_eps.pdf(eps), axis=1)
# normalise so weights sum to 1
w /= np.sum(w)

# get estimate
np.sum(fn(eps) * w)

which gives me 4.71, 4.74, 4.70, 4.78 if I run it a few times. we can verify this is correct by just using a plain mean when eps is drawn from the normal directly:

np.mean(fn(d_eps.rvs(size=(10000, 3))))

which gives me essentially the same values, but with lower variance, as expected: e.g. 4.79, 4.76, 4.77, 4.82, 4.80.
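as an aside on the log-likelihood comment in the code above, here's a sketch of the same weight computation done in log-space (reusing d_eps and eps from above), which avoids underflow when individual densities get very small:

from scipy.special import logsumexp

# per-sample log-weight: sum of the per-dimension log-densities
log_w = np.sum(d_eps.logpdf(eps), axis=1)
# normalise in log-space so that exp(log_w) sums to 1
log_w -= logsumexp(log_w)
w = np.exp(log_w)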