Adjusting the mean and standard deviation of a list of random numbers?

1

votes

I have to create a list of random numbers (with decimals) in the range between -3 and 3. The problem is that the list must have a mean of 0 and a standard deviation of 1. How can I adjust the mean and standard deviation parameters? Is there a function I can use?

I was already able to create a list of random numbers between -3 and 3.

import random


def lista_aleatorios(n):
    lista = [0] * n
    for i in range(n):
        lista[i] = random.uniform(-3, 3)
    return lista

print("\nHow many numbers do you want?: ")
n = int(input())

print(lista_aleatorios(n))

python random mean python-3.7 normal-distribution

What exactly do you mean by a mean of 0? The distribution mean is different from the sample mean: if I sample from a simple Gaussian, N(0,1), with no bounds, then even though the distribution mean is 0, the sample mean will differ from it and only approach 0 as the number of samples goes to infinity. Do you want the distribution mean to be zero? Or do you want the sample mean (actually, the sum) of any sampled sequence to be exactly zero every time? Those two conditions are quite different. – Severin Pappadeux

0

votes

The function random.normalvariate(mu, sigma) allows you to specify the mean and the stdev for normally distributed random variables.
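
A minimal sketch of that call, assuming a target mean of 0 and a standard deviation of 1. The helper name `normal_list` and the seed are only illustrative. Note that `mu` and `sigma` are parameters of the distribution, so the sample mean and standard deviation of a finite list will only be approximately 0 and 1, and nothing here confines the values to the range -3 to 3.

```python
import random
import statistics

def normal_list(n, mu=0.0, sigma=1.0, seed=None):
    """Return n draws from a normal distribution (hypothetical helper)."""
    rng = random.Random(seed)  # a private generator keeps the run reproducible
    return [rng.normalvariate(mu, sigma) for _ in range(n)]

values = normal_list(10_000, seed=42)
print(statistics.mean(values))    # close to 0, but not exactly 0
print(statistics.pstdev(values))  # close to 1, but not exactly 1
```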

0

votes

Use random.gauss, then scale:

import numpy as np

from random import gauss

def bounded_normal(n, mean, std, lower_bound, upper_bound):

    # generate numbers between lower_bound and upper_bound

    result = []
    for i in range(n):
        while True:
            value = gauss(mean, std)
            if lower_bound < value < upper_bound:
                break

        result.append(value)

    # shift and scale so the sample mean and standard deviation match exactly;
    # note that this rescaling can move values slightly outside the bounds

    actual_mean = np.mean(result)
    actual_std = np.std(result)
    scale = std / actual_std
    new_result = [mean + (element - actual_mean) * scale for element in result]

    return new_result
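
The shift-and-scale step can also be tried on its own, for example on the uniform list from the question. The following is a sketch using only the standard library; the name `rescale` is hypothetical. It makes the sample mean and the population standard deviation (the same convention as `np.std`) match the targets up to floating-point error. For a uniform sample on -3 to 3 the scale factor is below 1, so the values stay inside the bounds; when the factor is above 1, values near the bounds may be pushed outside them.

```python
import random
import statistics

def rescale(values, mean=0.0, std=1.0):
    """Shift and scale values so their sample mean and population std hit the targets."""
    actual_mean = statistics.mean(values)
    actual_std = statistics.pstdev(values)  # population std, like np.std
    return [mean + (v - actual_mean) * std / actual_std for v in values]

rng = random.Random(0)
sample = [rng.uniform(-3, 3) for _ in range(1000)]
scaled = rescale(sample)

print(statistics.mean(scaled), statistics.pstdev(scaled))
print(min(scaled), max(scaled))
```

The rescaled list is no longer uniform on -3 to 3; only its first two sample moments are pinned.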

0

votes

OK, here is a quick route to a solution (if you want to use a truncated Gaussian). Set the boundaries and the desired stddev; I assume the mean is 0. Then use quick-and-crude code to do a binary search for the distribution sigma, i.e. to find the root of a non-linear equation (brentq() should be used in production code). All formulas are taken from the Wikipedia page on the truncated normal distribution. The resulting sigma must be larger than the desired stddev, because truncation removes the extreme values that contribute most to the stddev. Then we run a quick sampling test: the mean and stddev come out close to the desired values but never exactly equal to them. Code (Python-3.7, Anaconda, Win10 x64):

import numpy as np
from scipy.special import erf
from scipy.stats import truncnorm

def alpha(a, sigma):
    return a/sigma

def beta(b, sigma):
    return b/sigma

def xi(x, sigma):
    return x/sigma

def fi(xi):
    return 1.0/np.sqrt(2.0*np.pi) * np.exp(-0.5*xi*xi)

def Fi(x):
    return 0.5*(1.0 + erf(x/np.sqrt(2.0)))

def Z(al, be):
    return Fi(be) - Fi(al)

def Variance(sigma, a, b):
    al = alpha(a, sigma)
    be = beta(b, sigma)
    ZZ = Z(al, be)

    return sigma*sigma*(1.0 + (al*fi(al) - be*fi(be))/ZZ - ((fi(al)-fi(be))/ZZ)**2)

def stddev(sigma, a, b):
    return np.sqrt(Variance(sigma, a, b))

m = 0.0 # mean
s =  1.0 # this is what we want
a = -3.0 # left boundary
b =  3.0 # right boundary

#print(stddev(s , a, b))
#print(stddev(s + 0.1, a, b))

slo = 1.0
shi = 1.1

stdlo = stddev(slo, a, b)
stdhi = stddev(shi, a, b)

sigma = -1.0
while True: # binary search for sigma
    sme = (slo + shi) / 2.0
    stdme = stddev(sme, a, b)
    if stdme - s == 0.0:
        sigma = sme # the root is the sigma value itself, not the stddev it produces
        break
    elif stdme - s < 0.0:
        slo = sme
    else:
        shi = sme

    if shi - slo < 0.0000001:
        sigma = (shi + slo) / 2.0
        break

print(sigma) # we got it, shall be slightly bigger than s, desired stddev

np.random.seed(73123457)

# truncnorm takes its bounds in standardized units, (bound - loc) / scale
rvs = truncnorm.rvs((a - m)/sigma, (b - m)/sigma, loc=m, scale=sigma, size=1000000) # quick sampling test

print(np.mean(rvs))
print(np.std(rvs))

For me it printed

sigma = 1.0153870105743408
mean = -0.000400729471992301
stddev = 1.0024267696681475

With a different seed or sequence length you might get output like

1.0153870105743408
-0.00015923177289006116
0.9999974266369461
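
As a sketch of the brentq() variant mentioned above: the hand-written bisection can be replaced by `scipy.optimize.brentq`, and the truncated standard deviation can be taken from `truncnorm.std` instead of the hand-coded formula. The helper name `truncated_std` and the bracket 1.0 to 1.1 are assumptions carried over from the search interval used above. Note that `truncnorm` expects its bounds in standardized units, `(bound - loc) / scale`.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import truncnorm

def truncated_std(sigma, a, b, loc=0.0):
    """Stddev of a normal(loc, sigma) truncated to [a, b] (hypothetical helper)."""
    # convert the real-valued bounds to the standardized units truncnorm expects
    return truncnorm.std((a - loc) / sigma, (b - loc) / sigma, loc=loc, scale=sigma)

a, b, target = -3.0, 3.0, 1.0

# find sigma such that the truncated distribution has the target stddev
sigma = brentq(lambda s: truncated_std(s, a, b) - target, 1.0, 1.1)
print(sigma)

# quick sampling test with the same standardized bounds
rvs = truncnorm.rvs(a / sigma, b / sigma, loc=0.0, scale=sigma,
                    size=100_000, random_state=0)
print(np.mean(rvs), np.std(rvs))
```

As before, the sample mean and stddev only approximate the targets; brentq just finds the distribution parameter more robustly than a fixed-tolerance bisection.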
