6
votes

I have just run a simple task: plotting the probability density histogram for a simulation I ran. However, when I plot it, the probability for each bin does not seem to match the result of the frequency plot. With 50 bins I would expect each bin to have an average probability of 2%, which is not reflected in the chart.

Thanks in advance

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

plntAcres = 88.0
hvstPer = 0.99
hvstAcres = plntAcres*hvstPer
yldAcre = np.random.triangular(47,48,49, 10000)

carryIn = 464
pdn = hvstAcres * yldAcre
imp = 25.0
ttlSup = carryIn + pdn + imp

crush = np.random.uniform(1945, 1990,10000)
expts = np.random.uniform(2085, 2200,10000)
seedRes = 130
ttlDem = crush + expts + seedRes

carryOut = ttlSup - ttlDem

print(carryOut)

plt.hist(carryOut, bins=50,normed=True)
plt.title("Carry Out Distribution")
plt.xlabel("Value")
plt.ylabel("Probability")
plt.show()

Probability density of Carry out


2 Answers

10
votes

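In the hist function, the normed argument does not produce probabilities but probability densities: the bar heights are scaled so that the total area (height times bin width, summed over all bins) equals 1. If you want the probabilities themselves, use the weights argument instead and give every sample a weight of 1 / len(carryOut), so that the bar heights sum to 1.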

The crucial two lines:

weights = np.ones_like(carryOut) / (len(carryOut))
plt.hist(carryOut, bins=50, weights=weights)
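
To see the difference numerically without plotting, here is a minimal sketch using np.histogram, which computes the same bin values that plt.hist draws. The data is a stand-in (a single triangular sample, not the full carry-out simulation from the question), and the seed is an arbitrary choice for reproducibility.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

# Stand-in data (assumption): any 1-D sample works here.
carryOut = np.random.triangular(47, 48, 49, 10000)

# Density histogram: heights are scaled so the total AREA is 1.
density, edges = np.histogram(carryOut, bins=50, density=True)
widths = np.diff(edges)

# Probability histogram: each sample contributes 1/N, so HEIGHTS sum to 1.
weights = np.ones_like(carryOut) / len(carryOut)
probs, _ = np.histogram(carryOut, bins=50, weights=weights)

print((density * widths).sum())  # area under the density bars, approximately 1.0
print(probs.sum())               # sum of bin probabilities, approximately 1.0
print(probs.mean())              # average probability per bin, approximately 1/50
```

Multiplying each density by its bin width recovers the bin probability, which is why the density bars look "too tall" or "too short" whenever the bin width is not 1.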
0
votes

Your plot is a bell curve, which usually means that your random variable is approximately normally distributed. See the Wikipedia article on the normal (Gaussian) distribution.