1
votes
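import numpy as np
import matplotlib.pyplot as plt

# A histogram of 100,000 samples from a standard normal distribution
n = np.random.randn(100000)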
fig, axes = plt.subplots(1, 2, figsize=(12,4))

axes[0].hist(n)
axes[0].set_title("Default histogram")
axes[0].set_xlim((min(n), max(n)))

axes[1].hist(n, cumulative=True, bins=50)
axes[1].set_title("Cumulative detailed histogram")
axes[1].set_xlim((min(n), max(n)));

[Figure: histogram output from the IPython notebook]

This is from an IPython notebook, cell In[41].

It seems that the histogram bars don't align with the grid lines (see the first subplot). I face the same problem in my own plots.

Can someone explain why?

2
Can you include the code to reproduce your problem in the question? Your ipython notebook link will rot. - tacaswell

2 Answers

2
votes

Look at the align option of matplotlib's hist. You can align the bars 'left', 'mid', or 'right'. With the default, 'mid', each bar spans its bin from edge to edge instead of being centered on a bin edge, which is why the bars look left-aligned against the grid. This is spelled out in the matplotlib hist docs: http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.hist
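To see what align does in practice, here is a minimal sketch. The integer bin edges, the seed, and the variable names are my own choices for illustration, not taken from the question; it draws the same data three times and records where the first bar starts.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, only for repeatability
data = rng.standard_normal(1000)
edges = np.arange(-5, 6)                # integer bin edges, chosen for illustration

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
first_bar_left = {}
for ax, align in zip(axes, ("left", "mid", "right")):
    counts, bin_edges, patches = ax.hist(data, bins=edges, align=align)
    ax.set_title(f"align={align!r}")
    ax.grid(True)
    # x coordinate of the left side of the first bar
    first_bar_left[align] = patches[0].get_x()

print(first_bar_left)
```

With 'mid' the first bar starts exactly on the first bin edge; 'left' and 'right' shift every bar by half a bin width in either direction.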

1
votes

What if you have a Gaussian that spreads from -2647 to +1324? Do you expect to have 3971 bins? Probably too many. With bins 100 wide you would need 39.71 of them. 39? Then you are off by 0.71. What about 40? Off by 0.29.

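The way hist works is that you can set the bins= parameter (number of bins, default 10). On the right graph, the scale seems to go from around -4.5 to +4.5 (both subplots share the same limits), which makes a span of 9. Divided by the default 10 bins of the first subplot, that gives 0.9 per bin, so the bin edges generally fall between the grid lines.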
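The arithmetic can be checked with numpy alone. This is only a sketch: the 0.5 grid step, the seed, and the names `step` and `aligned_edges` are assumptions made for the example, not something from the question.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, only for repeatability
n = rng.standard_normal(100_000)

# Default behaviour: 10 equal-width bins between min(n) and max(n).
counts, edges = np.histogram(n, bins=10)
width = (n.max() - n.min()) / 10
print("bin width:", width)
print("default edges:", edges)          # rarely round numbers

# Alternative: explicit edges on a regular grid that covers the data.
step = 0.5                              # assumed grid spacing
lo = np.floor(n.min() / step)
hi = np.ceil(n.max() / step)
aligned_edges = np.arange(lo, hi + 1) * step
aligned_counts, _ = np.histogram(n, bins=aligned_edges)
print("aligned edges:", aligned_edges)
```

Passing `bins=aligned_edges` to `hist` and the same values to `set_xticks` should make bars and grid lines coincide.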

Also, when you make a histogram, it is not obvious how you want to bin things and represent them. If you have a bin from 0 to 1, is it 0 < x <= 1 or 0 <= x < 1? If you have only integer values, I suspect you would also prefer bins centered on the integer values, right?
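For what it's worth, numpy.histogram uses half-open bins, 0 <= x < 1, except for the last bin, which also includes its right edge. A small sketch with made-up integer data (the array `values` is purely illustrative) shows that convention and the half-integer edges that center bins on integers:

```python
import numpy as np

values = np.array([0, 1, 1, 2, 2, 2, 3])   # made-up integer data

# Edges on the integers: bins are [0, 1), [1, 2), [2, 3].
# The last bin is closed on the right, so 3 is counted with the 2s.
counts, edges = np.histogram(values, bins=[0, 1, 2, 3])
print(counts)            # [1 2 4]

# Edges on the half-integers: each bin is centered on one integer value.
centered_edges = np.arange(values.min() - 0.5, values.max() + 1.5)
counts_centered, _ = np.histogram(values, bins=centered_edges)
print(centered_edges)    # [-0.5  0.5  1.5  2.5  3.5]
print(counts_centered)   # [1 2 3 1]
```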

So a histogram is a quick method that gives you insight into the data, but nothing prevents you from setting its parameters to represent the data the way you like.

This blog post has a nice demo of the effect of the parameters in histogram plotting and explains some alternative plotting methods.