6
votes

Here is the histogram: [image]

To generate this plot, I did:

bins = np.array([0.03, 0.3, 2, 100])
plt.hist(m, bins=bins, weights=np.zeros_like(m) + 1. / m.size)

However, as you can see, I want to plot a histogram of the relative frequency of the data using only 3 bins of different sizes:

bin1 = 0.03 -> 0.3

bin2 = 0.3 -> 2

bin3 = 2 -> 100

The histogram looks ugly because the last bin is extremely wide relative to the other bins. How can I fix this? I want to change the drawn width of the bars, but I do not want to change the range of each bin.

2
but then it's not a histogram anymore, is it? - cel
@cel, no, it can be a bar graph. - aloha
Well, have you tried plotting a bar graph? You get the number of counts in each bin from np.histogram, so the implementation should be straightforward. - cel
@cel - yes, I tried it. I still haven't figured out a way to change the numbers on the x-axis. - aloha

2 Answers

13
votes

As @cel pointed out, this is no longer a histogram, but you can do what you are asking using plt.bar and np.histogram. You then just need to set the xticklabels to a string describing the bin edges. For example:

import numpy as np
import matplotlib.pyplot as plt

bins = [0.03,0.3,2,100] # your bins
data = [0.04,0.07,0.1,0.2,0.2,0.8,1,1.5,4,5,7,8,43,45,54,56,99] # random data

hist, bin_edges = np.histogram(data,bins) # make the histogram

fig,ax = plt.subplots()

# Plot the histogram heights against integers on the x axis.
# align='edge' puts the left edge of bar i at x = i (since matplotlib 2.0
# the default is align='center', which would centre the bars on i instead).
ax.bar(range(len(hist)), hist, width=1, align='edge')

# Set the ticks to the middle of the bars
ax.set_xticks([0.5 + i for i in range(len(hist))])

# Set the xticklabels to strings that tell us what the bin edges were
ax.set_xticklabels(['{} - {}'.format(bins[i], bins[i+1]) for i in range(len(hist))])

plt.show()


EDIT

Since matplotlib v1.5.0, bar accepts a tick_label keyword argument, which makes this plot even simpler:

hist, bin_edges = np.histogram(data,bins)

ax.bar(range(len(hist)),hist,width=1,align='center',tick_label=
        ['{} - {}'.format(bins[i],bins[i+1]) for i,j in enumerate(hist)])
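
The question asks for relative frequencies rather than raw counts. One way to get them, sketched here under the assumption that every data point should count towards the total (as with the weights in the question), is to divide the counts from np.histogram by the number of samples before plotting. The helper names relative_frequencies and plot_relative_frequencies are made up for this example and are not part of numpy or matplotlib.

```python
import numpy as np


def relative_frequencies(data, bins):
    """Return the fraction of all samples that falls into each bin."""
    counts, _ = np.histogram(data, bins)
    # Divide by the total number of samples, so points outside the
    # bin range (if any) still count towards the denominator.
    return counts / len(data)


def plot_relative_frequencies(freq, labels):
    """Draw equal-width bars, one per bin, labelled with the bin edges."""
    # Imported here so the numeric part above runs without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(range(len(freq)), freq, width=1, align='center', tick_label=labels)
    ax.set_ylabel('relative frequency')
    return ax


bins = [0.03, 0.3, 2, 100]
data = [0.04, 0.07, 0.1, 0.2, 0.2, 0.8, 1, 1.5, 4, 5, 7, 8, 43, 45, 54, 56, 99]

freq = relative_frequencies(data, bins)
labels = ['{} - {}'.format(lo, hi) for lo, hi in zip(bins[:-1], bins[1:])]
```

Calling plot_relative_frequencies(freq, labels) followed by plt.show() should then draw the three bars with heights that sum to 1 for this sample data.
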
2
votes

If the actual bin edges are not important, but you want a histogram of values that span several orders of magnitude, you can use logarithmic scaling along the x axis. With logarithmically spaced bin edges, this gives you bars of equal width:

import numpy as np
import matplotlib.pyplot as plt

data = [0.04,0.07,0.1,0.2,0.2,0.8,1,1.5,4,5,7,8,43,45,54,56,99]

plt.hist(data,bins=10**np.linspace(-2,2,5)) 
plt.xscale('log')

plt.show()

If you have to use your own bin edges, you can do:

import numpy as np
import matplotlib.pyplot as plt

data = [0.04,0.07,0.1,0.2,0.2,0.8,1,1.5,4,5,7,8,43,45,54,56,99]
bins = [0.03,0.3,2,100] 

plt.hist(data,bins=bins) 
plt.xscale('log')

plt.show()

In this case, however, the widths are not exactly equal, although the plot is still readable. If the widths must be equal and you have to use your own bins, I recommend @tom's solution.
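
To see why the widths come out unequal but close, it may help to note that on a log-scaled x axis a bar from left to right is drawn with a width proportional to log10(right / left). The short check below computes that for the bins in the question. The variable names are chosen for this illustration only.

```python
import math

bins = [0.03, 0.3, 2, 100]
edges = list(zip(bins[:-1], bins[1:]))

# Width of each bin on a linear axis: right edge minus left edge.
linear_widths = [hi - lo for lo, hi in edges]

# Width of each bin on a log10 axis, in decades: log10(right / left).
log_widths = [math.log10(hi / lo) for lo, hi in edges]

for (lo, hi), lin, log in zip(edges, linear_widths, log_widths):
    print('{} - {}: linear width {:.2f}, log width {:.2f} decades'.format(lo, hi, lin, log))
```

On a linear axis the last bin is a few hundred times wider than the first, whereas on the log axis the three bars span roughly 1.0, 0.8 and 1.7 decades, which is why they look comparable without being identical.
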