I have been using R for a long time and have recently started learning Python. I would like to draw multiple box plots in one panel in Python. My dataset is a single vector, and a label vector indicates which box plot each element of the data belongs to. The example looks like this:

 import numpy as np

 N = 50
 data = np.random.lognormal(size=N, mean=1.5, sigma=1.75)
 label = np.repeat([1,2,3,4,5], N // 5)

According to various websites (e.g., matplotlib: Group boxplots), creating multiple boxplots requires a matrix-like input in which each column contains the samples for one boxplot. So I built a list from data and label:

 savelist = data[ label == 1]
 for i in [2,3,4,5]:
      savelist = [savelist, data[ label == i]]

However, the code below gives me an error:

 boxplot(savelist)

 Traceback (most recent call last):

 File "<ipython-input-222-1a55d04981c4>", line 1, in <module>
 boxplot(savelist)

 File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/pyplot.py", line 2636, in boxplot
meanprops=meanprops, manage_xticks=manage_xticks)

 File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 3045, in boxplot
labels=labels)

 File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/matplotlib/cbook.py", line 1962, in boxplot_stats
stats['mean'] = np.mean(x)

 File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/numpy/core/fromnumeric.py", line 2727, in mean
out=out, keepdims=keepdims)

 File "/Users/yumik091186/anaconda/lib/python2.7/site-packages/numpy/core/_methods.py", line 66, in _mean
ret = umr_sum(arr, axis, dtype, out, keepdims)

ValueError: operands could not be broadcast together with shapes (2,) (10,) 

Can anyone explain what is going on?

1 Answer

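You're ending up with a nested list instead of a flat list: each pass through your loop wraps the previous savelist and the new group in a fresh two-element list, so boxplot receives two elements of different shapes rather than five arrays. Try this instead: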

savelist = [data[label == 1]]
for i in [2,3,4,5]:
    savelist.append(data[label == i])

And it should work.
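
To see the difference between the two structures, here is a minimal sketch that builds both from the data in the question. The seeded generator and the names nested and groups are my own choices for illustration, and the list comprehension is just a compact alternative to the append loop, not the only way to do it.

```python
import numpy as np

N = 50
rng = np.random.RandomState(0)  # fixed seed is an assumption, only for repeatability
data = rng.lognormal(mean=1.5, sigma=1.75, size=N)
label = np.repeat([1, 2, 3, 4, 5], N // 5)

# The loop from the question: each pass wraps the old list and the new group
# in a fresh two-element list, so the nesting gets one level deeper each time.
nested = data[label == 1]
for i in [2, 3, 4, 5]:
    nested = [nested, data[label == i]]

# A flat list instead: one 1-D array per distinct label value.
groups = [data[label == i] for i in np.unique(label)]

print(len(nested))                # top level holds only two elements
print(len(groups))                # one entry per label
print([g.shape for g in groups])  # each group holds N // 5 samples
```

With matplotlib imported as plt, passing the flat list to plt.boxplot(groups) should then draw one box per label.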