Single histogram from multiple columns in data frame

Question

I have a data frame that stores multiple sets of y-values measured over a range of wavelengths (x-values), as shown below:

              x        y0        y1  ...       y12       y13       y14
659  335.072723 -0.003132 -0.002903  ...  0.002981  0.007160  0.003514
660  335.285309 -0.001403 -0.002011  ...  0.003565  0.003635  0.002760
661  335.497925 -0.002544  0.002261  ...  0.007555  0.004373  0.011329
662  335.710510 -0.000546 -0.004526  ...  0.003469  0.004768  0.006191
663  335.923157 -0.011268 -0.001985  ...  0.001623  0.003106  0.006442
..          ...       ...       ...  ...       ...       ...       ...
771  358.970215 -0.000100 -0.000876  ...  0.001332 -0.000311  0.002851
772  359.184387 -0.000897  0.001286  ... -0.000037  0.000300  0.001410
773  359.398560 -0.001188  0.001162  ... -0.000901  0.000200  0.001739
774  359.612762  0.001268 -0.001201  ... -0.000545  0.001514  0.000689
775  359.826965  0.001439 -0.000671  ...  0.000295  0.002545  0.002769

I am able to create a histogram from the first column, using

plt.hist(megDiff['y0'], bins = 100, label = "y0 Data Set")

However, I would like to create a single histogram from all of the y-value data sets combined. I am not sure how to do this. If anyone knows, I would greatly appreciate some help.

jose_bacoy · Accepted Answer · 2022-07-28T19:26:34

You can combine (concatenate) all the y* columns into a single sequence of values and then plot that.

import pandas as pd
import matplotlib.pyplot as plt
import itertools as it

df = pd.DataFrame({'x': [1, 2, 3],
                   'y1': [11, 23, 31],
                   'y2': [12, 22, 32],
                   'y3': [13, 21, 33],})

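# Select every column except 'x', then chain the rows into one flat list of y-values
all_y = list(it.chain(*df.loc[:, df.columns != 'x'].values))
plt.hist(all_y, bins = 100, label = "All y Data Sets")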

Sample result:

[Figure: single histogram from multiple columns in data frame]
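As a variation on the answer above, here is a minimal sketch that selects the y columns by name pattern and flattens them with NumPy rather than itertools. Flattening matters because `plt.hist` treats the columns of a 2-D array as separate data sets and draws one series per column. The frame below is a small stand-in built from a few rows and columns of the table in the question (only `y0`, `y1`, and `y14` are included), and the regex assumes every y column is named `y` followed by digits.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Stand-in for the question's megDiff frame: three rows and three of the
# y columns copied from the printed table (the real frame has y0..y14).
megDiff = pd.DataFrame({
    "x":   [335.072723, 335.285309, 335.497925],
    "y0":  [-0.003132, -0.001403, -0.002544],
    "y1":  [-0.002903, -0.002011, 0.002261],
    "y14": [0.003514, 0.002760, 0.011329],
})

# Assumption: y columns are named "y" followed by digits, so "x" is excluded.
y_cols = megDiff.filter(regex=r"^y\d+$")

# Flatten the 2-D block of values into one 1-D array and drop any NaNs.
all_y = y_cols.to_numpy().ravel()
all_y = all_y[~np.isnan(all_y)]

# One histogram over every y value, regardless of which column it came from.
fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(all_y, bins=100, label="All y data sets")
ax.set_xlabel("y value")
ax.set_ylabel("count")
ax.legend()
```

Call `plt.show()` afterwards to display the figure in an interactive session. If the y columns follow a different naming scheme, `megDiff.drop(columns="x")` selects the same block without relying on a pattern.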