20
votes

I have the following data in a pandas dataframe:

       date  template     score
0  20140605         0  0.138786
1  20140605         1  0.846441
2  20140605         2  0.766636
3  20140605         3  0.259632
4  20140605         4  0.497366
5  20140606         0  0.138139
6  20140606         1  0.845320
7  20140606         2  0.762876
8  20140606         3  0.261035
9  20140606         4  0.498010

For every day there will be 5 templates and each template will have a score.

I want to plot the date on the x axis and the score on the y axis, with a separate line for each template, all in the same figure.

Is it possible to do this using matplotlib?

4
A quick pointer: try starting from the matplotlib samples, e.g. matplotlib.org/examples/pylab_examples/date_demo_rrule.html - Louis
@Louis Just to be clear, I want to know how to plot a grouped dataframe, not how to process dates - Sudar

4 Answers

56
votes

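You can use the groupby method. Note that, as written, this draws each template in its own figure unless every group is given the same `ax`: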

data.groupby("template").plot(x="date", y="score")
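
To get all templates onto one figure, each group can be drawn onto a shared Axes. A minimal sketch, assuming the `date` column holds `YYYYMMDD` strings as in the question; the sample frame and the `Template N` label text are illustrative, not part of the original answer:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question (dates assumed to be strings)
data = pd.DataFrame({
    "date": ["20140605"] * 5 + ["20140606"] * 5,
    "template": list(range(5)) * 2,
    "score": [0.138786, 0.846441, 0.766636, 0.259632, 0.497366,
              0.138139, 0.845320, 0.762876, 0.261035, 0.498010],
})
# Parse the YYYYMMDD strings so the x axis is treated as dates
data["date"] = pd.to_datetime(data["date"], format="%Y%m%d")

fig, ax = plt.subplots()
# Draw every group onto the same Axes instead of letting pandas open new figures
for template, group in data.groupby("template"):
    group.plot(x="date", y="score", ax=ax, label=f"Template {template}")
ax.set_ylabel("Score")
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure.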
17
votes

I think the easiest way to plot this data with all the lines on the same graph is to pivot it such that each "template" value is a column:

import pandas

pivoted = pandas.pivot_table(data, values='score', columns='template', index='date')
# pivoted is now indexed by date, with one score column per template (0, 1, 2, 3, 4)
pivoted.plot()
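
To make the reshaping concrete, here is a small sketch using the question's numbers. It assumes `date` is stored as `YYYYMMDD` strings and parses them first so the x axis is a date axis. Keep in mind that `pivot_table` averages the scores by default if a (date, template) pair occurs more than once.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question (dates assumed to be strings)
data = pd.DataFrame({
    "date": ["20140605"] * 5 + ["20140606"] * 5,
    "template": list(range(5)) * 2,
    "score": [0.138786, 0.846441, 0.766636, 0.259632, 0.497366,
              0.138139, 0.845320, 0.762876, 0.261035, 0.498010],
})
data["date"] = pd.to_datetime(data["date"], format="%Y%m%d")

# One row per date, one column per template
pivoted = pd.pivot_table(data, values="score", index="date", columns="template")
print(pivoted)

# DataFrame.plot draws one line per column on a single Axes
ax = pivoted.plot()
ax.set_ylabel("Score")
```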
11
votes

You can slice the dataframe by template value and then plot each slice's dates and scores on the same axes.

from random import random

from pandas import DataFrame, Series
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt

# The following part just generates something similar to your dataframe
date1 = "20140605"
date2 = "20140606"

d = {'date': Series([date1]*5 + [date2]*5),
     'template': Series(list(range(5))*2),  # list(...) so this also works on Python 3
     'score': Series([random() for i in range(10)])}

data = DataFrame(d)
# end of dataset generation

fig, ax = plt.subplots()

for temp in range(5):
    dat = data[data['template']==temp]
    dates =  dat['date']
    dates_f = [dt.datetime.strptime(date,'%Y%m%d') for date in dates]
    ax.plot(dates_f, dat['score'], label = "Template: {0}".format(temp))

plt.xlabel("Date")
plt.ylabel("Score")
ax.legend()
plt.show()
1
votes

You can add the legend according to the groups with:

plt.legend(pr['template'], loc='best')
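
The snippet above does not define `pr`; a reasonable reading is that it is the dataframe being plotted. When you pass labels to `legend` yourself, you need one label per plotted line, in plotting order, rather than one per row. A minimal sketch under that assumption, with hypothetical `Template N` label text:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question (dates assumed to be strings)
data = pd.DataFrame({
    "date": ["20140605"] * 5 + ["20140606"] * 5,
    "template": list(range(5)) * 2,
    "score": [0.138786, 0.846441, 0.766636, 0.259632, 0.497366,
              0.138139, 0.845320, 0.762876, 0.261035, 0.498010],
})
data["date"] = pd.to_datetime(data["date"], format="%Y%m%d")

fig, ax = plt.subplots()
labels = []
for template, group in data.groupby("template"):
    ax.plot(group["date"], group["score"])
    labels.append(f"Template {template}")  # one label per line, in plotting order

# Attach the collected labels to the lines in the order they were drawn
legend = ax.legend(labels, loc="best")
```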