0
votes

I am new to Bokeh and pandas and am trying to plot a trend line with month-year on the x-axis and integer values on the y-axis.

My data looks like this:

  year_month  emp_count
0    2015-09    1450425
1    2015-10    3093811
2    2015-11    3316241
3    2015-12    3308658
4    2016-01    3402191

To plot with Bokeh, I convert both columns to ndarrays. When I convert the year_month column, each value shows up as a Period. I used the to_period('M') method to derive year_month from a date column.

    temp_df.year_month.values
>>output
    array([Period('2015-09', 'M'), Period('2015-10', 'M'),
           Period('2015-11', 'M'), Period('2015-12', 'M'),
           Period('2016-01', 'M'), Period('2016-02', 'M'),

When I plot using this data, I get the following error:

TypeError: Object of type 'Period' is not JSON serializable

To avoid this error, I converted the year_month column to string, but I still get the same error. My complete code is below:

from bokeh.plotting import figure, output_file, show

temp_df.year_month = temp_df.year_month.astype(str)
output_file('trend1.html')
p = figure(title='Employee trend',
           plot_width=800,
           plot_height=350,
           x_axis_label='Month-Year', y_axis_label='No of Employees',
           x_axis_type='datetime')

p.line(x=temp_df.year_month,
       y=temp_df.emp_count)

show(p)

Does anyone know how to plot year-month on the x-axis using Bokeh?

2
Does this post solve the question? Link - Samira Kumar

2 Answers

0
votes

I think I found the problem. With x_axis_type='datetime', the x values need to be datetimes, not Period objects or strings, so convert the column to datetime:

df['year_month'] = pd.to_datetime(df['year_month'])

This changes the column values as shown below (the day defaults to 01):

   year_month  emp_count
0  2015-09-01    1450425
1  2015-10-01    3093811
2  2015-11-01    3316241
3  2015-12-01    3308658
4  2016-01-01    3402191
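
If the column still holds Period values rather than strings, `Series.dt.to_timestamp()` is another way to reach the same result. The snippet below is only a minimal sketch: the `periods` and `strings` Series are made-up stand-ins for the `year_month` column, and the explicit `format="%Y-%m"` is my assumption about how the strings look.

```python
import pandas as pd

# Hypothetical stand-in for the year_month column (Period values, monthly).
periods = pd.Series(pd.period_range("2015-09", periods=5, freq="M"))

# Period column -> datetime64, using the first day of each month.
from_periods = periods.dt.to_timestamp()

# The same data as strings ('2015-09', ...), as after astype(str).
strings = periods.astype(str)

# String column -> datetime64; the day also defaults to 01.
from_strings = pd.to_datetime(strings, format="%Y-%m")

print(from_periods)
print(from_strings)
```

Either route should leave the column as datetime64 values, which is what a datetime axis expects.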

The plot should then work. I tested it on the dummy data below.

Value   month_year
2   2018-11-01
3   2018-01-01
4   2018-02-01
5   2018-05-01


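import pandas as pd
from bokeh.plotting import figure, show

# Load the dummy data and parse month_year into datetime64 values.
sample = pd.read_csv('sample.csv')
sample['month_year'] = pd.to_datetime(sample['month_year'])

# Note: Bokeh 3.x removed plot_width/plot_height; use width/height there.
p = figure(title='Employee trend',
           plot_width=800,
           plot_height=350,
           x_axis_label='Month-Year', y_axis_label='No of Employees',
           x_axis_type='datetime')

p.scatter(x=sample.month_year,
          y=sample.Value)

show(p)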

Output
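
For anyone who wants to try this without a CSV file, here is a rough self-contained sketch that uses the five rows from the question. It is an assumption-laden variant, not the code I ran above: it draws a line instead of a scatter, leaves out the plot size arguments so it does not depend on the Bokeh version, and renders to an HTML string with `file_html` instead of calling `show`, just to confirm that serialization no longer raises the TypeError.

```python
import pandas as pd
from bokeh.embed import file_html
from bokeh.plotting import figure
from bokeh.resources import CDN

# The five rows from the question, with year_month parsed to datetime64.
df = pd.DataFrame({
    "year_month": pd.to_datetime(
        ["2015-09", "2015-10", "2015-11", "2015-12", "2016-01"],
        format="%Y-%m",
    ),
    "emp_count": [1450425, 3093811, 3316241, 3308658, 3402191],
})

p = figure(title="Employee trend",
           x_axis_label="Month-Year", y_axis_label="No of Employees",
           x_axis_type="datetime")

# datetime64 x values work with the datetime axis.
p.line(x=df["year_month"], y=df["emp_count"])

# Serialize the plot to a standalone HTML string (no browser needed).
html = file_html(p, resources=CDN, title="Employee trend")
```

Writing `html` to a file and opening it in a browser should show the same plot that `show(p)` would.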

Let me know if this works. Thanks

0
votes

I solved the issue with an alternative approach. Thanks to @Samira for the inspiration.

I extracted the year and month from the date object and set the day to 1.

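    # Derive the date parts; year_month_01 is the first day of each month.
    # pd.Timestamp replaces pd.datetime, which recent pandas versions no longer provide.
    df = df.join(df.as_of_date.apply(lambda x: pd.Series({
        'day': x.day,
        'year': x.year,
        'month': x.month,
        'year_month': x.to_period('M'),
        'year_month_01': pd.Timestamp(x.year, x.month, 1),
    })))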

After that, I used 'year_month_01' for the x-axis, and the Bokeh graph looks as expected.

bokeh graph
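
If only the first-of-month column is needed, it can probably be built without a row-wise apply. The following is a sketch under assumptions: the three dates are made up for illustration, and `as_of_date` is assumed to already be a datetime64 column.

```python
import pandas as pd

# Hypothetical dates standing in for the real as_of_date column.
df = pd.DataFrame({
    "as_of_date": pd.to_datetime(["2015-09-17", "2015-10-03", "2016-01-31"]),
})

# Truncate each date to its month, then take the start of that month.
df["year_month_01"] = df["as_of_date"].dt.to_period("M").dt.start_time

print(df)
```

This stays vectorized, which tends to matter on larger frames, while the apply version above also yields the extra day, year and month columns.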