0
votes

I've got the following dataframe:

dfB1

Date_and_time                   MP  
2020-08-28 19:05:00.066663676   75.0
2020-08-28 19:05:00.133330342   70.0
2020-08-28 19:05:00.199997008   76.0
2020-08-28 19:05:00.266663674   85.0
2020-08-28 19:05:00.333330340   73.0
... ...
2020-08-29 01:59:50.666414770   1454.0
2020-08-29 01:59:50.733081436   1359.0
2020-08-29 01:59:50.799748102   1320.0
2020-08-29 01:59:50.866414768   1217.0
2020-08-29 01:59:50.933081434   1246.0

373364 rows × 1 columns

My goal is to create a plot that shows one boxplot per time bin, with bins of 1, 5, or 30 minutes, or even 1 hour. The DatetimeIndex is already in the correct format for binning by e.g. hour (the data was collected at 15 Hz, so consecutive data points are 66666666 nanoseconds apart).

dfB1.index

DatetimeIndex(['2020-08-28 19:05:00.066663676',
               '2020-08-28 19:05:00.133330342',
               '2020-08-28 19:05:00.199997008',
               '2020-08-28 19:05:00.266663674',
               '2020-08-28 19:05:00.333330340',
               ...
               '2020-08-29 01:59:50.666414770',
               '2020-08-29 01:59:50.733081436',
               '2020-08-29 01:59:50.799748102',
               '2020-08-29 01:59:50.866414768',
               '2020-08-29 01:59:50.933081434'],
              dtype='datetime64[ns]', name='Date_and_time', length=373364, freq='66666666N')
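
For anyone who wants to reproduce this without my file, an index with the same spacing can be sketched as below. This is only a stand-in: the name `dfB1_demo` and the random `MP` values are assumptions for illustration, not my real measurements. The start timestamp, step and row count are taken from the output above.

```python
import numpy as np
import pandas as pd

# First timestamp and row count copied from the dfB1 output above.
start = pd.Timestamp("2020-08-28 19:05:00.066663676")
n_rows = 373364

# 15 Hz sampling: 1 s / 15, truncated to whole nanoseconds.
step = pd.Timedelta(10**9 // 15, unit="ns")

idx = pd.date_range(start, periods=n_rows, freq=step, name="Date_and_time")

# Hypothetical MP values (random), only so there is something to plot.
rng = np.random.default_rng(0)
dfB1_demo = pd.DataFrame(
    {"MP": rng.integers(70, 1500, size=n_rows).astype(float)},
    index=idx,
)
```

The first and last timestamps of `idx` should line up with the ones shown in the index printout, since both are built from the same start, step and length.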

I've tried plotting with seaborn and I do get a result, but the plot is not interactive and it renders very poorly. I am familiar with plotly, but I can't find a way to use it here. The per-minute plot is also completely wrong: I only get 59 points on the x-axis. What should I do to make the plots interactive and to get one boxplot per minute (or per 5 minutes)? I've also read and tried the functions described here: Box plot of hourly data in Time Series Python

import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(figsize=(15,5))
# one box per hour-of-day value of the index
sns.boxplot(x=dfB1.index.hour, y=dfB1['MP'], ax=ax)

1 Answer

0
votes

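.hour returns only the hour of the day, so timestamps from different days collapse into the same bin: both 2020-01-01 00:00 and 2020-01-10 00:00 give 0. The same applies to .minute, which only takes the values 0 to 59 no matter how many hours the data spans. I think you want .floor, which rounds each timestamp down to the start of its bin while keeping the date and hour: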

sns.boxplot(x=dfB1.index.floor('h'), y=dfB1['MP'], ax=ax)

and also:

sns.boxplot(x=dfB1.index.floor('5min'), y=dfB1['MP'], ax=ax)
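
To see the difference without plotting, here is a minimal sketch on synthetic data. It assumes a 1 Hz index over roughly the same time span as in the question, with random `MP` values; the frame `df` is a stand-in, not the asker's `dfB1`. It counts how many distinct x-axis labels each approach produces and computes the per-bin statistics that a boxplot is drawn from.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for dfB1: 1 Hz instead of 15 Hz, random values.
idx = pd.date_range(
    "2020-08-28 19:05:00", "2020-08-29 01:59:50",
    freq="1s", name="Date_and_time",
)
rng = np.random.default_rng(0)
df = pd.DataFrame({"MP": rng.normal(100.0, 10.0, len(idx))}, index=idx)

# .minute keeps only the minute-of-hour, so bins from different hours merge.
n_minute_labels = df.index.minute.nunique()

# .floor keeps the full timestamp, so every 1- or 5-minute bin stays distinct.
n_1min_bins = df.index.floor("min").nunique()
n_5min_bins = df.index.floor("5min").nunique()

# Per-bin summary (includes min, quartiles and max) for each 5-minute bin.
stats = df.groupby(df.index.floor("5min"))["MP"].describe()

print(n_minute_labels, n_1min_bins, n_5min_bins)
print(stats[["min", "25%", "50%", "75%", "max"]].head())
```

With .minute the number of labels can never exceed 60, whereas the floored index yields one label per actual time bin. The floored values can be passed as `x` to a boxplot call exactly as in the lines above.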