Working with unevenly spaced time series data

Question

I am working with a data set that has a timestamp, event duration, and mean value. I would like to resample the data into 15s and 60s intervals. The problem is the timestamps are unevenly spaced.

This is what I've got so far:

from datetime import datetime
import pandas as pd

df = pd.DataFrame([dict(length=pd.to_timedelta(30, unit='s'), value=10),
                   dict(length=pd.to_timedelta(90, unit='s'), value=30),
                   dict(length=pd.to_timedelta(180, unit='s'), value=60),
                   dict(length=pd.to_timedelta(30, unit='s'), value=10)],
                  index=[datetime(2000, 1, 1),
                         datetime(2000, 1, 1, 0, 0, 30),
                         datetime(2000, 1, 1, 0, 3, 0),
                         datetime(2000, 1, 1, 0, 6, 0)])
print(df.resample('30s').mean())

Sample output:

timestamp           value
2000-01-01 00:00:00 10.0
2000-01-01 00:00:30 30.0
2000-01-01 00:01:00 NaN
...

Corrected: my desired output would be:

print(df.resample('15s').mean())

timestamp           value
2000-01-01 00:00:00 5.0
2000-01-01 00:00:15 5.0
2000-01-01 00:00:30 5.0
2000-01-01 00:00:45 5.0
2000-01-01 00:01:00 5.0
...


print(df.resample('60s').mean())

timestamp           value
2000-01-01 00:00:00 20.0
2000-01-01 00:01:00 20.0
2000-01-01 00:02:00 20.0
...

One idea I had was to manually upsample the data, creating a record in the series for each second, but this seems extremely inefficient. Any tips would be appreciated.

your example output doesn't make any sense. i'm confused why you say you'd like to resample to 15-sec and 60-sec intervals, but then resample and show output for a 30-sec interval. — Paul H
@PaulH thanks for catching an error in my question. I have updated the output section — Steven Bayer

Laura · Accepted Answer · 2019-02-05T16:28:36

If you want the value per unit of time, divide the value by the event duration first, then resample and forward-fill.

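interval = 30  # target bin width in seconds
# Scale each event's value to the share that falls within one interval.
df['mean_value'] = df['value'] / (df['length'].dt.total_seconds() / interval)
# Upsample and forward-fill; ffill() replaces the older pad() alias, which newer pandas no longer provides.
result = df['mean_value'].resample(f'{interval}s').ffill()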
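
As a rough check against the question's sample data, the same idea can be wrapped in a small helper and run at both requested intervals. This is only a sketch: `resample_rate` is a hypothetical name (not a pandas function), and it assumes `length` is a timedelta column and that each event's value is spread evenly over its duration.

```python
from datetime import datetime

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    [dict(length=pd.to_timedelta(30, unit='s'), value=10),
     dict(length=pd.to_timedelta(90, unit='s'), value=30),
     dict(length=pd.to_timedelta(180, unit='s'), value=60),
     dict(length=pd.to_timedelta(30, unit='s'), value=10)],
    index=[datetime(2000, 1, 1),
           datetime(2000, 1, 1, 0, 0, 30),
           datetime(2000, 1, 1, 0, 3, 0),
           datetime(2000, 1, 1, 0, 6, 0)],
)


def resample_rate(frame, interval):
    """Hypothetical helper: value per `interval` seconds, forward-filled.

    Assumes each event's value is spread evenly over its `length`.
    """
    # Number of target intervals that fit into each event's duration.
    intervals_per_event = frame['length'].dt.total_seconds() / interval
    # Share of each event's value that belongs to a single interval.
    per_interval = frame['value'] / intervals_per_event
    # Place the values on a regular grid, carrying the last known rate forward.
    return per_interval.resample(f'{interval}s').ffill()


r15 = resample_rate(df, 15)
r60 = resample_rate(df, 60)
print(r15.head())
print(r60.head())
```

With this data every 15 s bin comes out as 5.0 and every 60 s bin as 20.0, which matches the desired output in the question. Forward-filling also fills the idle minute between 00:02:00 and 00:03:00 with the previous event's rate, and the result stops at the last timestamp (00:06:00), so treat this as a starting point if gaps or the final event's duration matter.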
