I am trying to resample my data annually, but I am struggling to set the day on which each resampling period starts.

import xarray as xr
import numpy as np
import pandas as pd

da = xr.DataArray(
    np.linspace(0, 11, num=36),
    coords=[
        pd.date_range(
            "1999-12-15", periods=36,
        )
    ],
    dims="time",
)
da.resample(time="1Y").mean()

What I am trying to achieve is to get the means of the following periods: 15/12/1999-15/12/2000, 15/12/2000-15/12/2001, 15/12/2001-15/12/2002, ...

1 Answer

I have solved it by shifting the time axis so that the data starts on the first day of its month, resampling with the corresponding pandas anchored offset, and then shifting the time labels back afterwards.

import xarray as xr
import numpy as np
import pandas as pd

da = xr.DataArray(
    np.concatenate([np.zeros(365), np.ones(365)]),
    coords=[
        pd.date_range(
            "06/15/2017", "06/14/2019", freq='D'
        )
    ],
    dims="time",
)

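# Shift the time axis back so that the first timestamp falls on the first of its month.
days_to_first_of_month = pd.Timedelta(days=int(da.time.dt.day[0]) - 1)
da['time'] = da.time - days_to_first_of_month
# Anchor the annual bins at the start of that month (e.g. 'AS-JUN').
month = da.time.dt.strftime("%b")[0].item().upper()
resampled = da.resample(time=f'AS-{month}').sum()
# Undo the shift so that the bin labels carry the original start day.
resampled['time'] = resampled.time + days_to_first_of_month
print(resampled)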

Is there a more efficient or cleaner way?
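
One possible alternative, offered only as a sketch: instead of shifting the time axis, label every timestamp with the calendar year in which its custom year began and group on that label. The helper name `custom_year_label` and the coordinate name `period_start_year` are made up for this illustration; the only xarray features assumed are the `.dt` accessor and `groupby` on a named DataArray.

```python
import numpy as np
import pandas as pd
import xarray as xr


def custom_year_label(time, start_month, start_day):
    """Label each timestamp with the calendar year in which its custom year started.

    `custom_year_label` is a hypothetical helper, not part of xarray.
    """
    # Timestamps that fall before the (start_month, start_day) anniversary
    # still belong to the period that began in the previous calendar year.
    before_start = (time.dt.month < start_month) | (
        (time.dt.month == start_month) & (time.dt.day < start_day)
    )
    return (time.dt.year - before_start.astype(int)).rename("period_start_year")


# Same test data as above: 365 zeros followed by 365 ones, starting on 15 June.
da = xr.DataArray(
    np.concatenate([np.zeros(365), np.ones(365)]),
    coords=[pd.date_range("2017-06-15", "2019-06-14", freq="D")],
    dims="time",
)

labels = custom_year_label(da.time, start_month=6, start_day=15)
sums = da.groupby(labels).sum()
print(sums)
```

The result is indexed by an integer `period_start_year` rather than by timestamps, so this does not modify `da` and needs no shifting back, but the labels would have to be converted if a datetime coordinate is required downstream.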