I have a netCDF file of seasonal data. When loaded into an xarray Dataset, it has season, latitude and longitude dimensions:

print(dataset_seasonal_nc)


<xarray.Dataset>
Dimensions:               (latitude: 106, longitude: 193, season: 4)
Coordinates:
  * latitude              (latitude) float32 -39.2 -39.149525 ... -33.9
  * longitude             (longitude) float32 140.8 140.84792 ... 150.0
  * season                (season) object 'DJF' 'JJA' 'MAM' 'SON'
Data variables:
    FFDI 95TH PERCENTILE  (season, latitude, longitude) float64 dask.array<shape=(4, 106, 193), chunksize=(4, 106, 193)>

I need to upsample the seasonal data to daily data for 10 years (for example, from 1972 to 1981, 3653 days in total). The upsampled Dataset should look like this:

<xarray.Dataset>
Dimensions:    (latitude: 106, longitude: 193, time: 3653)
Coordinates:
  * latitude   (latitude) float32 -39.2 -39.149525 ... -33.950478 -33.9
  * longitude  (longitude) float32 140.8 140.84792 140.89584 ... 149.95209 150.0
  * time       (time) datetime64[ns] 1972-01-01T00:00:00 1972-01-02T00:00:00 1972-01-03T00:00:00 ... 1981-12-30T00:00:00 1981-12-31T00:00:00
Data variables:
    FFDI 95TH PERCENTILE  (time, latitude, longitude) float64 dask.array<shape=(3653, 106, 193), chunksize=(3653, 106, 193)>

The value for each day should equal the value for the season that the day falls in. For example, 1972-01-01, 1972-02-02 and 1972-02-28 should all have the DJF value, and 1972-04-01, 1972-05-02 and 1972-05-31 should all have the MAM value.
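To make the intended mapping concrete, here is a minimal standard-library sketch of the day-to-season rule described above. The names `MONTH_TO_SEASON` and `season_of` are hypothetical helpers introduced only for illustration; they are not part of xarray or of my actual code.

```python
from datetime import date, timedelta

# Meteorological seasons keyed by month number, using the dataset's labels.
MONTH_TO_SEASON = {
    12: "DJF", 1: "DJF", 2: "DJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "JJA", 7: "JJA", 8: "JJA",
    9: "SON", 10: "SON", 11: "SON",
}


def season_of(day):
    """Return the season label for a datetime.date (hypothetical helper)."""
    return MONTH_TO_SEASON[day.month]


# Every day from 1972-01-01 through 1981-12-31, inclusive.
start, end = date(1972, 1, 1), date(1981, 12, 31)
days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
day_seasons = [season_of(d) for d in days]

print(len(days))
print(season_of(date(1972, 2, 28)), season_of(date(1972, 5, 31)))
```

Each daily slice of the upsampled Dataset should then hold the seasonal field selected by the matching label in `day_seasons`.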

I was trying to use the Dataset's resample function:

upsampled = dataset_seasonal_nc.resample(time='D').ffill()

But this gave me the following error:

...\venv\lib\site-packages\xarray\core\dataset.py", line 896, in _construct_dataarray
    variable = self._variables[name]
KeyError: 'time'
This doesn't look like a pandas dataframe to me. Furthermore you should post some executable data snippet with which we can work. And where is your problem? Which step can't be executed? At which step do you need help? And most important: What have you tried so far? - JE_Muc
This is an xarray Dataset, since the netCDF file was loaded with xarray. I don't know where to start. - alextc
@Scotty1- I don't think the OP ever mentioned pandas. - FChm
@FChm It is in the tags... And resampling/upsampling is commonly done in pandas, especially when working with netCDF files. - JE_Muc
I think xarray's resample function is a wrapper around the pandas capabilities. I would like to do this with xarray. - alextc

1 Answer

This seems like a good candidate for xarray's advanced label-based indexing. I think something like the following should work:

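import pandas as pd
import xarray as xr

# Daily timestamps from 1972-01-01 through 1981-12-31 (3653 days).
# Explicit end points avoid the `closed` argument, which pandas 2.0 removed.
times = pd.date_range('1972-01-01', '1981-12-31', freq='D')
time = xr.DataArray(times, [('time', times)])
# Look up each day's season label in the seasonal dataset.
upsampled = dataset_seasonal_nc.sel(season=time.dt.season)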

Here time.dt.season is a DataArray representing the season labels associated with each time in your upsampled Dataset:

In [16]: time.dt.season
Out[16]:
<xarray.DataArray 'season' (time: 3653)>
array(['DJF', 'DJF', 'DJF', ..., 'DJF', 'DJF', 'DJF'],
      dtype='|S3')
Coordinates:
  * time     (time) datetime64[ns] 1972-01-01 1972-01-02 1972-01-03 ...
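
If you want to check the approach before running it on the real file, here is a small sketch on synthetic data. The 2x3 grid, the coordinate values, and the array contents are made up for illustration; only the variable name and the season labels come from your Dataset.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic stand-in for the seasonal dataset (a 2x3 grid, not 106x193).
seasonal = xr.Dataset(
    {
        "FFDI 95TH PERCENTILE": (
            ("season", "latitude", "longitude"),
            np.arange(24, dtype="float64").reshape(4, 2, 3),
        )
    },
    coords={
        "season": ["DJF", "JJA", "MAM", "SON"],
        "latitude": [-39.2, -33.9],
        "longitude": [140.8, 145.0, 150.0],
    },
)

# Daily time axis, then select the seasonal field by each day's season label.
times = pd.date_range("1972-01-01", "1981-12-31", freq="D")
time = xr.DataArray(times, coords=[("time", times)])
upsampled = seasonal.sel(season=time.dt.season)

daily = upsampled["FFDI 95TH PERCENTILE"]
djf = seasonal["FFDI 95TH PERCENTILE"].sel(season="DJF")
mam = seasonal["FFDI 95TH PERCENTILE"].sel(season="MAM")

# Compare a late-February day with DJF and a late-May day with MAM.
feb_day = np.squeeze(daily.sel(time="1972-02-28").values)
may_day = np.squeeze(daily.sel(time="1972-05-31").values)

print(daily.sizes["time"])
print(np.array_equal(feb_day, djf.values), np.array_equal(may_day, mam.values))
```

The result also carries `season` along as a non-index coordinate on the `time` dimension, which you can drop if you don't need it.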