3
votes

I have a netcdf file that uses multidimensional coordinates. My xarray dataset looks like this

<xarray.Dataset>
Dimensions:           (Time: 48, bottom_top: 50, bottom_top_stag: 51, 
soil_layers_stag: 4, south_north: 1015, south_north_stag: 1016, west_east: 1359, west_east_stag: 1360)
Coordinates:
XLAT              (Time, south_north, west_east) float32 18.1363 18.1456 ...
XLAT_U            (Time, south_north, west_east_stag) float32 18.1316 ...
XLAT_V            (Time, south_north_stag, west_east) float32 18.1198 ...
XLONG             (Time, south_north, west_east) float32 -122.884 ...
XLONG_U           (Time, south_north, west_east_stag) float32 -122.901 ...
XLONG_V           (Time, south_north_stag, west_east) float32 -122.879 ...
  * Time              (Time) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 ...
  * south_north       (south_north) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...
  * west_east         (west_east) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...
  * bottom_top        (bottom_top) int64 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 ...

Data variables:
GRAUPEL_ACC_NC    (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
P                 (Time, south_north, west_east) float32 101112.0 ...
PREC_ACC_NC       (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
QVAPOR            (Time, south_north, west_east) float32 0.0120251 ...
SNOW_ACC_NC       (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...
TK                (Time, south_north, west_east) float32 295.372 295.367 ...
Z                 (Time, south_north, west_east) float32 0.0 0.0 0.0 0.0 ...

I am hoping to get a subset of the data through the physical coordinates (XLAT & XLONG). For example subsetting TK to get the gridpoints that fall within 49 to 55N and -125 to -115W.

Slicing the data does not work e.g. TK[782:898,179:409] because the gridpoints sliced do not follow the constant lines of latitude and longitude which I need.

There was an example using groupby.bins, however I have not been able to figure it out at all. I also attempted using where to mask the values outside of my domain, with no success.

If anyone has any suggestions, that would be greatly appreciated!

1

1 Answers

7
votes

This is a perfect use-case for where with drop=True. Something like the following should work:

ds.where((-125 < ds.XLON) & (ds.XLON < -115)
         & (49 < ds.XLAT) & (ds.XLAT < 55), drop=True)

where should work regardless, but one other concern with your dataset is that your spatial coordinates (XLON and XLAT) include "Time" as a dimension. Do these variables really vary over time? If not, you may want to adjust them to remove the time dimension.