I want to create a NetCDF file with xarray and am trying to understand the 'Creating a Dataset' section of the xarray documentation.

Here is the code from that example, with a call added to save the Dataset to NetCDF:

import xarray as xr
import numpy as np
import pandas as pd

temp = 15 + 8 * np.random.randn(2, 2, 3)
precip = 10 * np.random.rand(2, 2, 3)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]

ds = xr.Dataset({'temperature': (['x', 'y', 'time'],  temp),
                 'precipitation': (['x', 'y', 'time'], precip)},
                coords={'lon': (['x', 'y'], lon),
                        'lat': (['x', 'y'], lat), 
                        'time': pd.date_range('2014-09-06', periods=3),
                        'reference_time': pd.Timestamp('2014-09-05')})

ds.to_netcdf('C:\\Temp\\test.nc')

From the example above I would expect a NetCDF file with two variables (temperature, precipitation), each with three dimensions (x, y, time). I expect the dimension sizes to be 2 in the x direction, 2 in the y direction and 3 in the time direction. According to @Bart's test (see comments), this is indeed what the NetCDF file contains. So, when opening the 'temperature' or 'precipitation' variable in QGIS 3.4 (EPSG 4326), I would expect the data to be:

  • made of a 2x2 grid ('x-y' direction)
  • with 3 time steps ('time' or 'z'-direction)
  • located at the latitude longitude used for creating the Dataset

Instead the data is:

  • made of a 2x3 grid
  • with 2 bands (instead of the three time steps)
  • located at lat/lon 0/0, with each cell 1 degree in size

In the visualization of the NetCDF, the red squares mark the six cells (in shades of blue) of the temperature variable, the lat/lon of the upper-left point (0/0), and the two 'bands'.

Hence, it seems that QGIS interprets the NetCDF file differently from how xarray writes it. Can someone edit the example so that it produces a 2x2 grid in the 'correct' location in QGIS, or point me to a simple example that creates a correctly georeferenced NetCDF from an xarray Dataset?

I don't understand what you mean by the "2x3 grid", "with 2 bands", "with 3 time dimensions", ... You only create 1 time dimension, so why would you expect 3? You create variables with 3 dimensions (x, y, time), so why would you expect to get a "2x2 grid"? Can you describe what your resulting NetCDF file should look like in terms of variables, dimensions and dimension sizes? - Bart
Sorry, my usage of 'dimension' was not correct. I edited the question. Indeed, I expect only one time dimension, but with three time steps ('time': pd.date_range('2014-09-06', periods=3)). I expect a "2x2 grid" because 'temp' is created through np.random.randn(2, 2, 3) and the xr.Dataset is created through 'temperature': (['x', 'y', 'time'], temp). So, I expect 'x' to have size 2, 'y' size 2 and 'time' size 3. - openwater
I'm sorry, but I still don't understand. From your question: "From the example above I would expect to get a NetCDF with two variables (temperature, precipitation) with three dimensions (x, y, time). I expect the dimension sizes to be 2 in x-direction, 2 in y direction and 3 in time direction."; if I run your code, I get exactly that, with the variables on the lat/lon coordinates that you specify...? - Bart
An ncdump of the resulting NetCDF file: pastebin.com/wDRN9UB1 - Bart
Thanks @Bart. Would upvote your comments, but can't due to lack of reputation. - openwater
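
The dimension sizes discussed in these comments can also be checked in Python before anything is written to disk. The following is a minimal sketch that rebuilds only the temperature variable from the question and prints its layout. It also shows that lon and lat are 2D coordinates defined on (x, y), while x and y themselves carry no 1D coordinate values. Whether that is the reason QGIS places the grid at 0/0 is a guess, not something this thread confirms.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Same construction as in the question, reduced to one variable.
temp = 15 + 8 * np.random.randn(2, 2, 3)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]

ds = xr.Dataset(
    {'temperature': (['x', 'y', 'time'], temp)},
    coords={'lon': (['x', 'y'], lon),
            'lat': (['x', 'y'], lat),
            'time': pd.date_range('2014-09-06', periods=3)})

print(dict(ds.sizes))          # dimension name -> length
print(ds['temperature'].dims)  # order in which the dimensions are stored
print(ds['lon'].dims)          # lon is a 2D coordinate on (x, y)
print('x' in ds.coords)        # does x have a 1D coordinate of its own?
```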

1 Answer

To make it work, I had to reorder the dimensions to (time, y, x) and replace the nested 2D lat/lon lists with 1D coordinate lists for x and y. The following modification of the xarray example works:

import xarray as xr
import numpy as np
import pandas as pd

temp = 15 + 8 * np.random.randn(3, 3, 2)  # time, y, x
precip = 10 * np.random.rand(3, 3, 2)  # time, y, x
lat = [100, 101, 102]
lon = [10, 11]

ds = xr.Dataset({'temperature': (['time', 'y', 'x'],  temp),
                 'precipitation': (['time', 'y', 'x'], precip)},
                coords={'x': (['x'], lon),
                        'y': (['y'], lat),
                        'time': pd.date_range('2014-09-06', periods=3)})

ds.to_netcdf('C:\\Temp\\test_xarray.nc')

Please note that I changed the "2x2 grid" requested in my question to a "2x3 grid" (2 cells in x, 3 cells in y) to check that the order of x and y matters when creating the NetCDF.
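
If the data already exists in (x, y, time) order, it does not have to be regenerated: the dimensions can be reordered on the Dataset itself. The snippet below is a small sketch of that idea using `Dataset.transpose`. The variable names `ds_xyt` and `ds_tyx` and the coordinate values are illustrative assumptions, and nothing here has been checked against QGIS.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Data stored in (x, y, time) order on a grid with 2 cells in x and 3 in y.
temp = 15 + 8 * np.random.randn(2, 3, 3)

ds_xyt = xr.Dataset(
    {'temperature': (['x', 'y', 'time'], temp)},
    coords={'x': [10, 11],         # 1D, one value per column (illustrative)
            'y': [100, 101, 102],  # 1D, one value per row (illustrative)
            'time': pd.date_range('2014-09-06', periods=3)})

# Reorder the dimensions to (time, y, x); the values themselves are unchanged.
ds_tyx = ds_xyt.transpose('time', 'y', 'x')

print(ds_tyx['temperature'].dims)
print(ds_tyx['temperature'].shape)
```

The reordered Dataset can then be saved with `to_netcdf` in the same way as above.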