1
votes

I have a netcdf file with float values representing chlorophyll concentration at latitudes and longitudes. I am trying to draw a line between two sets of lats/lons and return all chlorophyll values from points on the line.

I'm approaching it from a geometry point of view: for points (x1, y1) and (x2, y2), find the slope and intercept of the line and return all values of x for given values of y on the line. Once I have all x and y values (longitude and latitude) I hope to input those into the xarray select method to return the chlorophyll concentration.

ds = '~/apr1.nc'
ds = xarray.open_dataset(ds, decode_times=False)

x1, y1 = [34.3282, 32.4791]
x2, y2 = [34.7, 32.21]

slope = (y2 - y1) / (x2 - x1)
intercept = y1 - (slope * x1)

line_lons = np.arange(x1, x2, step)
line_lats = [slope * x + intercept for x in lons]
values = ds.CHL.sel(lat=line_lats, lon=line_lons, method='nearest')
ds.values

>>> [0.0908799 , 0.06634101, 0.07615771, 0.16289435],
            [0.06787204, 0.07480557, 0.0655338 , 0.06064864],
            [0.06352911, 0.06586582, 0.06702182, 0.10024723],
            [0.0789495 , 0.07035938, 0.07455409, 0.08405576]]], dtype=float32)

line_lons
>>> array([34.3282, 34.4282, 34.5282, 34.6282])

I want to create a plot with longitudes on the x axis, and values on the y axis. The problem is that the ds.values command returns an numpy data array with a shape of (1, 4, 4) while the longitudes are only 4. There are way more values in the returned array.

plt.plot(line_lons, chlvalues.values)

Any idea why that is and how I can return one value for one input?

Thanks.

1

1 Answers

1
votes

I assume it is because by default your output was taken from box instead of along a selected transect.

I propose a more complex solution with Numpy and netCDF4, where you first make the transect with random coordinates and then turn these random coordinates into the closest unique coordinates from input file (unique = so that each point along the transect is encounted only once).

Afterwards, when you know your output coordinates, you have 2 possibilities how to take out data along transect:

a) you find the indices of the corresponding coordinates b) interpolate original data to those coordinates (either nearest or bi-linear method)

Here is the code:

#!/usr/bin/env ipython
# --------------------------------------------------------------------------------------------------------------
import numpy as np
from netCDF4 import Dataset
# -----------------------------
# coordinates:
x1, y1 = [10., 55.]
x2, y2 = [20., 58.]
# --------------------------------
# ==============================================================================================================
# create some test data:
nx,ny = 100,100
dataout = np.random.random((ny,nx));
# -------------------------------
lonout=np.linspace(9.,30.,nx);
latout=np.linspace(54.,66.,ny);
# make data:
ncout=Dataset('test.nc','w','NETCDF3_CLASSIC');
ncout.createDimension('lon',nx);
ncout.createDimension('lat',ny);
ncout.createDimension('time',None);
ncout.createVariable('lon','float64',('lon'));ncout.variables['lon'][:]=lonout;
ncout.createVariable('lat','float64',('lat'));ncout.variables['lat'][:]=latout;
ncout.createVariable('var','float32',('lat','lon'));ncout.variables['var'][:]=dataout;
ncout.close()
#=================================================================================================================
# CUT THE  DATA FROM FILE:
# make some arbitrary line between start-end point, later let us convert it to indices:
coords=np.linspace(x1+1j*y1,x2+1j*y2,1000);
xo=np.real(coords);yo=np.imag(coords);
# ------------------------------------------------------
# get transect:
ncin = Dataset('test.nc');
lonin=ncin.variables['lon'][:];
latin=ncin.variables['lat'][:];
# ------------------------------------------------------
# get the transect indices:
rxo=np.array([np.squeeze(np.min(lonout[np.where(np.abs(lonout-val)==np.abs(lonout-val).min())])) for val in xo]);
ryo=np.array([np.squeeze(np.min(latout[np.where(np.abs(latout-val)==np.abs(latout-val).min())])) for val in yo]);
rcoords=np.unique(rxo+1j*ryo);
rxo=np.real(rcoords);ryo=np.imag(rcoords);
# ------------------------------------------------------
ixo=[int(np.squeeze(np.where(lonin==val))) for val in rxo];
jxo=[int(np.squeeze(np.where(latin==val))) for val in ryo];
# ------------------------------------------------------
# get var data along transect:
trans_data=np.array([ncin.variables['var'][jxo[ii],ixo[ii]] for ii in range(len(ixo))]);
# ------------------------------------------------------
ncin.close()
# ================================================================================================================
# Another solution using interpolation, when we already know the target coordinates (original coordinates along the transect):
from scipy.interpolate import griddata
ncin = Dataset('test.nc');
lonin=ncin.variables['lon'][:];
latin=ncin.variables['lat'][:];
varin=ncin.variables['var'][:];
ncin.close()
# ----------------------------------------------------------------------------------------------------------------
lonm,latm = np.meshgrid(lonin,latin);
trans_data_b=griddata((lonm.flatten(),latm.flatten()),varin.flatten(),(rxo,ryo),'nearest')