19
votes

I need to interpolate temperature data linearly in 4 dimensions (latitude, longitude, altitude and time).
The number of points is fairly high (360x720x50x8) and I need a fast method of computing the temperature at any point in space and time within the data bounds.

I have tried using scipy.interpolate.LinearNDInterpolator but using Qhull for triangulation is inefficient on a rectangular grid and takes hours to complete.

By reading this SciPy ticket, the solution seemed to be implementing a new nd interpolator using the standard interp1d to calculate a higher number of data points, and then use a "nearest neighbor" approach with the new dataset.

This, however, takes a long time again (minutes).

Is there a quick way of interpolating data on a rectangular grid in 4 dimensions without it taking minutes to accomplish?

I thought of using interp1d 4 times without calculating a higher density of points, but leaving it for the user to call with the coordinates, but I can't get my head around how to do this.

Otherwise would writing my own 4D interpolator specific to my needs be an option here?

Here's the code I've been using to test this:

Using scipy.interpolate.LinearNDInterpolator:

import numpy as np
from scipy.interpolate import LinearNDInterpolator

lats = np.arange(-90,90.5,0.5)
lons = np.arange(-180,180,0.5)
alts = np.arange(1,1000,21.717)
time = np.arange(8)
data = np.random.rand(len(lats)*len(lons)*len(alts)*len(time)).reshape((len(lats),len(lons),len(alts),len(time)))

coords = np.zeros((len(lats),len(lons),len(alts),len(time),4))
coords[...,0] = lats.reshape((len(lats),1,1,1))
coords[...,1] = lons.reshape((1,len(lons),1,1))
coords[...,2] = alts.reshape((1,1,len(alts),1))
coords[...,3] = time.reshape((1,1,1,len(time)))
coords = coords.reshape((data.size,4))

interpolatedData = LinearNDInterpolator(coords,data)

Using scipy.interpolate.interp1d:

import numpy as np
from scipy.interpolate import LinearNDInterpolator

lats = np.arange(-90,90.5,0.5)
lons = np.arange(-180,180,0.5)
alts = np.arange(1,1000,21.717)
time = np.arange(8)
data = np.random.rand(len(lats)*len(lons)*len(alts)*len(time)).reshape((len(lats),len(lons),len(alts),len(time)))

interpolatedData = np.array([None, None, None, None])
interpolatedData[0] = interp1d(lats,data,axis=0)
interpolatedData[1] = interp1d(lons,data,axis=1)
interpolatedData[2] = interp1d(alts,data,axis=2)
interpolatedData[3] = interp1d(time,data,axis=3)

Thank you very much for your help!

4
I have no idea about python, but you should look into quadrilinear or quadricubic interpolation.John Dvorak
Do you have any ideas in any other language about this? Thanksnzapponi
Implementing a quadrilinear interpolation for a single point should be fairly easy in any language.John Dvorak
no they aren't. 3 dimensions are constant (.5, .5, 1), but the altitude dimension is not always sampled at the same point hence the grid is not regular on that axis.nzapponi
Not exactly. I have temperature data on a regular grid of lats, lons, PRESSURES and time. However, I my goal is to have a function I can call this way: getTemperature(lat,lon,ALT,time). What I do have is the ALTITUDE data on the same grid as the temperature, so the same regular grid of lats,lons,pressures,time. Right now, my implementation of the method looks up the pressure indices of the two closest altitude values for the given lat,lon,time. Then it uses those altitude values and the two pressure indices to build up the 4D hypertetrahedron in the temperature matrix and interpolate...nzapponi

4 Answers

13
votes

In the same ticket you have linked, there is an example implementation of what they call tensor product interpolation, showing the proper way to nest recursive calls to interp1d. This is equivalent to quadrilinear interpolation if you choose the default kind='linear' parameter for your interp1d's.

While this may be good enough, this is not linear interpolation, and there will be higher order terms in the interpolation function, as this image from the wikipedia entry on bilinear interpolation shows:

enter image description here

This may very well be good enough for what you are after, but there are applications where a triangulated, really piecewise linear, interpoaltion is preferred. If you really need this, there is an easy way of working around the slowness of qhull.

Once LinearNDInterpolator has been setup, there are two steps to coming up with an interpolated value for a given point:

  1. figure out inside which triangle (4D hypertetrahedron in your case) the point is, and
  2. interpolate using the barycentric coordinates of the point relative to the vertices as weights.

You probably do not want to mess with barycentric coordinates, so better leave that to LinearNDInterpolator. But you do know some things about the triangulation. Mostly that, because you have a regular grid, within each hypercube the triangulation is going to be the same. So to interpolate a single value, you could first determine in which subcube your point is, build a LinearNDInterpolator with the 16 vertices of that cube, and use it to interpolate your value:

from itertools import product

def interpolator(coords, data, point) :
    dims = len(point)
    indices = []
    sub_coords = []
    for j in xrange(dims) :
        idx = np.digitize([point[j]], coords[j])[0]
        indices += [[idx - 1, idx]]
        sub_coords += [coords[j][indices[-1]]]
    indices = np.array([j for j in product(*indices)])
    sub_coords = np.array([j for j in product(*sub_coords)])
    sub_data = data[list(np.swapaxes(indices, 0, 1))]
    li = LinearNDInterpolator(sub_coords, sub_data)
    return li([point])[0]

>>> point = np.array([12.3,-4.2, 500.5, 2.5])
>>> interpolator((lats, lons, alts, time), data, point)
0.386082399091

This cannot work on vectorized data, since that would require storing a LinearNDInterpolator for every possible subcube, and even though it probably would be faster than triangulating the whole thing, it would still be very slow.

5
votes

scipy.ndimage.map_coordinates is a nice fast interpolator for uniform grids (all boxes the same size). See multivariate-spline-interpolation-in-python-scipy on SO for a clear description.

For non-uniform rectangular grids, a simple wrapper Intergrid maps / scales non-uniform to uniform grids, then does map_coordinates. On a 4d test case like yours it takes about 1 μsec per query:

Intergrid: 1000000 points in a (361, 720, 47, 8) grid took 652 msec
2
votes

For very similar things I use Scientific.Functions.Interpolation.InterpolatingFunction.

    import numpy as np
    from Scientific.Functions.Interpolation import InterpolatingFunction

    lats = np.arange(-90,90.5,0.5)
    lons = np.arange(-180,180,0.5)
    alts = np.arange(1,1000,21.717)
    time = np.arange(8)
    data = np.random.rand(len(lats)*len(lons)*len(alts)*len(time)).reshape((len(lats),len(lons),len(alts),len(time)))

    axes = (lats, lons, alts, time)
    f = InterpolatingFunction(axes, data)

You can now leave it to the user to call the InterpolatingFunction with coordinates:

>>> f(0,0,10,3)
0.7085675631375401

InterpolatingFunction has nice additional features, such as integration and slicing.

However, I do not know for sure whether the interpolation is linear. You would have to look in the module source to find out.

0
votes

I can not open this address, and find enough informations about this package