3
votes

I'm trying to create a 2-dimensional array in Scipy/Numpy where each value represents the euclidean distance from the center. It's supposed to have the same shape as the first two dimensions of a 3-dimensional array (an image, created via scipy.misc.fromimage).

Here's one approach that works:

def get_distance_1(y, x):
    mid_x, mid_y = (scipy.array(image_array.shape[:2]) - 1) / float(2)
    return ((y - mid_y) ** 2 + (x - mid_x) ** 2) ** 0.5

distances = scipy.fromfunction(get_distance_1, image_array.shape[:2])

This method is fairly fast, but I'm very new to Scipy, and would like to know if there's a more elegant, idiomatic way of doing the same thing. I found the scipy.spatial.distance.cdist function, which seems promising, but I'm at a loss regarding how to fit it into this problem.

def get_distance_2(y, x):
    mid = ...  # needs to be a array of the shape (rows, cols, 2)?
    return scipy.spatial.distance.cdist(scipy.dstack((y, x)), mid)

Just to clarify, what I'm looking for is something like this (for a 6 x 6 array)

[[ 3.53553391  2.91547595  2.54950976  2.54950976  2.91547595  3.53553391]
 [ 2.91547595  2.12132034  1.58113883  1.58113883  2.12132034  2.91547595]
 [ 2.54950976  1.58113883  0.70710678  0.70710678  1.58113883  2.54950976]
 [ 2.54950976  1.58113883  0.70710678  0.70710678  1.58113883  2.54950976]
 [ 2.91547595  2.12132034  1.58113883  1.58113883  2.12132034  2.91547595]
 [ 3.53553391  2.91547595  2.54950976  2.54950976  2.91547595  3.53553391]]
2
Also check stackoverflow.com/questions/17527340/… (the efficiency of different methods of calculating distances changes when your arrays become very big - the memory starts to get important, and the solution that is fastest is not the obvious one) - read all the comments on the question and the accepted answer, for more clarity - usethedeathstar

2 Answers

2
votes

cdist is the right function. Given two sets of points X and Y, it returns the distance between x and y for all x in X and y in Y. In this case, one of those sets is a singleton:

>>> X = np.random.randn(10, 3)                              # random 3-d points
>>> center = np.random.randn(3)
>>> scipy.spatial.distance.cdist(X, np.atleast_2d(center))  # both must be 2-d
array([[ 2.72130005],
       [ 1.62765189],
       [ 1.14245608],
       [ 2.55279445],
       [ 2.43727709],
       [ 3.20647709],
       [ 1.65028127],
       [ 0.79044422],
       [ 1.8180881 ],
       [ 2.38094952]])

This is a 2-d array, so you might want to ravel it:

>>> scipy.spatial.distance.cdist(X, np.atleast_2d(center)).ravel()
array([ 2.72130005,  1.62765189,  1.14245608,  2.55279445,  2.43727709,
        3.20647709,  1.65028127,  0.79044422,  1.8180881 ,  2.38094952])
2
votes

I'd say the idiomatic way is to vectorize it.

The original function get_distance_1 is probably designed with scalar arguments (single numbers) in mind, but actually works on Numpy arrays as well, without modification. This means you can pass it arrays with the x- and y-indices and it will give the desired result.

import numpy as np

m, n = image_array.shape[:2]
x_inds = np.arange(m)
y_inds = np.arange(n)

distances = get_distance_1(x_inds[:,None], y_inds)

The indexing with None (equivalent to indexing with np.newaxis) adds an axis to the one-dimensional vector and effectively transposes it. This is necessary to engage broadcasting.

A bit shorter would be:

x_inds, y_inds = np.ogrid[:m, :n]

distances = get_distance_1(x_inds, y_inds)

Note you have to reverse the definition of x and y in get_distance_1 to get distances with respect to the center:

def get_distance_1(x, y):
    mid_x, mid_y = (scipy.array(image_array.shape[:2]) - 1) / float(2)
    return ((y - mid_y) ** 2 + (x - mid_x) ** 2) ** 0.5