I'm working on an algorithm to match two kinds of objects (let's say balls and buckets). Each object is modeled as a 1D numpy array with 4 elements, and all objects of one kind are stacked into a 2D array. My method computes the difference for every (ball, bucket) pair and applies a similarity function to each difference.

I'm trying to avoid for loops because speed matters a lot for what I'm doing, so I build those differences by reshaping one of the input arrays and letting numpy broadcasting produce a 3D array (diff_map). I haven't found a good tutorial on this, so I'd like to know whether there is a more "proper" way to do it. I'd also like to see any good references on this kind of operation (multidimensional reshape and broadcast), if possible.

My code:

import numpy as np

balls = np.random.rand(3,4)
buckets = np.random.rand(6,4)
buckets = buckets.reshape(len(buckets), 1, 4)
buckets
array([[[ 0.38382622,  0.27114067,  0.63856317,  0.51360638]],

   [[ 0.08709269,  0.21659216,  0.31148519,  0.99143705]],

   [[ 0.03659845,  0.78305241,  0.87699971,  0.78447545]],

   [[ 0.11652137,  0.49490129,  0.76382286,  0.90313785]],

   [[ 0.62681395,  0.10125169,  0.61131263,  0.15643676]],

   [[ 0.97072113,  0.56535597,  0.39471204,  0.24798229]]])

diff_map = buckets-balls  # same operand order as the for loop below
diff_map.shape
(6, 3, 4)
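
The (6, 3, 4) shape follows from how broadcasting aligns shapes starting at the last axis: (3, 4) is treated as (1, 3, 4) and stretched against (6, 1, 4). A small sketch to check this, assuming NumPy 1.20 or later (needed for np.broadcast_shapes); it rebuilds the arrays so it can run on its own:

```python
import numpy as np

balls = np.random.rand(3, 4)
buckets = np.random.rand(6, 4)

# Shapes are compared from the last axis backwards; an axis of length 1
# (or a missing leading axis) is stretched to match the other operand.
result_shape = np.broadcast_shapes(buckets[:, np.newaxis, :].shape, balls.shape)
print(result_shape)  # (6, 3, 4)

# reshape(len(buckets), 1, 4) and np.newaxis indexing produce the same shape.
same = buckets.reshape(len(buckets), 1, 4).shape == buckets[:, np.newaxis, :].shape
print(same)  # True
```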

For Loop

As requested, this is the for loop I'm trying to avoid:

diff_map_for = np.zeros((len(buckets), len(balls), 4))
for i in range(len(buckets)):
    for j in range(len(balls)):
        diff_map_for[i, j] = buckets[i]-balls[j]

Just to be sure, let's compare the two results:

np.all(diff_map == diff_map_for)
True
Could you share your loopy code? We know you are trying to avoid it, but it would be easier to understand what you want by looking at it. - Divakar
So, what exactly is the question again? Are you asking why the reshaped version, balls-buckets, works? Look into the broadcasting docs, as mentioned earlier as well. - Divakar

1 Answer

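Does this work for you? Indexing with np.newaxis inserts the length-1 axes explicitly, which makes the intended broadcast easier to read than a reshape: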

import numpy as np

balls = np.random.rand(3,4)
buckets = np.random.rand(6,4)

diff_map = buckets[:, np.newaxis, :] - balls[np.newaxis, :, :]
print(diff_map.shape)
# output: (6, 3, 4)

# ... compared to for loop
diff_map_for = np.zeros((len(buckets), len(balls), 4))
for i in range(len(buckets)):
    for j in range(len(balls)):
        diff_map_for[i, j] = buckets[i] - balls[j]

# compare element-wise; a plain sum of differences could hide errors that cancel out
print(np.allclose(diff_map, diff_map_for))
# output: True
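
From here, the similarity step mentioned in the question can also be applied along the last axis without a loop. A minimal sketch, assuming a Gaussian-style similarity on the Euclidean norm of each difference; the similarity function and its sigma parameter are placeholders of mine, not the asker's actual function:

```python
import numpy as np

def similarity(diff, sigma=1.0):
    # Hypothetical similarity: close to 1 for small differences, close to 0 for large ones.
    dist = np.linalg.norm(diff, axis=-1)
    return np.exp(-(dist ** 2) / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
balls = rng.random((3, 4))
buckets = rng.random((6, 4))

diff_map = buckets[:, np.newaxis, :] - balls[np.newaxis, :, :]  # shape (6, 3, 4)
sim_map = similarity(diff_map)                                  # shape (6, 3)

# For each ball (column), pick the bucket (row) with the highest similarity.
best_bucket = sim_map.argmax(axis=0)                            # shape (3,)
print(sim_map.shape, best_bucket.shape)
```

Any function that reduces over axis=-1 can be swapped in for similarity; the argmax over axis 0 is only one possible matching rule and does not prevent two balls from choosing the same bucket.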