9 votes

Given three NumPy arrays: a two-dimensional array x, a vector y with a trailing singleton dimension, and a vector z without one,

x = np.zeros((M,N))
y = np.zeros((M,1))
z = np.zeros((M,))

the behaviour of broadcast operations changes depending on vector representation and context:

x[:,0] = y      # ValueError: could not broadcast input array from shape (M,1)
                # into shape (M,)
x[:,0] = z      # OK

x[:,0] += y     # ValueError: non-broadcastable output operand with shape (M,)
                # doesn't match the broadcast shape (M,M)
x[:,0] += z     # OK

x - y           # OK
x - z           # ValueError: operands could not be broadcast together with
                # shapes (M,N) (M,)

I realize I can do the following:

x - z[:,None]   # OK

but I don't understand what this explicit notation buys me. It certainly doesn't buy readability. I don't understand why the expression x - y is OK, but x - z is ambiguous.

Why does NumPy treat vectors with and without trailing singleton dimensions differently?

edit: The documentation states that "two dimensions are compatible when they are equal, or one of them is 1", but y and z are both functionally M x 1 vectors, since an M x 0 vector wouldn't contain any elements.
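That rule is applied from the trailing dimension backwards, which is where the asymmetry comes from. A quick check with small illustrative sizes (M=3, N=4 here are arbitrary) makes the alignment visible:

```python
import numpy as np

M, N = 3, 4  # small illustrative sizes

# (M,1) against (M,N): trailing dims 1 vs N stretch, leading dims M vs M match.
print(np.broadcast_shapes((M, N), (M, 1)))   # (3, 4)

# (M,) against (M,N): z's only dimension is aligned with x's LAST axis
# (length N), so M vs N is incompatible unless M == N.
try:
    np.broadcast_shapes((M, N), (M,))
except ValueError as e:
    print("incompatible:", e)

# A bare (N,) vector, by contrast, broadcasts fine against (M,N).
print(np.broadcast_shapes((M, N), (N,)))     # (3, 4)
```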

A key point is that x[:,0] is 1-d. x[:,[0]] is (M,1). z[:] = y probably gives the same error. – hpaulj
z can be reshaped to (M,1) or (1,M) without copying. But functionally it is closer to (1,M), because of the automatic extension at the beginning. – hpaulj

3 Answers

4 votes

The convention is that broadcasting inserts singleton dimensions at the beginning of an array's shape. This makes it convenient to perform operations over the last dimension of an array, so (x.T - z).T should work.
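A quick sketch (with arbitrary M and N) confirms that the transpose trick produces the same result as the explicit-axis spelling from the question:

```python
import numpy as np

M, N = 3, 4
x = np.arange(M * N, dtype=float).reshape(M, N)
z = np.arange(M, dtype=float)

# Broadcasting matches z against the *last* axis of x.T (length M),
# so transposing in and out subtracts z down the columns of x.
a = (x.T - z).T
b = x - z[:, None]   # same result, written with an explicit axis
print(np.array_equal(a, b))   # True
```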

If broadcasting instead decided automatically which axis of x was matched by z, an operation like x - z would result in an error only when N == M (the ambiguous case), so bugs would surface only for particular input shapes, making code harder to test. The convention gives up a little convenience in exchange for robustness against that kind of error.

If you don't like the z[:, None] shorthand, perhaps you find z[:, np.newaxis] clearer.

For an assignment like x[:,0] = y to work, you can use x[:,0:1] = y instead.
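The point is that x[:,0] is 1-d (shape (M,)) while the slice x[:,0:1] keeps the column 2-d (shape (M,1)), so it matches y exactly; a quick check:

```python
import numpy as np

M, N = 3, 4
x = np.zeros((M, N))
y = np.ones((M, 1))

# x[:, 0] has shape (M,), but x[:, 0:1] has shape (M, 1), matching y exactly,
# so the assignment goes through without any broadcasting quirks.
x[:, 0:1] = y
print(x[:, 0])   # [1. 1. 1.]
```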

0 votes

Using the NumPy matrix interface instead of the array interface yields the desired broadcasting behaviour:

x = np.asmatrix(np.zeros((M,N)))
y = np.asmatrix(np.zeros((M,1)))

x[:,0] = y              # OK
x[:,0] = y[:,0]         # OK
x[:,0] = y[:,0:1]       # OK
x[:,0] += y             # OK
x - y                   # OK
x - np.mean(x, axis=0)  # OK
x - np.mean(x, axis=1)  # OK
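The reason this works is that matrix indexing and reductions always return 2-d matrices, so a column never collapses to shape (M,). A quick check (note that np.matrix is discouraged in current NumPy in favour of plain arrays):

```python
import numpy as np

M, N = 3, 4
x = np.asmatrix(np.zeros((M, N)))

# Matrix indexing never drops to 1-d: a single column is still (M, 1).
print(x[:, 0].shape)             # (3, 1)
print(np.mean(x, axis=1).shape)  # (3, 1) -- reductions also stay 2-d
```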
0 votes

One benefit of treating (M,1) and (M,) differently is that it lets you specify which dimensions to align and which to broadcast.

Say you have:

a = np.arange(4)
b = np.arange(16).reshape(4,4)
# i.e. a = array([0, 1, 2, 3])
# i.e. b = array([[ 0,  1,  2,  3],
#                 [ 4,  5,  6,  7],
#                 [ 8,  9, 10, 11],
#                 [12, 13, 14, 15]])

When you do c = a + b, a and b are aligned along axis=1 and a is broadcast along axis=0:

array([[0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3],
       [0, 1, 2, 3]])

But what if you want to align a and b along axis=0 and broadcast along axis=1?

array([[0, 0, 0, 0],
       [1, 1, 1, 1],
       [2, 2, 2, 2],
       [3, 3, 3, 3]])

The (M,1) vs (M,) distinction lets you specify which dimension to align and which to broadcast.

(If (M,1) and (M,) were treated the same, how would you tell NumPy that you want to broadcast along axis=1?)
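Concretely, writing the trailing singleton explicitly selects the other alignment (a sketch using the same a and b as above):

```python
import numpy as np

a = np.arange(4)
b = np.arange(16).reshape(4, 4)

# Default: a's single axis lines up with b's last axis, so a is added
# across the columns of each row.
c_cols = a + b

# Explicit trailing singleton: a lines up with axis 0 and is broadcast
# along axis 1, so a is added down the rows of each column.
c_rows = a[:, None] + b

print(c_cols[0])     # [0 2 4 6]
print(c_rows[:, 0])  # [ 0  5 10 15]
```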