Given three numpy arrays: one multidimensional array x, one vector y with a trailing singleton dimension, and one vector z without a trailing singleton dimension,

x = np.zeros((M,N))
y = np.zeros((M,1))
z = np.zeros((M,))

the behaviour of broadcast operations changes depending on the vector representation and the context:
x[:,0] = y   # error: could not broadcast input array from shape (M,1) into shape (M,)
x[:,0] = z   # OK
x[:,0] += y  # error: non-broadcastable output operand with shape (M,)
             # doesn't match the broadcast shape (M,M)
x[:,0] += z  # OK
x - y        # OK
x - z        # error: operands could not be broadcast together with shapes (M,N) (M,)
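These behaviours can be reproduced with concrete sizes; a minimal runnable sketch, assuming M = 3 and N = 4:

```python
import numpy as np

M, N = 3, 4
x = np.zeros((M, N))
y = np.zeros((M, 1))   # column vector, shape (M, 1)
z = np.zeros((M,))     # plain 1-D vector, shape (M,)

# Assignment into the 1-D slice x[:, 0] (shape (M,)): z fits, y does not.
x[:, 0] = z
try:
    x[:, 0] = y
except ValueError as e:
    print("x[:,0] = y failed:", e)

# Elementwise subtraction: the (M, 1) column broadcasts against (M, N);
# the bare (M,) vector does not.
print((x - y).shape)   # -> (3, 4)
try:
    x - z
except ValueError as e:
    print("x - z failed:", e)
```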
I realize I can do the following:

x - z[:,None]  # OK

but I don't understand what this explicit notation buys me. It certainly doesn't buy readability. I don't understand why the expression x - y is OK, but x - z is ambiguous.

Why does Numpy treat vectors with or without trailing singleton dimensions differently?
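What the explicit notation buys can be shown concretely: z[:,None] inserts a trailing singleton axis, producing exactly the same (M, 1) shape as y, so the two subtractions become identical. A small check, assuming M = 3 and N = 4:

```python
import numpy as np

M, N = 3, 4
x = np.arange(M * N, dtype=float).reshape(M, N)
y = np.arange(M, dtype=float).reshape(M, 1)
z = np.arange(M, dtype=float)

# z[:, None] adds a trailing singleton axis: shape (M,) -> (M, 1),
# the same shape as y, so both broadcast column-wise against x.
assert z[:, None].shape == (M, 1)
assert np.array_equal(x - z[:, None], x - y)
print("x - z[:,None] matches x - y")
```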
edit: The documentation states that "two dimensions are compatible when they are equal, or one of them is 1", but y and z are both functionally M x 1 vectors, since an M x 0 vector doesn't contain any elements.
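One detail of those rules worth checking (a sketch, not a full answer): before dimensions are compared, the shorter shape is padded with singleton axes on the left, so a shape-(M,) vector is treated like (1, M), not (M, 1):

```python
import numpy as np

M, N = 3, 4
z = np.zeros((M,))

# np.broadcast_shapes applies the documented rules without allocating.
# The missing axis of (M,) is prepended on the LEFT, so z pairs with
# a (N, M) operand as if it were a (1, M) row.
print(np.broadcast_shapes((N, M), z.shape))   # -> (4, 3)

# Against (M, N) the trailing dimensions N and M are compared and clash.
try:
    np.broadcast_shapes((M, N), z.shape)
except ValueError as e:
    print("incompatible:", e)
```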
x[:,0] is 1d. x[:,[0]] is (M,1). z[:]=y probably gives the same error. – hpaulj
z can be reshaped to (M,1) or (1,M) without copying. But functionally it is closer to (1,M), because of the automatic extension at the beginning. – hpaulj
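The points in those comments can be verified directly; a small sketch, assuming M = 3 and N = 4:

```python
import numpy as np

M, N = 3, 4
x = np.zeros((M, N))
z = np.zeros((M,))

# A scalar column index drops the axis; a list index keeps it.
assert x[:, 0].shape == (M,)
assert x[:, [0]].shape == (M, 1)

# Reshaping z to (M, 1) or (1, M) yields a view of the same buffer,
# not a copy: each view's .base is the original array.
col = z.reshape(M, 1)
row = z.reshape(1, M)
assert col.base is z
assert row.base is z
print("reshapes are views, no copy made")
```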