Is there a way to do matrix-by-tensor multiply?

Question

Is there a way to multiply a 2D matrix by a tensor with a batch dimension so that each batch entry behaves like an ordinary tf.matmul between two 2D matrices (that is, like the case where the batch dimension equals one)?

Specifically, I want to multiply a 2D matrix of shape (6, 255) by a tensor of shape (2, 255, 255, 1), where the batch dimension equals 2:

import tensorflow as tf
import numpy as np

im = np.random.rand(255, 255)
A = np.random.rand(6, 255)
B = np.array([im,im]).reshape([-1,255, 255,1])

batch_size = 2
a = tf.placeholder(tf.float64,shape=(6, 255))
b = tf.placeholder(tf.float64,shape=(batch_size,255, 255,1))
out_mat = tf.matmul(a, b)  # Did not work: a is 2-D, b is 4-D
with tf.Session() as sess:
    sess.run(out_mat, feed_dict={a: A, b: B})

The result should have shape (2, 6, 255, 1) (thank you, @rvinas).

Note: In TensorFlow, matmul can only handle 2D matrices, and batch_matmul can only multiply (..., m, n) by (..., n, p), where the leading ... dimensions are the same in both A and B.

How would you get an output shape of (2, 6, 6, 1)? Did you mean (2, 6, 255, 1)? Could you please provide an example using NumPy? — rvinas
@rvinas, you are right, I meant (2, 6, 255, 1). Consider a NumPy example without the batch dimension: im = np.random.rand(255, 255); A = np.random.rand(6, 255); B = im; C = np.matmul(A, B). C will have shape (6, 255). I'm trying to reproduce the same result for tensors when B has shape (2, 255, 255, 1). — P. Max
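
For reference, the target behavior described in the comments can be written down in NumPy. This is only a sketch of the intended semantics, not part of the original question: it assumes np.einsum is an acceptable way to state the contraction, and the seeded generator `rng` is introduced here purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed, not from the question
im = rng.random((255, 255))
A = rng.random((6, 255))
B = np.stack([im, im]).reshape(-1, 255, 255, 1)  # Shape=(2, 255, 255, 1)

# Contract A's second axis with B's second axis; keep the batch and
# trailing axes of B untouched.
C = np.einsum('ij,bjkl->bikl', A, B)
print(C.shape)  # (2, 6, 255, 1)

# Each batch entry should match the plain 2-D product from the comment.
assert np.allclose(C[0, :, :, 0], A @ im)
assert np.allclose(C[1, :, :, 0], A @ im)
```

Any TensorFlow solution should reproduce these values for every batch entry.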

rvinas · Accepted Answer · 2018-10-23T07:18:09

Here's one way to do it using implicit broadcasting and tf.reduce_sum:

import tensorflow as tf
import numpy as np

batch_size = 2
dim_1 = 3
dim_2 = 4
dim_3 = 5
dim_4 = 32

im = np.arange(dim_2 * dim_3).reshape(dim_2, dim_3)
A = np.arange(dim_1 * dim_2).reshape(dim_1, dim_2)
B = im
C = np.matmul(A, B)
print('NumPy result (for batch_size=1):\n {}'.format(C))

B = np.repeat(B[None, ..., None], batch_size, axis=0)
B = np.repeat(B, dim_4, axis=3)
print(B.shape)  # B shape=(batch_size, dim_2, dim_3, dim_4)

a = tf.placeholder(tf.float64, shape=(dim_1, dim_2))
b = tf.placeholder(tf.float64, shape=(batch_size, dim_2, dim_3, dim_4))
a_ = a[None, :, :, None, None]  # Shape=(1, dim_1, dim_2, 1, 1)
b_ = b[:, None, :, :, :]  # Shape=(batch_size, 1, dim_2, dim_3, dim_4)
out_mat = tf.reduce_sum(a_ * b_, axis=2)

with tf.Session() as sess:
    c = sess.run(out_mat, feed_dict={a: A, b: B})
    print('TF result (for batch_size={}):\n {}'.format(batch_size, c))
    assert c.shape == (batch_size, dim_1, dim_3, dim_4)

And an alternative way using tf.matmul, tf.reshape and tf.transpose:

b_ = tf.transpose(b, [1, 0, 2, 3])  # Shape=(dim_2, batch_size, dim_3, dim_4)
b_ = tf.reshape(b_, [dim_2, -1])  # Shape=(dim_2, batch_size * dim_3 * dim_4)
matmul = a @ b_  # Shape=(dim_1, batch_size * dim_3 * dim_4)
matmul_ = tf.reshape(matmul, [dim_1, batch_size, dim_3, dim_4])  # Shape=(dim_1, batch_size, dim_3, dim_4)
out_mat = tf.transpose(matmul_, [1, 0, 2, 3])  # Shape=(batch_size, dim_1, dim_3, dim_4)

For your particular example, you would set batch_size=2, dim_1=6, dim_2=255, dim_3=255 and dim_4=1.
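
If it helps to sanity-check the two approaches without starting a session, both can be mirrored in plain NumPy. This is a hedged sketch rather than part of the TensorFlow code above: it assumes NumPy's indexing, transpose and reshape behave like their tf counterparts for these shapes, and the names `out_sum`, `out_mm` and `B_flat` are made up for the comparison.

```python
import numpy as np

batch_size, dim_1, dim_2, dim_3, dim_4 = 2, 3, 4, 5, 32
rng = np.random.default_rng(0)  # illustrative seed
A = rng.random((dim_1, dim_2))
B = rng.random((batch_size, dim_2, dim_3, dim_4))

# Approach 1: broadcast to 5-D, then sum over the shared dim_2 axis.
out_sum = (A[None, :, :, None, None] * B[:, None, :, :, :]).sum(axis=2)

# Approach 2: move dim_2 to the front, flatten the remaining axes,
# do a single 2-D matmul, then restore the original axis order.
B_flat = B.transpose(1, 0, 2, 3).reshape(dim_2, -1)
out_mm = (A @ B_flat).reshape(dim_1, batch_size, dim_3, dim_4).transpose(1, 0, 2, 3)

assert out_sum.shape == out_mm.shape == (batch_size, dim_1, dim_3, dim_4)
assert np.allclose(out_sum, out_mm)
```

In this NumPy version the first approach builds the full (batch_size, dim_1, dim_2, dim_3, dim_4) product before summing, so the reshape-based approach is usually the lighter one on memory for large dim_2.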
