11
votes

TensorFlow has a function called batch_matmul which multiplies higher-dimensional tensors. But I'm having a hard time understanding how it works, perhaps partly because I'm having a hard time visualizing it.

What I want to do is multiply a matrix by each slice of a 3D tensor, but I don't quite understand what the shape of tensor a is. Is z the innermost dimension? Which of the following is correct?

enter image description here

I would most prefer the first to be correct -- it's most intuitive to me and easy to see in the .eval() output. But I suspect the second is correct.

The TensorFlow documentation says that batch_matmul performs:

out[..., :, :] = matrix(x[..., :, :]) * matrix(y[..., :, :])

What does that mean in general, and in the context of my example? What is being multiplied with what? And why am I not getting a 3D tensor the way I expected?

6 Answers

21
votes

You can imagine it as doing a matmul over each training example in the batch.

For example, if you have two tensors with the following dimensions:

a.shape = [100, 2, 5]
b.shape = [100, 5, 2]

and you do a batched tf.matmul(a, b), your output will have the shape [100, 2, 2].

100 is your batch size; the other two dimensions are the dimensions of your data, so each of the 100 pairs is an ordinary [2, 5] by [5, 2] matrix product.
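To make the shape rule concrete, here is a minimal sketch that uses NumPy as a stand-in, on the assumption that np.matmul treats leading batch dimensions the same way tf.matmul does. The seed and variable names are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal((100, 2, 5))  # 100 matrices of shape 2x5
b = rng.standard_normal((100, 5, 2))  # 100 matrices of shape 5x2

out = np.matmul(a, b)  # batched product over the leading axis
print(out.shape)  # (100, 2, 2)

# Each output slice is the ordinary 2-D product of the matching input slices.
slice_ok = all(np.allclose(out[i], a[i] @ b[i]) for i in range(100))
```

In other words, the batch axis is never mixed: slice i of the output depends only on slice i of each input.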

16
votes

First of all, tf.batch_matmul() has been removed and is no longer available. You are now supposed to use tf.matmul(), whose documentation says:

The inputs must be matrices (or tensors of rank > 2, representing batches of matrices), with matching inner dimensions, possibly after transposition.

So let's assume you have the following code:

import tensorflow as tf
batch_size, n, m, k = 10, 3, 5, 2
A = tf.Variable(tf.random_normal(shape=(batch_size, n, m)))
B = tf.Variable(tf.random_normal(shape=(batch_size, m, k)))
tf.matmul(A, B)

This returns a tensor of shape (batch_size, n, k). Think of A as batch_size matrices of size n x m and B as batch_size matrices of size m x k. For each pair, the (n x m) by (m x k) product is an n x k matrix, and you get batch_size of them.

Notice that something like this is also valid:

A = tf.Variable(tf.random_normal(shape=(a, b, n, m)))
B = tf.Variable(tf.random_normal(shape=(a, b, m, k)))
tf.matmul(A, B)

which gives a result of shape (a, b, n, k).
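The case with more than one leading dimension can be checked against an explicit loop. This is a sketch in NumPy rather than TensorFlow, assuming np.matmul follows the same "multiply the last two axes, batch over the rest" convention; p and q are hypothetical sizes standing in for a and b above.

```python
import numpy as np

p, q, n, m, k = 2, 3, 4, 5, 6  # p and q play the role of a and b in the answer
rng = np.random.default_rng(1)
A = rng.standard_normal((p, q, n, m))
B = rng.standard_normal((p, q, m, k))

C = np.matmul(A, B)  # only the last two axes are multiplied
print(C.shape)  # (2, 3, 4, 6)

# Compare against an explicit loop over the two leading (batch) axes.
loop_ok = all(
    np.allclose(C[i, j], A[i, j] @ B[i, j])
    for i in range(p)
    for j in range(q)
)
```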

4
votes

You can now do this with tf.einsum, available since TensorFlow 0.11.0rc0.

For example,

M1 = tf.Variable(tf.random_normal([2,3,4]))
M2 = tf.Variable(tf.random_normal([5,4]))  
N = tf.einsum('ijk,lk->ijl',M1,M2)       

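This computes N[i, j, l] = sum over k of M1[i, j, k] * M2[l, k]. In other words, each of the 3 row vectors ("frames") in each of the 2 batches of M1 is multiplied by M2, which is the same as multiplying every 3x4 slice M1[i] by the transpose of M2.

Evaluating M1, M2, and N together gives: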

[array([[[ 0.80474716, -1.38590837, -0.3379252 , -1.24965811],
        [ 2.57852983,  0.05492432,  0.23039417, -0.74263287],
        [-2.42627382,  1.70774114,  1.19503212,  0.43006262]],

       [[-1.04652011, -0.32753903, -1.26430523,  0.8810069 ],
        [-0.48935518,  0.12831448, -1.30816901, -0.01271309],
        [ 2.33260512, -1.22395933, -0.92082584,  0.48991606]]], dtype=float32),
array([[ 1.71076882, 0.79229093, -0.58058828, -0.23246667],
       [ 0.20446332,  1.30742455, -0.07969904,  0.9247328 ],
       [-0.32047141,  0.66072595, -1.12330854,  0.80426538],
       [-0.02781649, -0.29672042,  2.17819595, -0.73862702],
       [-0.99663496,  1.3840003 , -1.39621222,  0.77119476]], dtype=float32), 
array([[[ 0.76539308, 2.77609682, -1.79906654,  0.57580602, -3.21205115],
        [ 4.49365759, -0.10607499, -1.64613271,  0.96234947, -3.38823152],
        [-3.59156275,  2.03910899,  0.90939498,  1.84612727,  3.44476724]],

       [[-1.52062428,  0.27325237,  2.24773455, -3.27834225,  3.03435063],
        [ 0.02695178,  0.16020992,  1.70085776, -2.8645196 ,  2.48197317],
        [ 3.44154787, -0.59687197, -0.12784094, -2.06931567, -2.35522676]]], dtype=float32)]

I have verified that the arithmetic is correct.
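The same check can be automated. The sketch below uses np.einsum, assuming it accepts the same subscript string as tf.einsum, and compares the result with an explicit product of each slice with the transpose of M2. The random values here are new and unrelated to the output printed above.

```python
import numpy as np

rng = np.random.default_rng(2)
M1 = rng.standard_normal((2, 3, 4))
M2 = rng.standard_normal((5, 4))

N = np.einsum('ijk,lk->ijl', M1, M2)
print(N.shape)  # (2, 3, 5)

# N[i, j, l] = sum_k M1[i, j, k] * M2[l, k], i.e. each 3x4 slice times M2 transposed.
expected = np.stack([M1[i] @ M2.T for i in range(2)])
same = np.allclose(N, expected)
```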

2
votes

tf.tensordot should solve this problem. It supports batch operations, e.g., if you want to contract a 2D tensor with a 3D tensor, with the latter having a batch dimension.

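If a has shape [n, m] and b has shape [?, m, l], then

y = tf.tensordot(b, a, axes=[1, 1]) contracts the two m axes and produces a tensor of shape [?, l, n], because the remaining axes of b come first and those of a come last. Use tf.transpose(y, [0, 2, 1]) to get shape [?, n, l].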

https://www.tensorflow.org/api_docs/python/tf/tensordot
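The axis order of the result is easy to get wrong, so here is a quick shape check. It is a sketch using np.tensordot, on the assumption that it orders output axes the same way as tf.tensordot (free axes of the first argument, then free axes of the second). The sizes are arbitrary and a fixed batch size replaces the unknown dimension.

```python
import numpy as np

batch, n, m, l = 7, 2, 3, 4
rng = np.random.default_rng(3)
a = rng.standard_normal((n, m))
b = rng.standard_normal((batch, m, l))

y = np.tensordot(b, a, axes=[1, 1])  # contract the m axis of b with the m axis of a
print(y.shape)  # (7, 4, 2)

# Swap the last two axes to get one (n x l) product per batch element.
y_nl = np.transpose(y, (0, 2, 1))
expected = np.stack([a @ b[i] for i in range(batch)])  # a (n x m) times each slice (m x l)
same = np.allclose(y_nl, expected)
```

The contraction leaves the axes in [batch, l, n] order, and the transpose recovers a times each slice of b.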

-1
votes

It is like splitting both tensors along the first dimension, multiplying the matching slices, and concatenating the results back together. If you want to multiply a 3D tensor by a 2D matrix, you can reshape, multiply, and reshape back. For example, with a [5, 2] matrix: [100, 2, 5] -> [200, 5] -> [200, 2] -> [100, 2, 2]
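The reshape route can be sketched as follows, using NumPy in place of tf.reshape and tf.matmul. The [5, 2] matrix w is an assumption made for illustration, since the answer only gives the shapes of the intermediate results.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal((100, 2, 5))  # batch of 100 matrices, each 2x5
w = rng.standard_normal((5, 2))       # one 2-D matrix shared by the whole batch

flat = x.reshape(-1, 5)        # [100, 2, 5] -> [200, 5]
prod = flat @ w                # [200, 5] x [5, 2] -> [200, 2]
out = prod.reshape(100, 2, 2)  # [200, 2] -> [100, 2, 2]

# Same result as multiplying each slice by w separately.
same = all(np.allclose(out[i], x[i] @ w) for i in range(100))
```

This works because the rows of every slice are multiplied by w independently, so stacking all rows into one tall matrix does not change any individual product.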

-1
votes

One answer to this particular question is to use the tf.scan function.

If a has shape [5, 3, 2] (a batch of 5 matrices, each 3x2)
and b has shape [2, 3] (a constant matrix to be multiplied with each sample),

then let def fn(acc, x): return tf.matmul(x, b)  # acc, the previous output, is ignored

initializer = tf.Variable(tf.random_normal([3, 3]))  # fixes the 3x3 shape of each output

h = tf.scan(fn, a, initializer)

h will then hold all the outputs, with shape [5, 3, 3].
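To show what the scan-based approach computes, here is a plain Python and NumPy sketch. The scan helper below is hypothetical and only imitates the general behaviour of tf.scan (apply fn along the first axis, pass the previous output as the accumulator, stack the results). It is not TensorFlow's implementation.

```python
import numpy as np

def scan(fn, elems, initializer):
    """Hypothetical helper: apply fn along axis 0, threading the previous output."""
    acc, outputs = initializer, []
    for x in elems:
        acc = fn(acc, x)
        outputs.append(acc)
    return np.stack(outputs)

rng = np.random.default_rng(5)
a = rng.standard_normal((5, 3, 2))  # 5 matrices of shape 3x2
b = rng.standard_normal((2, 3))     # constant matrix applied to every sample

# The accumulator is ignored, so each step is just one 3x2 by 2x3 product.
h = scan(lambda acc, x: x @ b, a, np.zeros((3, 3)))
print(h.shape)  # (5, 3, 3)
```

Because the accumulator is never used, the scan is only acting as a loop over the batch, which is why the batched matmul and einsum approaches give the same result with less machinery.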