I was looking at the TensorFlow docs for tf.nn.conv2d here, but I can't understand what it does or what it is trying to achieve. The docs say:
#1: Flattens the filter to a 2-D matrix with shape [filter_height * filter_width * in_channels, output_channels].
Now what does that do? Is that element-wise multiplication or just plain matrix multiplication? I also could not understand the other two points mentioned in the docs, which I have copied below:
#2: Extracts image patches from the input tensor to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels].
#3: For each patch, right-multiplies the filter matrix and the image patch vector.
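One way to read the three documented steps is as an im2col-style computation. The NumPy sketch below is only an illustration and not TensorFlow's actual implementation: the function name conv2d_valid is hypothetical, and it assumes NHWC input, 'VALID' padding (no padding at all), and the same stride in both spatial dimensions. Under those assumptions, step 3 looks like a plain matrix multiplication, which works out to the same thing as multiplying each patch element-wise with the filter and summing the products.

```python
import numpy as np


def conv2d_valid(images, filters, stride=1):
    """Illustrative NHWC convolution with 'VALID' padding (hypothetical helper)."""
    batch, in_h, in_w, in_c = images.shape
    f_h, f_w, f_in_c, out_c = filters.shape
    if in_c != f_in_c:
        raise ValueError("filter in_channels must match the input's channels")

    out_h = (in_h - f_h) // stride + 1
    out_w = (in_w - f_w) // stride + 1

    # Step 1: flatten the filter to [f_h * f_w * in_c, out_c].
    flat_filter = filters.reshape(f_h * f_w * in_c, out_c)

    # Step 2: collect every patch as a row vector, giving a tensor of shape
    # [batch, out_h, out_w, f_h * f_w * in_c].
    patches = np.empty((batch, out_h, out_w, f_h * f_w * in_c), dtype=images.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            window = images[:, r:r + f_h, c:c + f_w, :]
            patches[:, i, j, :] = window.reshape(batch, -1)

    # Step 3: multiply each patch vector by the flattened filter matrix.
    return patches @ flat_filter


rng = np.random.default_rng(0)
x = rng.standard_normal((1, 4, 4, 3))   # one 4x4 image with 3 channels
w = rng.standard_normal((2, 2, 3, 5))   # 2x2 filter, 3 in, 5 out channels
y = conv2d_valid(x, w)
print(y.shape)  # (1, 3, 3, 5)

# The same output value computed as an element-wise multiply-and-sum.
manual = np.sum(x[0, 0:2, 0:2, :] * w[:, :, :, 0])
print(np.allclose(y[0, 0, 0, 0], manual))  # True
```

Flattening the filter and the patches in the same order is what makes the matrix product line up with the per-patch multiply-and-sum.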
It would be really helpful if anyone could give an example, ideally a piece of code, and explain what is going on there and why the operation works like this.
I tried coding a small example and printing the shape of the result, but I still don't understand it.
I tried something like this:
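import tensorflow as tf  # TensorFlow 1.x API (tf.Session, tf.random_normal)

# Input is [batch, in_height, in_width, in_channels];
# filter is [filter_height, filter_width, in_channels, out_channels].
op = tf.shape(tf.nn.conv2d(tf.random_normal([1, 10, 10, 10]),
                           tf.random_normal([2, 10, 10, 10]),
                           strides=[1, 2, 2, 1], padding='SAME'))
with tf.Session() as sess:
    result = sess.run(op)
    print(result)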
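For reference, the shape printed by this snippet can be predicted by hand. As far as I understand the documented padding rules, 'SAME' gives out = ceil(in / stride) and 'VALID' gives out = ceil((in - filter + 1) / stride) for each spatial dimension. The helper below is a sketch of that rule; the name conv2d_output_shape is made up for illustration and is not a TensorFlow function.

```python
import math


def conv2d_output_shape(input_shape, filter_shape, strides, padding):
    """Predict the NHWC output shape of a 2-D convolution (hypothetical helper)."""
    batch, in_h, in_w, in_c = input_shape
    f_h, f_w, f_in_c, out_c = filter_shape
    if in_c != f_in_c:
        raise ValueError("filter in_channels must match the input's channels")
    _, s_h, s_w, _ = strides

    if padding == "SAME":
        # The input is padded so that only the stride shrinks the output.
        out_h = math.ceil(in_h / s_h)
        out_w = math.ceil(in_w / s_w)
    elif padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        out_h = math.ceil((in_h - f_h + 1) / s_h)
        out_w = math.ceil((in_w - f_w + 1) / s_w)
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")

    return [batch, out_h, out_w, out_c]


shape = conv2d_output_shape([1, 10, 10, 10], [2, 10, 10, 10], [1, 2, 2, 1], "SAME")
print(shape)  # [1, 5, 5, 10]
```

Note that the filter shape [2, 10, 10, 10] is read as a 2x10 filter with 10 input channels and 10 output channels, not as a batch of two filters.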
I understand bits and pieces of convolutional neural networks (I studied them here), but the implementation in TensorFlow is not what I expected, which is why I'm asking.
EDIT: I implemented a much simpler example, but I still can't figure out how these results are produced. It would be extremely helpful if anyone could tell me what process yields this output.
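import tensorflow as tf  # TensorFlow 1.x API

# A single 2x2 image with one channel: [batch, height, width, channels].
input = tf.Variable(tf.random_normal([1, 2, 2, 1]))
# A single 1x1 filter: [filter_height, filter_width, in_channels, out_channels].
filter = tf.Variable(tf.random_normal([1, 1, 1, 1]))

op = tf.nn.conv2d(input, filter, strides=[1, 1, 1, 1], padding='SAME')
# Deprecated in later 1.x releases in favor of tf.global_variables_initializer().
init = tf.initialize_all_variables()

with tf.Session() as sess:
    sess.run(init)
    print("input")
    print(input.eval())
    print("filter")
    print(filter.eval())
    print("result")
    result = sess.run(op)
    print(result)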
Output:
input
[[[[ 1.60314465]
   [-0.55022103]]
  [[ 0.00595062]
   [-0.69889867]]]]
filter
[[[[-0.59594476]]]]
result
[[[[-0.95538563]
   [ 0.32790133]]
  [[-0.00354624]
   [ 0.41650501]]]]
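With a 1x1 filter, one input channel, and one output channel, every patch is a single number, so the matrix multiplication should collapse to scaling each input value by the one filter weight. The NumPy check below uses the numbers printed above; the variable names are mine, and the tolerance is an assumption to absorb float32 rounding in the printed values.

```python
import numpy as np

# Values copied from the printed output of the TensorFlow run.
inp = np.array([[[[1.60314465], [-0.55022103]],
                 [[0.00595062], [-0.69889867]]]])
filt = np.array([[[[-0.59594476]]]])
expected = np.array([[[[-0.95538563], [0.32790133]],
                      [[-0.00354624], [0.41650501]]]])

# A 1x1x1x1 filter holds a single weight, so the convolution is a scalar multiply.
scaled = inp * filt[0, 0, 0, 0]

print(np.allclose(scaled, expected, atol=1e-6))  # True
```

For example, 1.60314465 * -0.59594476 is about -0.95538565, which matches the first entry of the result up to rounding.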
tf.nn.conv2d(), so the method in question is not used at all when we use TF with GPU support, unless use_cudnn_on_gpu=False is specified explicitly. – gkcn