
I am unsure how to reproduce the following PyTorch code in TensorFlow. It uses both the input size x and the hidden size h to create a GRU layer:

import torch
# torch.nn.GRU has no return_state argument; it always returns (output, h_n)
torch.nn.GRU(64, 64*2, batch_first=True)

Instinctively, I first tried the following:

import tensorflow as tf
tf.keras.layers.GRU(64, return_state=True)

However, I realize that it does not really account for h or the hidden size. What should I do in this case?


1 Answer


The first argument of the Keras GRU layer (units) is the hidden size, so the hidden size is 64 in your TensorFlow example. To get the equivalent of your PyTorch layer, you should use:

import tensorflow as tf
tf.keras.layers.GRU(64*2, return_state=True)

This is because the Keras layer does not require you to specify the input size (64 in this example); it is inferred from the shape of the input when the layer is built, which happens the first time you build or run your model.
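
As a rough check of the point above, here is a minimal sketch that runs the layer on a dummy input. The batch size (8) and sequence length (10) are made-up values for illustration, and return_sequences=True is an addition of mine, on the assumption that you want the per-step outputs that torch.nn.GRU returns as its first value.

```python
import tensorflow as tf

# Hypothetical dummy batch: 8 sequences, 10 time steps, 64 features per step.
x = tf.random.normal((8, 10, 64))

# units=64*2 plays the role of PyTorch's hidden_size.
# return_sequences=True (my assumption) returns the output at every time step.
gru = tf.keras.layers.GRU(64 * 2, return_sequences=True, return_state=True)

# The layer is built on this first call, when it sees the 64 input features.
outputs, final_state = gru(x)

print(tuple(outputs.shape))      # (8, 10, 128)
print(tuple(final_state.shape))  # (8, 128)
```

Note that the final state here has shape (batch, units), whereas PyTorch's h_n carries an extra leading dimension for layers and directions. Keras also expects batch-first input by default, so there is no counterpart to batch_first=True to set.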