7 votes

I've often read that there are fundamental differences between feed-forward and recurrent neural networks (RNNs), because feed-forward networks lack an internal state and therefore have no short-term memory. This seemed plausible to me at first sight.

However, when training a recurrent neural network with the backpropagation-through-time algorithm, the recurrent network is transformed into an equivalent feed-forward network, if I understand correctly.

This would imply that there is in fact no fundamental difference. Why, then, do RNNs perform better on certain tasks (image recognition, time-series prediction, ...) than deep feed-forward networks?
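
To make concrete what I mean by the unrolling, here is a minimal NumPy sketch (a plain Elman-style tanh cell; the function and parameter names are just illustrative, not from any particular library). The unrolled computation is simply the same layer applied k times with shared weights, which is exactly the graph that backpropagation through time differentiates:

```python
import numpy as np

def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent step: new hidden state from input x_t and previous state."""
    return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

def unrolled_forward(xs, h0, W_xh, W_hh, b_h):
    """The same computation written as a feed-forward chain over k time steps.

    For a fixed-length input sequence, this unrolled feed-forward graph
    computes exactly the same values as the recurrent formulation.
    """
    h = h0
    for x_t in xs:          # k copies of the same layer, sharing the weights
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
    return h
```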


1 Answer

4 votes

The fact that training is done using some trick does not change the fundamental difference: the recurrent network preserves an internal state across time, which the feed-forward network does not.

The "unrolled" feed forward network is not equivalent to the recurrent network. It is only a markov approximation (to the level given by the number of "unrolled" levels). So you just "simulate" the recurrent network with k step memory, while the actual recurrent neural network has (in theory) unlimited memory.