I've often read that there are fundamental differences between feed-forward and recurrent neural networks (RNNs), due to the lack of an internal state, and therefore of short-term memory, in feed-forward networks. This seemed plausible to me at first sight.
However, if I understand correctly, when a recurrent neural network is trained with the backpropagation-through-time (BPTT) algorithm, it is unrolled into an equivalent feed-forward network (see the sketch below).
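To make sure I'm picturing the unrolling correctly, here is a minimal sketch of what I mean, assuming a simple Elman-style tanh cell in plain NumPy with made-up dimensions: for a fixed sequence length T, applying the same cell T times is the same computation as a T-layer feed-forward network whose layers all share one set of weights.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions, for illustration only.
n_in, n_hidden, T = 3, 4, 5

# Shared (tied) weights of a simple Elman-style RNN cell.
W_xh = rng.normal(size=(n_hidden, n_in))
W_hh = rng.normal(size=(n_hidden, n_hidden))
b_h = np.zeros(n_hidden)

def rnn_cell(x_t, h_prev):
    """One recurrent step: new state from current input and previous state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

# A length-T input sequence.
xs = rng.normal(size=(T, n_in))

# Recurrent view: one cell applied repeatedly to its own state.
h = np.zeros(n_hidden)
for x_t in xs:
    h = rnn_cell(x_t, h)

# Unrolled ("feed-forward") view: T layers, all sharing the same weights,
# each consuming one time step of the input. This is the graph that BPTT
# differentiates through.
h_unrolled = np.zeros(n_hidden)
layers = [rnn_cell] * T  # T copies of the same layer
for layer, x_t in zip(layers, xs):
    h_unrolled = layer(x_t, h_unrolled)

assert np.allclose(h, h_unrolled)  # identical results for a fixed length T
```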
This would imply that there is in fact no fundamental difference. Why, then, do RNNs perform better on certain tasks (image recognition, time-series prediction, ...) than deep feed-forward networks?