I'm wondering whether cuDNN-based RNNs (LSTMs or GRUs), when bidirectional and configured with multiple layers, combine the outputs from both directions at a given layer n before passing them to layer n+1, or whether each direction runs independently of the other (i.e. the forward layers feed only the forward layers above them, and likewise for the backward direction).
I would like the outputs from both directions to be combined between layers, even though performance-wise it's obviously faster to keep each direction independent, since that makes it possible to run all layers concurrently if memory allows. A sketch of the behaviour I'm after follows below.
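To make concrete what I mean by "combine", here is a minimal PyTorch sketch (just for illustration; my question is about cuDNN-backed implementations in general, and the class name `StackedBiLSTM` is mine). It stacks single-layer bidirectional LSTMs by hand, so each layer is forced to consume the concatenated forward and backward outputs of the layer below:

```python
import torch
import torch.nn as nn


class StackedBiLSTM(nn.Module):
    """Illustrative stack of single-layer bidirectional LSTMs.

    Each nn.LSTM layer receives the concatenated forward+backward
    outputs of the layer below (shape: 2 * hidden_size), which is the
    "combined" behaviour described above.
    """

    def __init__(self, input_size, hidden_size, num_layers):
        super().__init__()
        layers = []
        for i in range(num_layers):
            # First layer sees the raw input; later layers see the
            # concatenation of both directions from the previous layer.
            in_size = input_size if i == 0 else 2 * hidden_size
            layers.append(
                nn.LSTM(in_size, hidden_size, num_layers=1,
                        bidirectional=True, batch_first=True)
            )
        self.layers = nn.ModuleList(layers)

    def forward(self, x):
        for lstm in self.layers:
            x, _ = lstm(x)  # x: (batch, seq, 2 * hidden_size)
        return x


# Usage: outputs have 2 * hidden_size features per timestep.
model = StackedBiLSTM(input_size=128, hidden_size=256, num_layers=3)
out = model(torch.randn(8, 50, 128))  # -> (8, 50, 512)
```

This manual stacking guarantees the between-layer concatenation but presumably gives up the fused multi-layer cuDNN kernel; what I can't tell from the documentation is whether the built-in multi-layer bidirectional path already behaves this way.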