I'm trying to implement this recurrent neural network (it's a Voice Activity Detector):
Note that the blue circles are individual neurons, not groups of neurons; it's a really small network. There are some extra details, like what the S's mean and the fact that some layers are quadratic, but they don't matter for this question.
I implemented it using Microsoft's CNTK like this (not tested!):
# For the layers with diagonal connections.
QuadraticWithDiagonal(X, Xdim, Ydim)
{
    OldX = PastValue(Xdim, 1, X)
    # Y is defined below; PastValue breaks the cycle, so the forward
    # reference is what makes the layer recurrent.
    OldY = PastValue(Ydim, 1, Y)
    Wqaa = LearnableParameter(Ydim, Xdim)
    Wqbb = LearnableParameter(Ydim, Xdim)
    Wqab = LearnableParameter(Ydim, Xdim)
    Wla = LearnableParameter(Ydim, Xdim)
    Wlb = LearnableParameter(Ydim, Xdim)
    # Ydim x Ydim: this weight multiplies OldY, not an Xdim-sized input.
    Wlc = LearnableParameter(Ydim, Ydim)
    Wb = LearnableParameter(Ydim)
    XSquared = ElementTimes(X, X)
    OldXSquared = ElementTimes(OldX, OldX)
    CrossXSquared = ElementTimes(X, OldX)
    T1 = Times(Wqaa, XSquared)
    T2 = Times(Wqbb, OldXSquared)
    T3 = Times(Wqab, CrossXSquared)
    T4 = Times(Wla, X)
    T5 = Times(Wlb, OldX)
    T6 = Times(Wlc, OldY)
    # Plus takes two operands, so chain the additions.
    Y = Plus(Plus(Plus(T1, T2), Plus(T3, T4)), Plus(Plus(T5, T6), Wb))
}
# For the layers without diagonal connections.
QuadraticWithoutDiagonal(X, Xdim, Ydim)
{
    OldY = PastValue(Ydim, 1, Y)
    Wqaa = LearnableParameter(Ydim, Xdim)
    Wla = LearnableParameter(Ydim, Xdim)
    # Ydim x Ydim for the recurrent weight, as above.
    Wlc = LearnableParameter(Ydim, Ydim)
    Wb = LearnableParameter(Ydim)
    XSquared = ElementTimes(X, X)
    T1 = Times(Wqaa, XSquared)
    T4 = Times(Wla, X)
    T6 = Times(Wlc, OldY)
    Y = Plus(Plus(T1, T4), Plus(T6, Wb))
}
# The actual network.
# 13x1 input PLP.
I = InputValue(13, 1, tag="feature")
# Hidden layers
H0 = QuadraticWithDiagonal(I, 13, 3)
H1 = QuadraticWithDiagonal(H0, 3, 3)
# 1x1 Pre-output
P = Tanh(QuadraticWithoutDiagonal(H1, 3, 1))
# 5x1 Delay taps
D = QuadraticWithoutDiagonal(P, 1, 5)
# 1x1 Output
O = Tanh(QuadraticWithoutDiagonal(D, 5, 1))
The PastValue() function gets the value of a layer from the previous time-step. This makes it really easy to implement unusual RNNs like this one.
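To pin down the semantics, here is what one timestep of the diagonal layer is meant to compute, restated in plain NumPy (the function and argument names are mine, not CNTK's):

import numpy as np

# One timestep of QuadraticWithDiagonal (my own restatement):
#   y_t = Wqaa x_t^2 + Wqbb x_{t-1}^2 + Wqab (x_t . x_{t-1})
#       + Wla x_t + Wlb x_{t-1} + Wlc y_{t-1} + b
# where the squares and the cross term are element-wise.
def quadratic_with_diagonal_step(x, old_x, old_y,
                                 Wqaa, Wqbb, Wqab, Wla, Wlb, Wlc, b):
    return (Wqaa @ (x * x) + Wqbb @ (old_x * old_x) + Wqab @ (x * old_x)
            + Wla @ x + Wlb @ old_x + Wlc @ old_y + b)

With PastValue() supplying old_x and old_y, each layer is just this one expression applied at every timestep.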
Unfortunately, although CNTK's Network Description Language is pretty awesome, I find it rather restrictive that you can't script the data input, training, and evaluation steps. So I'm looking into implementing the same network in Torch or TensorFlow.
I've read the documentation for both, though, and I have no clue how to implement the recurrent connections. Both libraries seem to equate RNNs with LSTM black boxes that you stack as if they were non-recurrent layers. There doesn't seem to be an equivalent to PastValue(), and all the examples that don't just use a pre-made LSTM layer are completely opaque.
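To make the question concrete, here is my best guess at expressing the PastValue()-style delay in TensorFlow using tf.scan, for one QuadraticWithoutDiagonal layer (all names and shapes are mine, and I have no idea whether this is the idiomatic approach):

import tensorflow as tf

# Hypothetical sizes, just for the sketch.
Xdim, Ydim, T = 13, 3, 100

Wqaa = tf.Variable(tf.random.normal([Ydim, Xdim]))
Wla = tf.Variable(tf.random.normal([Ydim, Xdim]))
Wlc = tf.Variable(tf.random.normal([Ydim, Ydim]))
b = tf.Variable(tf.zeros([Ydim]))

def step(prev_y, x):
    # prev_y stands in for PastValue(Y): this layer's own output
    # from the previous timestep.
    return (tf.linalg.matvec(Wqaa, x * x)
            + tf.linalg.matvec(Wla, x)
            + tf.linalg.matvec(Wlc, prev_y)
            + b)

xs = tf.random.normal([T, Xdim])        # dummy input sequence
y0 = tf.zeros([Ydim])                   # initial "past" output
ys = tf.scan(step, xs, initializer=y0)  # shape [T, Ydim]

For the diagonal layers I'd presumably have to carry both the previous input and the previous output through the scan as a tuple, and stacking five such layers means nesting scans or combining all the states into one. That's the part I can't work out from the examples.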
Can anyone show me how to implement a network like this in either Torch or TensorFlow (or both!)?