7
votes

I have a neural network with N input nodes and N output nodes, and possibly multiple hidden layers and recurrences in it, but let's forget about those for now. The goal of the neural network is to learn an N-dimensional variable Y*, given an N-dimensional value X. Let's say the output of the neural network is Y, which should be close to Y* after learning. My question is: is it possible to get the inverse of the neural network for the output Y*? That is, how do I get the value X* that would yield Y* (or something close to it) when put into the neural network?

A major part of the problem is that N is very large, typically on the order of 10000 or 100000, but if anyone knows how to solve this for small networks with no recurrences or hidden layers, that might already be helpful. Thank you.

5 Answers

3
votes

If you can choose the neural network such that every layer has the same number of nodes, every weight matrix is non-singular, and the transfer function is invertible (e.g. leaky ReLU), then the function computed by the network will be invertible.

This kind of neural network is simply a composition of matrix multiplications, bias additions and transfer functions. To invert it, you just need to apply the inverse of each operation in reverse order. That is: take the output, apply the inverse transfer function, subtract the last layer's bias, multiply by the inverse of the last weight matrix, then apply the inverse transfer function again, subtract the second-to-last bias, multiply by the inverse of the second-to-last weight matrix, and so on. Note that within each layer the bias has to be subtracted before multiplying by the inverse weight matrix, since y = f(W*x + b) gives x = W^-1 * (f^-1(y) - b).
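The procedure above can be sketched in a few lines of NumPy. This is only an illustration under assumptions that are not in the question: a small feed-forward network with square weight matrices, a leaky ReLU with slope `ALPHA`, and randomly generated weights shifted towards the identity so they are comfortably non-singular. The names `forward`, `inverse` and `ALPHA` are made up for the example.

```python
import numpy as np

ALPHA = 0.1  # assumed leaky ReLU slope for negative inputs


def leaky_relu(z):
    return np.where(z >= 0, z, ALPHA * z)


def leaky_relu_inv(a):
    # Leaky ReLU preserves sign, so the branch can be chosen from the output.
    return np.where(a >= 0, a, a / ALPHA)


def forward(x, layers):
    # Each layer computes f(W @ x + b).
    for W, b in layers:
        x = leaky_relu(W @ x + b)
    return x


def inverse(y, layers):
    # Undo the layers in reverse order: inverse activation,
    # subtract the bias, then solve W @ x = z (same as x = W^-1 @ z).
    for W, b in reversed(layers):
        z = leaky_relu_inv(y)
        y = np.linalg.solve(W, z - b)
    return y


rng = np.random.default_rng(0)
n = 4
# Hypothetical weights: random matrices plus a multiple of the identity.
layers = [(rng.normal(size=(n, n)) + n * np.eye(n), rng.normal(size=n))
          for _ in range(3)]

x = rng.normal(size=n)
y = forward(x, layers)
x_rec = inverse(y, layers)
print(np.allclose(x, x_rec))
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a design choice for numerical stability; for N around 10000 to 100000 a dense solve per layer may still be too expensive, so treat this as a small-scale sketch only.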

2
votes

This is a task that might be solvable with autoencoders. You might also be interested in generative models like Restricted Boltzmann Machines (RBMs), which can be stacked to form Deep Belief Networks (DBNs). An RBM builds an internal model h of the data v that can be used to reconstruct v. In a DBN, the h of the first layer becomes the v of the second layer, and so on.

1
votes

zenna is right. If you are using bijective (invertible) activation functions, you can invert layer by layer: apply the inverse activation, subtract the bias and multiply by the pseudoinverse of the weight matrix (if every layer has the same number of neurons, this is also the exact inverse, under some mild regularity conditions). To repeat the conditions: dim(X) == dim(Y) == dim(layer_i), det(Wi) ≠ 0

An example:

Y = tanh( W2*tanh( W1*X + b1 ) + b2 )
X = W1p*( tanh^-1( W2p*(tanh^-1(Y) - b2) ) - b1 )

where W2p and W1p represent the pseudoinverse matrices of W2 and W1 respectively.
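The two-layer tanh example can be checked numerically with a short NumPy sketch. The weights, biases and input below are made up for illustration and kept small so that tanh stays away from saturation; `np.arctanh` plays the role of tanh^-1 and `np.linalg.pinv` computes the pseudoinverse.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3

# Hypothetical square, near-identity weights and small biases.
W1 = 0.5 * np.eye(n) + 0.1 * rng.normal(size=(n, n))
W2 = 0.5 * np.eye(n) + 0.1 * rng.normal(size=(n, n))
b1 = 0.1 * rng.normal(size=n)
b2 = 0.1 * rng.normal(size=n)

# Forward pass: Y = tanh(W2 * tanh(W1 * X + b1) + b2)
X = 0.5 * rng.normal(size=n)
Y = np.tanh(W2 @ np.tanh(W1 @ X + b1) + b2)

# Pseudoinverses (equal to the exact inverses for non-singular square matrices).
W1p = np.linalg.pinv(W1)
W2p = np.linalg.pinv(W2)

# Inverse pass: X = W1p * (atanh(W2p * (atanh(Y) - b2)) - b1)
H = W2p @ (np.arctanh(Y) - b2)      # recovered hidden activation
X_rec = W1p @ (np.arctanh(H) - b1)  # recovered input

print(np.allclose(X, X_rec))
```

One caveat: `np.arctanh` is only defined on (-1, 1), so for a target Y* that the network cannot actually produce, the intermediate value `H` may fall outside that interval and the result becomes NaN. Saturated activations close to ±1 will also lose precision.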

1
votes

The following paper is a case study in inverting a function learned by a neural network. It is a case study from industry and looks like a good starting point for understanding how to go about setting up the problem.

1
votes

An alternative way of getting an x that yields the desired y is to start with a random x (or a given input used as a seed) and then use gradient descent to repeatedly adjust x until the network's output is close to the desired y. The algorithm is similar to backpropagation, the difference being that instead of computing derivatives with respect to the weights and biases, you compute derivatives with respect to x. Mini-batching is not needed. This approach has the advantage that it accepts a seed (a starting x, if you don't select one randomly). I also have a hypothesis that the final x will have some similarity to the initial x (the seed), which would imply that this algorithm has the ability to transpose, depending on the context of the neural network application.
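A minimal sketch of this idea, assuming a fixed two-layer tanh network with made-up weights and a squared-error loss between the network output and the desired y. The gradient with respect to x is written out by hand with the chain rule; the names `forward`, `loss_and_grad`, `lr` and `steps`, and the choice of a zero vector as seed, are all assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3

# Hypothetical fixed (already trained) network weights.
W1 = np.eye(n) + 0.1 * rng.normal(size=(n, n))
W2 = np.eye(n) + 0.1 * rng.normal(size=(n, n))
b1 = 0.1 * rng.normal(size=n)
b2 = 0.1 * rng.normal(size=n)


def forward(x):
    h = np.tanh(W1 @ x + b1)
    y = np.tanh(W2 @ h + b2)
    return h, y


def loss_and_grad(x, y_target):
    # Loss: 0.5 * ||y - y_target||^2, differentiated with respect to x.
    h, y = forward(x)
    err = y - y_target
    dy = err * (1.0 - y ** 2)            # back through the output tanh
    dh = (W2.T @ dy) * (1.0 - h ** 2)    # back through the hidden tanh
    dx = W1.T @ dh                       # back to the input
    return 0.5 * np.dot(err, err), dx


# Desired output, generated here from a known input so it is reachable.
x_true = 0.5 * rng.normal(size=n)
y_target = forward(x_true)[1]

x = np.zeros(n)                          # the seed (starting x)
initial_loss, _ = loss_and_grad(x, y_target)

lr, steps = 0.2, 5000                    # assumed step size and iteration count
for _ in range(steps):
    _, grad = loss_and_grad(x, y_target)
    x = x - lr * grad                    # update the input; weights stay fixed

final_loss, _ = loss_and_grad(x, y_target)
print(initial_loss, final_loss)
```

Unlike the exact layer-by-layer inverse, this does not need square or non-singular weight matrices, but it is iterative and may stop in a local minimum for less well-behaved networks. In practice an automatic differentiation framework would compute the gradient with respect to x instead of the hand-written chain rule used here.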