4
votes

I've been working on this for about a week. There are no errors in my code; I just need to get the algorithm and the concepts right. I've implemented a neural network with one hidden layer, and I use the backpropagation algorithm to correct the weights.

My problem is that the network can only learn one pattern. If I train it with the same training data over and over again, it produces the desired outputs when given input that is numerically close to the training data.

training_input: 1, 2, 3    training_output: 0.6, 0.25

after 300 epochs....

input: 1, 2, 3 output: 0.6, 0.25

input: 1, 1, 2    output: 0.5853, 0.213245

But if I use multiple varying training sets, it only learns the last pattern. Aren't neural networks supposed to learn multiple patterns? Is this a common beginner mistake? If so, please point me in the right direction. I've looked at many online guides, but I've never seen one that goes into detail about training on multiple patterns. I'm using sigmoid for the hidden layer and tanh for the output layer.
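To make the setup concrete, my forward pass is structured roughly like this (a simplified sketch with made-up names, not my exact code):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def forward(x, W_hidden, W_output):
        # hidden layer uses sigmoid, output layer uses tanh
        hidden = sigmoid(np.dot(W_hidden, x))
        output = np.tanh(np.dot(W_output, hidden))
        return hidden, output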

EDIT:

Example training arrays:

13  tcp telnet  SF  118 2425    0   0   0   0   0   1   0   0   0   0   0   0   0   0   0   0   1   1   0   0   0   0   1   0   0   26  10  0.38    0.12    0.04    0   0   0   0.12    0.3 anomaly

0   udp private SF  44  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   4   3   0   0   0   0   0.75    0.5 0   255 254 1   0.01    0.01    0   0   0   0   0   anomaly

0   tcp telnet  S3  0   44  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   1   1   1   1   0   0   1   0   0   255 79  0.31    0.61    0   0   0.21    0.68    0.6 0   anomaly

The last column (anomaly/normal) is the expected output. I turn everything into numbers, so each word is represented by a unique integer.

I give the network one array at a time, then I use the last column as the expected output to adjust the weights. I have around 300 arrays like these.
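Roughly, the encoding works like this (a sketch; the mapping and the 1/0 label values are illustrative, not my exact code):

    # numeric fields pass through; each word gets its own unique integer
    word_to_int = {}

    def encode(token):
        try:
            return float(token)
        except ValueError:
            if token not in word_to_int:
                word_to_int[token] = len(word_to_int)
            return float(word_to_int[token])

    def encode_row(line):
        *features, label = line.split()
        x = [encode(f) for f in features]
        y = 1.0 if label == "anomaly" else 0.0  # expected output from the last column
        return x, y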

As for the hidden neurons, I tried 3, 6, and 20, but nothing changed.

EDIT:

To update the weights, I calculate the gradient for the output and hidden layers. Then I calculate the deltas and add them to their associated weights. I don't understand how that is ever going to learn to map multiple inputs to multiple outputs. It looks linear.
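In code, one update step looks roughly like this (a sketch with made-up names, assuming a squared-error loss; my real code is organised differently):

    import numpy as np

    def train_step(x, target, W_hidden, W_output, learning_rate=0.1):
        # forward pass: sigmoid hidden layer, tanh output layer
        hidden = 1.0 / (1.0 + np.exp(-np.dot(W_hidden, x)))
        output = np.tanh(np.dot(W_output, hidden))

        # gradient ("delta") at the output layer: error times tanh derivative
        output_delta = (target - output) * (1.0 - output ** 2)
        # gradient at the hidden layer: back-propagated error times sigmoid derivative
        hidden_delta = np.dot(W_output.T, output_delta) * hidden * (1.0 - hidden)

        # add the deltas (scaled by the learning rate) to the associated weights
        W_output += learning_rate * np.outer(output_delta, hidden)
        W_hidden += learning_rate * np.outer(hidden_delta, x)
        return output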

2
Please give an example of your "multiple training sets". How many hidden neurons do you use? In which order do you present the training set? - Jonas Bötel
I don't get it. Your previous example suggested 3 inputs, 2 outputs. But the presented training arrays suggest approximately 40 inputs and one output. Which one is true? - Jonas Bötel
The previous example was just there to illustrate the idea behind my problem. The added example training data is the actual data I'm working on. But I don't see how that matters, because the problem still occurs regardless of the number of neurons in each layer. - user3140280
How many of your 300 arrays are classified "anomaly"? How often do you present each one to your training phase? Is there a coefficient in your backprop that weights the current input in relation to what the network has already learned? As I understand it your network "forgets" the answer to the first presented training array after a couple of steps. Is that what you meant to say? - Jonas Bötel
I have the same problem. Did you get any idea about solving this problem? - Tengiz

2 Answers

1
votes

If you train a neural network for too many back-propagation iterations on a single data set, the weights will eventually converge to a state that gives the best outcome for that specific training set (this is overtraining, or overfitting, in machine-learning terms). The network will only learn the relationship between input and target data for that specific training set, not the broader, more general relationship you might be looking for. It's better to merge your distinct sets and train the network on the full set.
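In practice that means each epoch should loop over the entire merged training set, usually in a shuffled order, instead of running many consecutive updates on one example. A rough sketch, assuming a train_step(x, target) method like the one described in the question:

    import random

    def train(network, dataset, epochs=300):
        # dataset: list of (input_vector, target_vector) pairs from the merged set
        for epoch in range(epochs):
            random.shuffle(dataset)            # vary the presentation order each epoch
            for x, target in dataset:
                network.train_step(x, target)  # one small weight update per example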

Without seeing the code for your back-propagation algorithm I can't say whether it's working correctly. One problem I had when implementing back-propagation was not properly calculating the derivative of the activation function at the input value. This website was very helpful for me.
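For the activations mentioned in the question, the derivatives can be written in terms of the function values themselves, which is the part that is easy to get wrong (a quick sketch):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_prime(x):
        s = sigmoid(x)
        return s * (1.0 - s)          # derivative of the sigmoid

    def tanh_prime(x):
        return 1.0 - np.tanh(x) ** 2  # derivative of tanh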

-1
votes

No, neural networks are not supposed to know multiple tricks; you train them for a specific task.

Yes, they can be trained for other tasks as well, but then they become optimized for that other task.

That's why you should create load and save functions for your network, so you can easily switch "brains" and perform other tasks if required. If you're not sure which task the current input belongs to, train a neural network to find the difference between the tasks.
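A minimal sketch of such save/load functions, assuming the weights are kept as NumPy arrays (the names are illustrative):

    import numpy as np

    def save_brain(path, W_hidden, W_output):
        # store both weight matrices in one .npz file
        np.savez(path, W_hidden=W_hidden, W_output=W_output)

    def load_brain(path):
        data = np.load(path)
        return data["W_hidden"], data["W_output"]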