Overflowing of neural network weights in training

Question

I'm training my neural network to classify some things in an image. I crop 40x40 pixels images and classify it that it as some object or not. So it has 1600 input neurons, 3 hidden layers (500, 200, 30) and 1 output neuron that must say 1 or 0. I use the Flood library.

I cannot train it with QuasiNewtonMethod, because it uses a big matrix in the algorithm and it do not fit in my memory. So I use GradientDescent and the ObjectiveFunctional is NormalizedSquaredError.

The problem is that by training it overflows the weights and the output of the neural network is INF or NaN for every input.

Also my dataset is too big (about 800mb when it is in CSV) and I can't load it fully. So I made many InputTargetDataSets with 1000 instances and saved it as XML (the default format for Flood) and training it for one epoch on each dataset randomly shuffled. But also when I train it just on one big dataset (10000 instances) it overflows.

Why is this happening and how can I prevent that?

LadyWoodi LadyWoodi · Accepted Answer · 2012-09-03T11:02:48

I would recommend normalization of inputs. You should also think about that if you have 1600 neurons..output of input layer will sum(if sigmoid neurons) and there can be many problems.

It is quite useful to print out some steps..for example in which step it overflows.

There are some tips for weights of neurons. I would recommend very small < 0.01. Maybe if you could give more info about NN and intervals of inputs, weights etc. I could give you some other ideas.

And btw I think it is mathematically proved that two layers should be enough so there is no need for three hidden layers if you are not using some specialized algorithms which simulate human eye..

Overflowing of neural network weights in training

1 Answers