I'm implementing Andrew Ng's OCR neural network example in C#. I think I've implemented all the formulas correctly (vectorized version), and my app is training the network.

Any advice on how I can see my network's recognition improving, without manually testing examples by drawing them after training is done? I want to see where the training is going while it is running.

I've tested my trained weights on a hand-drawn digit. The output is nearly the same on all neurons (approx. 0.077 each), and the largest value is on the wrong neuron, so the result doesn't match the drawn image.

This is the only test I'm doing so far: tracking how the cost function changes over epochs.

[Figure: cost function vs. epochs]

So, this is what happens to the cost function (some call it the objective function) over 50 epochs. My lambda is 3.0, the learning rate is 0.01, and there are 5000 training examples. I use batch updates: the weights are updated once per epoch, i.e. after all 5000 examples. Activation function: sigmoid.

Layer sizes: input 400, hidden 25, output 10.

I don't know what proper values are for lambda and learning rate so that my network can learn without overfitting or underfitting.

Any suggestions on how to find out whether my network is learning well?

Also, what value should the cost function J have after all this training? Should it approach zero?

Should I have more epochs?

Is it bad that my examples are all ordered by digits?

Any help is appreciated.

1 Answer

Q: Any suggestions on how to find out whether my network is learning well?
A: Split the data into three groups: training, cross-validation, and test. Validate your result on the test data. This is actually addressed later in the course.
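
To make that concrete, here is a minimal sketch in Python (the same logic ports directly to C#). The function names `split_indices` and `accuracy`, the 60/20/20 proportions, and the 0-based labels are my own assumptions, not something from the course. The idea is to shuffle before splitting, since the examples are ordered by digit, and then to report the fraction of cross-validation examples classified correctly after each epoch instead of drawing digits by hand.

```python
import random

def split_indices(n_examples, train_frac=0.6, cv_frac=0.2, seed=0):
    """Return shuffled (train, cv, test) index lists covering 0..n_examples-1."""
    indices = list(range(n_examples))
    # Shuffle first: with data ordered by digit, an unshuffled split
    # would leave some digits out of the training set entirely.
    random.Random(seed).shuffle(indices)
    n_train = int(round(n_examples * train_frac))
    n_cv = int(round(n_examples * cv_frac))
    train = indices[:n_train]
    cv = indices[n_train:n_train + n_cv]
    test = indices[n_train + n_cv:]  # whatever remains
    return train, cv, test

def accuracy(outputs, labels):
    """Fraction of examples whose largest output neuron matches the label.

    outputs: one list of output activations per example
    labels:  one integer class index per example (assumed 0-based here)
    """
    correct = 0
    for activations, label in zip(outputs, labels):
        # Predicted class = index of the output neuron with the largest value.
        predicted = max(range(len(activations)), key=lambda k: activations[k])
        if predicted == label:
            correct += 1
    return correct / len(labels)
```

Calling `accuracy` on the cross-validation set once per epoch gives a number that should rise as the network learns. If training accuracy keeps rising while cross-validation accuracy stalls or falls, that points toward overfitting.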

Q: Also, what value should the cost function J have after all this training? Should it approach zero?
A: I recall that Ng mentioned the expected value in the homework. The regularized cost should not reach zero, since it includes a sum of the squared weights.
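
For reference, this is the form of the regularized cost as I remember it from the course, written for m examples, K = 10 output units, and the layer sizes in the question (so the weight matrices are 25 × 401 and 10 × 26 including bias columns, which are excluded from the penalty). Treat the exact indexing as my recollection, not a quote from the exercise.

```latex
J(\Theta) = \frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}
  \left[ -y_k^{(i)} \log\big(h_\Theta(x^{(i)})_k\big)
         - \big(1 - y_k^{(i)}\big) \log\big(1 - h_\Theta(x^{(i)})_k\big) \right]
  + \frac{\lambda}{2m}
  \left[ \sum_{j=1}^{25}\sum_{k=1}^{400} \big(\Theta^{(1)}_{j,k}\big)^2
       + \sum_{j=1}^{10}\sum_{k=1}^{25} \big(\Theta^{(2)}_{j,k}\big)^2 \right]
```

The first term is never negative, and with lambda = 3.0 the second term is positive unless every non-bias weight is zero, so J stays above zero even when the network fits the training data well.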

Q: Should I have more epochs?
A: If you run your program long enough (less than 20 minutes?), you will see that the cost stops getting smaller. At that point I would assume it has reached a local or global optimum, so more epochs would not be necessary.
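
One way to automate that check, sketched in Python under my own assumptions (the name `has_plateaued`, the 5-epoch window, and the 1e-4 threshold are arbitrary choices to tune, not values from the course): record the cost after every epoch and stop once it has barely moved over the last few epochs.

```python
def has_plateaued(cost_history, window=5, min_rel_improvement=1e-4):
    """True when the cost fell by less than min_rel_improvement (relative
    to its earlier value) over the last `window` epochs."""
    if len(cost_history) <= window:
        return False  # not enough epochs recorded to judge yet
    old = cost_history[-window - 1]
    new = cost_history[-1]
    # Also returns True when the cost went up (old - new is negative).
    return (old - new) < min_rel_improvement * abs(old)
```

Inside the training loop you would append the epoch's cost to a list and break out when `has_plateaued` returns true. Note that it also fires when the cost is rising, which is worth inspecting separately because it may mean the learning rate is too large.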

Q: Is it bad that my examples are all ordered by digits?
A: If the algorithm modifies the weights after every example, the order of the data does affect each step. With batch updates as you describe, where the gradient is accumulated over all 5000 examples before the weights change, the final result should not differ much.