2
votes

I have a problem implementing a multilayer perceptron with the MATLAB Neural Network Toolbox.

I am trying to implement a neural network that recognizes a single character stored as a binary image (40x50 pixels). The image is transformed into a binary vector, and the output is encoded in 6 bits. I use the newff function as follows (with 30 neurons in the hidden layer):

net = newff(P, [30, 6], {'tansig' 'tansig'}, 'traingd', 'learngdm', 'mse');

Then I train the network with a dozen characters in 3 different fonts, using the following training parameters:

net.trainParam.epochs=1000000;
net.trainParam.goal = 0.00001;
net.trainParam.lr = 0.01;

After training, the network recognizes all characters from the training set correctly, but it recognizes no more than two characters from other fonts.

How can I improve this simple network?

3 Answers

1
votes

You can try adding random elastic distortions to your training set, both to expand it and to make the trained network generalize better.

You can find the details in this paper from Microsoft Research: http://research.microsoft.com/pubs/68920/icdar03.pdf
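A rough sketch of the idea, not the paper's own code: draw a random displacement for every pixel, smooth the two displacement fields with a Gaussian, scale them, and resample the image. The glyph, the 50-row by 40-column layout, and the values of sigma and alpha are assumptions for illustration; only base MATLAB functions are used.

```matlab
% Placeholder glyph: a filled rectangle on a 50x40 binary image (assumed layout).
img = zeros(50, 40);
img(15:35, 12:28) = 1;

sigma = 4;    % smoothing width of the displacement field (assumed value)
alpha = 30;   % displacement scale factor (assumed value)

% 1-D Gaussian kernel, applied separably along columns and then rows.
r = ceil(3 * sigma);
g = exp(-(-r:r).^2 / (2 * sigma^2));
g = g / sum(g);

% Random per-pixel displacements in [-1, 1], smoothed and scaled.
[rows, cols] = size(img);
dx = alpha * conv2(g(:), g, 2 * rand(rows, cols) - 1, 'same');
dy = alpha * conv2(g(:), g, 2 * rand(rows, cols) - 1, 'same');

% Resample the image at the displaced coordinates (bilinear, 0 outside the image).
[X, Y] = meshgrid(1:cols, 1:rows);
warped = interp2(X, Y, img, X + dx, Y + dy, 'linear', 0);

% Threshold back to a binary image.
distorted = warped > 0.5;
```

Running this several times per training image gives several distorted variants; sigma and alpha would need tuning so that the characters stay legible.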

1
votes

You have a very large number of input variables (2,000, if I understand your description). My first suggestion is to reduce this number if possible. Some possible techniques include subsampling the input variables or calculating informative features (such as row and column totals, which would reduce the input vector to 90 = 40 + 50 elements).
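For example, the row and column totals could be computed like this. It is a minimal sketch that assumes the image is stored as a 50-row by 40-column matrix; the random image is only a placeholder.

```matlab
% Placeholder binary image, assumed to be 50 rows by 40 columns.
img = rand(50, 40) > 0.5;

rowTotals = sum(img, 2);        % 50x1: number of set pixels in each row
colTotals = sum(img, 1)';       % 40x1: number of set pixels in each column

% One 90-element feature vector per image instead of 2,000 raw pixels.
features = [rowTotals; colTotals];
```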

Also, your output is coded as 6 bits, which provides 64 possible combined values, so I assume that you are using these to represent 26 letters? If so, you may fare better with another output representation. Consider that various letters which look nothing alike will, for instance, share the value of 1 on bit 1, complicating the mapping from inputs to outputs. An output representation with 1 bit for each class would simplify things.
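A one-bit-per-class target matrix can be built from class indices as follows. This is a small sketch; the label values and the mapping A=1 ... Z=26 are assumptions for illustration.

```matlab
numClasses = 26;
labels = [1 3 26];              % hypothetical class indices: A, C, Z

% Column k of the identity matrix is the target vector for class k.
I = eye(numClasses);
targets = I(:, labels);         % 26x3, one column per training sample
```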

0
votes

You could use patternnet instead of newff; it creates a network better suited to pattern recognition. As the target, use a 26-element vector with 1 in the position of the correct letter (0 elsewhere). The output will be a vector of 26 real values between 0 and 1, and the recognized letter is the one with the highest value.
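Picking the recognized letter from such an output could look like this. The output values below are made up so that the snippet runs without a trained network, and the mapping of row 1 to 'A' through row 26 to 'Z' is an assumption.

```matlab
% Hypothetical network outputs: 26 rows (classes) by 2 columns (samples).
outputs = zeros(26, 2);
outputs(1, 1)  = 0.9;           % first sample: strongest response on row 1
outputs(3, 1)  = 0.2;
outputs(26, 2) = 0.7;           % second sample: strongest response on row 26

% Index of the largest value in each column, mapped to a letter.
[~, idx] = max(outputs, [], 1);
letters = char('A' + idx - 1);
```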

Make sure to use data from all fonts for the training.

Pass all data sets as input; train will automatically divide them into training, validation, and test sets according to the specified ratios:

net.divideParam.trainRatio = .70;
net.divideParam.valRatio = .15;
net.divideParam.testRatio = .15;

(choose your own percentages).

Then evaluate using only the test set. Its indices are returned in the training record:

[net, tr] = train(net,inputs,targets);
tr.testInd
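With those indices, the test accuracy could be computed along these lines. To keep the snippet runnable on its own, targets, outputs, and tr.testInd are small made-up stand-ins (3 classes, 5 samples); in practice they would come from the train call above and from evaluating the trained network on the inputs.

```matlab
% Stand-ins for illustration only. In practice:
%   [net, tr] = train(net, inputs, targets);
%   outputs = net(inputs);
targets = [1    0    0    1    0;
           0    1    0    0    1;
           0    0    1    0    0];
outputs = [0.90 0.20 0.10 0.30 0.10;
           0.05 0.70 0.20 0.60 0.80;
           0.05 0.10 0.70 0.10 0.10];
tr.testInd = [2 4 5];

% Compare predicted and true classes on the test samples only.
[~, predicted] = max(outputs(:, tr.testInd), [], 1);
[~, actual]    = max(targets(:, tr.testInd), [], 1);
testAccuracy = mean(predicted == actual);
```

Here the fourth sample is misclassified, so two of the three test samples are correct.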