
I'm working on character recognition (and later fingerprint recognition) using neural networks, and I'm getting confused about the sequence of events. I'm training the net with 26 letters. Later I will increase this to include 26 clean letters and 26 noisy letters. If I want to recognize one letter, say "A", what is the right way to do this? Here is what I'm doing now.

1) Train the network with a 26x100 matrix; each row contains one letter from segmentation of the bmp (10x10).

2) For the test targets, however, I use my input matrix for "A". I add 25 rows of zeros after the first row so that my input matrix is the same size as my target matrix.

3) I run perform(net, testTargets, outputs), where outputs are the outputs from the net trained with the 26x100 matrix and testTargets is the matrix for "A".

This doesn't seem right, though. Is training supposed to be separate from recognizing a character? What I want to happen is as follows.

1) Train the network on an image file that I select (after processing the image into logical arrays).

2) Use this trained network to recognize letters in a different image file.

So train the network to recognize A through Z. Then pick an image, run the network to see what letters are recognized from the picked image.

Could you clarify your question? I'm not really sure what part of this process you're having trouble with. Could you also post some code? In general, trying to implement an ML method without a strong intuitive understanding of how it works is only going to end in tears. Are you having a conceptual issue, or is there a bug in your code? - Slater Victoroff
Posting my code will be a little difficult since I've implemented a GUI for it. As I was developing this program, I realized that I might be thinking about this wrong. I'm having issues with the whole procedure. When do I get to see if the network can recognize any input I give it? It was my understanding that I train the network for the letters A through Z, then I submit a logical array for any letter or letters from the image processing using bwlabel. My hope is to get a response from the network saying whether or not my letters were recognized. - roldy

1 Answer


Okay, so the question here seems to be more along the lines of "How do I neural networks?" I can outline the basic procedure here to try to solidify the idea in your mind, but as far as actually implementing it goes, you're on your own. Personally I believe that proprietary languages (MATLAB) are an abomination, but I always appreciate intellectual zeal.

The basic concept of a neural net is that you have a series of nodes arranged in layers, with weights that connect them (depending on what you want to do, you can connect each node only to the layers above and beneath it, connect every node to every other, or anything in between). Each node has a "work function" (more commonly called an activation function), which you can read as a probabilistic function giving the chance that the node, or neuron, will evaluate to "on", or 1.

The general workflow starts from whatever top-layer neurons/nodes you've got, initializing them to the values of your data (in your case, you would probably set each of these to a pixel value from your image; normalizing the pixels to binary would be simplest). Each node's value is then multiplied by a weight and fed down to your second layer, which would be considered a "hidden layer". The weighted values arriving at each hidden node are summed (either a geometric or an arithmetic sum, depending on your implementation), and that sum is passed through the work function to determine the state of the hidden node.

That last point was a little theoretical and hard to follow, so here's an example. Imagine your first layer has three nodes ([1,0,1]), and the weights connecting those three nodes to the first node in your second layer are something like ([0.5, 2.0, 0.6]). If you're doing an arithmetic sum, that means the weighting on the first node in your "hidden layer" would be

1*0.5 + 0*2.0 + 1*0.6 = 1.1

If you're using a logistic function as your work function (a very common choice, though tanh is also common) this would make the chance of that node evaluating to 1 approximately 75%.
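To make the arithmetic above concrete, here is a minimal Python sketch of a single hidden node. The function names `logistic` and `node_activation` are made up for this illustration and are not part of any library; the sketch assumes an arithmetic weighted sum and a logistic work function, as in the example.

```python
import math

def logistic(x):
    """Logistic (sigmoid) function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def node_activation(inputs, weights):
    """Arithmetic weighted sum of the inputs, then the logistic function."""
    weighted_sum = sum(i * w for i, w in zip(inputs, weights))
    return weighted_sum, logistic(weighted_sum)

# Values from the worked example: three input nodes feeding one hidden node.
weighted_sum, probability = node_activation([1, 0, 1], [0.5, 2.0, 0.6])
print(round(weighted_sum, 2), round(probability, 2))  # 1.1 0.75
```

A full layer is just this calculation repeated once per node in the layer, each node with its own list of weights.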

You would probably want your final layer to have 26 nodes, one for each letter, but you could add more hidden layers to improve your model. The letter your model predicts is then the final-layer node with the largest weighting heading in.
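Reading the prediction off the final layer could look roughly like this. It is only a sketch: `predict_letter` is a hypothetical helper, and it assumes the 26 output values are ordered A through Z, which the answer does not state.

```python
import string

def predict_letter(output_values):
    """Return the letter whose output node has the largest value.

    Assumes index 0 corresponds to "A" and index 25 to "Z".
    """
    if len(output_values) != 26:
        raise ValueError("expected one output value per letter (26)")
    best_index = max(range(26), key=lambda i: output_values[i])
    return string.ascii_uppercase[best_index]

# Made-up output values in which the third node is clearly the strongest.
outputs = [0.05] * 26
outputs[2] = 0.9
print(predict_letter(outputs))  # C
```

This is also the step that answers "which letter was recognized": you feed in one image, run it through the trained weights, and look at which output node wins.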

After you have that up and running you want to train it, because you probably just randomly seeded your weights, which makes sense. There are a lot of different methods for this, but I'll outline back-propagation, which is a very common method of training neural nets. The idea is essentially this: since you know which character the image should have been recognized as, you compare that to the one your model actually predicted. If your model predicted the correct character you're fine; you can leave the model as is, since it worked. If it predicted an incorrect character, you go back through your neural net and increment the weights that lead from the pixel nodes you fed in to the ending node for the character that should have been predicted. You should also decrement the weights that led to the character it incorrectly returned.
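The increment/decrement rule described above can be sketched as follows. Note that this is deliberately the simplified rule from this answer applied to a network with no hidden layer, which makes it closer to a multi-class perceptron update than to full back-propagation (which adjusts every weight in proportion to its share of the error). The names `predict` and `train_step`, the learning rate, and the toy 4-pixel data are all assumptions made up for illustration.

```python
import random

def predict(weights, pixels):
    """Index of the output node with the largest weighted sum."""
    scores = [sum(w * p for w, p in zip(row, pixels)) for row in weights]
    return max(range(len(scores)), key=lambda k: scores[k])

def train_step(weights, pixels, target, rate=0.1):
    """Leave the weights alone if the guess is right; otherwise raise the
    weights into the correct node and lower those into the wrong one."""
    guess = predict(weights, pixels)
    if guess != target:
        for j, p in enumerate(pixels):
            weights[target][j] += rate * p
            weights[guess][j] -= rate * p
    return guess

# Toy data (hypothetical): two 4-pixel "images", one per class.
samples = [([1, 1, 0, 0], 0), ([0, 0, 1, 1], 1)]

# Randomly seeded starting weights: one row of weights per output node.
rng = random.Random(0)
weights = [[rng.uniform(-0.5, 0.5) for _ in range(4)] for _ in range(2)]

# Training: repeated passes over the labelled examples.
for _ in range(20):
    for pixels, target in samples:
        train_step(weights, pixels, target)

# Recognition: a separate step that only reads the trained weights.
print([predict(weights, pixels) for pixels, _ in samples])
```

Training and recognition are separate here: `train_step` needs the known target, while `predict` only needs the pixels and the weights that training produced.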

Hope that helps, let me know if you have any more questions.