I'm developing a gesture recognition program for a phone. The goal is to let users draw their own "patterns", and then have those patterns trigger different actions.
Storing the pattern - the "pattern save" algorithm as I call it
This runs when the gesture is originally drawn and recorded. It's also the algorithm I use to grab what the user draws and prepare it for comparison:
- The user starts drawing their pattern. For every 15 pixels of movement, a point is added to a list, referred to as "the list".
- Once the pattern has been drawn, the first and last points are removed from the list.
- Each pair of consecutive points in the list is converted into one of 8 compass directions (a direction enumeration), and these directions are added to a new list, which I'll now call "the list" instead.
- Filter 1 begins, going through the list 3 directions at a time. If the left direction is the same as the right direction, the middle direction is removed.
- Filter 2 begins, removing consecutive duplicate directions.
- Filter 3 begins, removing assumed noise. Assumed noise is detected as pairs of duplicate directions occurring again and again (as an example, "left upper-left left upper-left" is turned into "upper-left" or "left").
- Filter 4 begins, removing even more assumed noise. This time, assumed noise is detected by again comparing 3 directions at a time, as in Filter 1, but the outer directions are checked for being almost equal rather than entirely equal (as an example, "left" is almost equal to "upper-left" and "lower-left").
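To make the steps above concrete, here is a minimal sketch in Python of the direction quantization and the first two filters. All names here (DIRECTIONS, points_to_directions, etc.) are my own, not from the post, and I use math-convention angles (y grows upward; on a phone screen y grows downward, so the vertical directions would flip):

```python
import math

# Assumed 8-direction enumeration, counter-clockwise from "east" (right).
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def points_to_directions(points):
    """Convert each pair of consecutive sampled points into one of 8 directions."""
    dirs = []
    for (x1, y1), (x2, y2) in zip(points, points[1:]):
        angle = math.atan2(y2 - y1, x2 - x1)       # -pi .. pi
        index = round(angle / (math.pi / 4)) % 8   # snap to nearest 45-degree sector
        dirs.append(DIRECTIONS[index])
    return dirs

def filter_middle(dirs):
    """Filter 1: slide a 3-wide window; if left == right, drop the middle."""
    out = list(dirs)
    i = 1
    while i < len(out) - 1:
        if out[i - 1] == out[i + 1]:
            del out[i]
        else:
            i += 1
    return out

def dedupe(dirs):
    """Filter 2: collapse runs of identical consecutive directions."""
    out = []
    for d in dirs:
        if not out or out[-1] != d:
            out.append(d)
    return out
```

For example, a straight stroke to the right becomes a single "E" after deduplication. Filters 3 and 4 would follow the same sliding-window pattern with a looser "almost equal" test.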
The list of directions is now stored in a file. The direction list is saved as the gesture itself, to be used for comparison later.
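The post doesn't specify a file format, so as one simple assumed scheme: one gesture per line, a name, then its directions separated by spaces.

```python
# Hypothetical persistence format (my assumption, not from the post):
# each line is "name:DIR DIR DIR ...".

def save_gesture(path, name, directions):
    """Append one named gesture (a list of direction strings) to the file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(name + ":" + " ".join(directions) + "\n")

def load_gestures(path):
    """Read the file back into a dict of name -> direction list."""
    gestures = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            name, _, dirs = line.strip().partition(":")
            gestures[name] = dirs.split()
    return gestures
```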
Comparing the pattern
Once a user then draws a pattern, the "pattern save" algorithm is run on that pattern as well, but only to filter out noise, not to actually save it (since that would be pointless).
This filtered pattern is then compared with all current patterns in the gesture list. The comparison method is hard to describe, and my English isn't as good as it should be.
In short, it goes through the gesture the user drew, and for each direction in it, compares with the corresponding direction in each stored gesture. If a direction is similar (in the "almost equal" sense from the algorithm above), that's okay, and it continues to check the next direction. If it's not similar 2 times in a row, the gesture is considered a non-match.
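The comparison rule above could be sketched like this (again, names and details are my own assumptions; the sketch ignores gestures of different lengths, which the real code would need to handle):

```python
# Assumed 8-direction enumeration, counter-clockwise from "east" (right).
DIRECTIONS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def almost_equal(a, b):
    """True if a and b are the same direction or one 45-degree step apart."""
    diff = abs(DIRECTIONS.index(a) - DIRECTIONS.index(b)) % 8
    return min(diff, 8 - diff) <= 1

def matches(drawn, stored):
    """Walk both direction lists in parallel; two misses in a row = non-match."""
    misses = 0
    for a, b in zip(drawn, stored):
        if almost_equal(a, b):
            misses = 0
        else:
            misses += 1
            if misses >= 2:
                return False
    return True
```

One design note: because a single miss is forgiven whenever the next direction matches, two quite different shapes can still pass, which may be related to the "M" vs. upside-down "V" confusion below.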
Conclusion
All of this is developed by myself, since I love doing what I do. I'd love to hear if there is anywhere on the Internet where I can find resources on something similar to what I am doing.
I do not want any neural network solutions. I want it to be "under control" so to speak, without any training needed.
Feedback would be great too, if you can see any way I could improve the above algorithm.
You see, it works fine in some scenarios. But, for instance, when I draw an "M" and an upside-down "V", it can't tell the difference between them.
Help would be appreciated. Oh, and vote up the question if you think I described everything well!