I went through the Kinect SDK and Toolkit provided by Microsoft and tested the Face Detection sample, which worked. But how do I recognize the faces it detects? I know the basics of OpenCV (VS2010). Are there any Kinect libraries for face recognition? If not, what are the possible solutions? Are there any tutorials available for face recognition using the Kinect?
3 Answers
I've been working on this myself. At first I just used the Kinect as a webcam and passed the data into a recognizer modeled after this code (which uses Emgu CV to do PCA):
http://www.codeproject.com/Articles/239849/Multiple-face-detection-and-recognition-in-real-ti
While that worked OK, I thought I could do better since the Kinect has such awesome face tracking. I ended up using the Kinect to find the face boundaries, crop the image to them, and pass the cropped face into that library for recognition. I've cleaned up the code and put it out on GitHub; hopefully it'll help someone else.
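To be clear, what follows is not the code from that repository, just a rough sketch of the cropping step described above. It assumes the frame is a plain list of pixel rows and that the face tracker reports a bounding box as `(x, y, width, height)` in pixels; the name `crop_face` and the `margin` parameter are made up for illustration.

```python
def crop_face(frame, box, margin=0.1):
    """Return the sub-image inside `box`, padded by `margin` and clamped to the frame.

    frame  -- list of rows, each row a list of pixel values (hypothetical format)
    box    -- (x, y, width, height) of the tracked face, in pixels
    margin -- extra padding as a fraction of the box size
    """
    frame_height, frame_width = len(frame), len(frame[0])
    x, y, w, h = box
    pad_x, pad_y = int(w * margin), int(h * margin)

    # Clamp so a face near the edge of the frame does not index out of range.
    left = max(x - pad_x, 0)
    top = max(y - pad_y, 0)
    right = min(x + w + pad_x, frame_width)
    bottom = min(y + h + pad_y, frame_height)

    return [row[left:right] for row in frame[top:bottom]]


# Tiny 10x10 "image" whose pixel value encodes its position (row * 10 + column).
frame = [[r * 10 + c for c in range(10)] for r in range(10)]
face = crop_face(frame, (2, 3, 4, 4), margin=0.0)
```

The cropped image would then typically be resized to whatever fixed size the recognizer was trained on before being passed in.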
I've found a project that could be a good source for you: http://code.google.com/p/i-recognize-you/. Unfortunately (for you), its homepage is not in English. The most important parts:
- The project (with source code) is at http://code.google.com/p/i-recognize-you/downloads/list
- In the bibliography, the author mentions this site: http://www.shervinemami.info/faceRecognition.html. It seems to be a good starting point for you.
There is no built-in functionality in the Kinect SDK that provides face recognition. I'm not aware of any tutorials that cover it, but I'm sure someone has tried. It is on my short list; hopefully time will allow soon.
I would try saving the face tracking information and comparing against it for recognition. You would have a "setup" function that asks the user to stare at the Kinect and saves the points the face tracker returns. When you want to recognize a face, the user would look at the screen and you would compare the face tracker points to a database of saved faces. This is roughly how the Xbox does it.
The big trick is confidence levels. The numbers will not come back exactly as they did previously, so you will need to allow a tolerance (a buffer of acceptable values) for each feature. The code would then come back with "I'm 93% sure this is Bob".
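Here is a minimal sketch of that enroll-then-compare idea, in plain Python. It is not how the Kinect SDK or the Xbox actually does it. It assumes each capture has already been reduced to a list of numbers (for example, distances between tracked points divided by the distance between the eyes); the function names, the `threshold` default, and the exponential scoring are all arbitrary choices for illustration. The score is a similarity between 0 and 1, not a calibrated probability.

```python
import math
from statistics import mean, stdev


def enroll(samples):
    """Build a template from several captures of one face (needs at least two).

    samples -- list of feature vectors, one per capture (hypothetical features)
    Returns (per-feature means, per-feature spreads).
    """
    columns = list(zip(*samples))
    means = [mean(col) for col in columns]
    # Floor the spread so a feature that never varied cannot cause division by zero.
    spreads = [max(stdev(col), 1e-3) for col in columns]
    return means, spreads


def confidence(template, features):
    """Score in (0, 1]: 1.0 when every feature sits exactly on the enrolled mean."""
    means, spreads = template
    # Deviation of each feature, measured in units of that feature's own spread.
    deviations = [abs(f - m) / s for f, m, s in zip(features, means, spreads)]
    return math.exp(-mean(deviations) / 2)  # decay rate is arbitrary; tune on real data


def recognize(database, features, threshold=0.5):
    """Return (name, score) for the best match, or (None, score) if below threshold."""
    best_name, best_score = None, 0.0
    for name, template in database.items():
        score = confidence(template, features)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# "Setup" step: three captures per person (made-up numbers).
database = {
    "Bob": enroll([[0.50, 0.30, 0.80], [0.52, 0.31, 0.79], [0.49, 0.29, 0.81]]),
    "Alice": enroll([[0.70, 0.45, 0.60], [0.71, 0.44, 0.62], [0.69, 0.46, 0.61]]),
}

# Recognition step: a new capture that is close to Bob's but not identical.
name, score = recognize(database, [0.51, 0.30, 0.80])
if name is None:
    print("No confident match")
else:
    print("I'm {:.0%} sure this is {}".format(score, name))
```

Storing a spread per feature is what gives you the "buffer": a feature that jittered a lot during setup is forgiven more at recognition time than one that stayed steady.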