0
votes

I want to implement an application, that is able to recognize pictures from camera input. I don't mean classification of objects, but rather detecting the exact single image from given set of images. So if I for example have an album with 500 pictures, then if I point a camera to one of them, then application will be able to tell it's filename. Most of tutorials I find about CoreML is strictly for image classification (recognizing class of object) and not about recognizing exact image name in camera. This needs to work from different angles as well, and all I can have for training the network is this album with many different pictures (single picture for single object). Can this be somehow achieved? I can't use ARKit Image Tracking, because there will be about 500 of these images, and I need to find at least a list of similar ones first with CoreML / Vision.

1

1 Answers

0
votes

I am not sure, but I guess perceptual hashing might be able to help you. It works in a way that it makes some fingerprint from the reference images, and for a given image, it extracts the fingerprints as well, and then you can find the most similar fingerprints.

in this way, even if the new image is not 100% as the image in the dataset, you still can detect it.

It is actually not very hard to implement. but if you would like, i think phash library is a good one to use.