2
votes

I am training a classifier to recognize certain objects in an image. I am using the Watson Visual Recognition API, but I assume the same question applies to other recognition APIs as well.

I've collected 400 pictures of something - e.g. dogs.

Before I train Watson, I can delete pictures that may throw things off. Should I delete pictures of:

  1. Multiple dogs
  2. A dog with another animal
  3. A dog with a person
  4. A partially obscured dog
  5. A dog wearing glasses

Also, would dogs on a white background make for better training samples?

Watson also takes negative examples. Would cats and other small animals be good negative examples? What else?


1 Answer

5
votes

You are right that this is a general issue for all kinds of custom classifiers and recognizers, whether it's vize.it, Clarifai, IBM Watson, or training a neural network on your own in, say, Caffe (listed in ascending order of how many example images you need).

The important question to ask is: how are you going to use the classifier? What are the real images you will feed the machine at predict time? As a general rule, your training images should be as similar to predict-time images as possible, both in what they depict (the kinds and variety of objects) and in how they depict it (e.g. backgrounds). Neural networks are powerful, and if you feed them enough images, they will learn even the hard cases.

Maybe you want to find dog images in users' folders, which will include family photos, screenshots, and document scans. Reflect that variety in the training set. Ask the user whether a dog pictured with another animal should be tagged as a dog photo.

Maybe you want to find dog images on a wilderness photo trap. Just use various images taken by that photo trap (or several photo traps, if it's a whole network).
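The "reflect the variety" idea above can be sketched as a simple sampling helper. This is a minimal illustration, not part of any Watson API: the context labels (e.g. "plain background", "multiple dogs") and proportions are assumptions you would choose based on your predict-time data.

```python
import random

def sample_training_set(images_by_context, target_size):
    """Draw a training sample whose mix of contexts (hypothetical labels
    like 'plain background' or 'multiple dogs') mirrors the proportions
    present in the pool you collected, so the training set reflects the
    variety you expect at predict time."""
    total = sum(len(imgs) for imgs in images_by_context.values())
    sample = []
    for context, images in images_by_context.items():
        # Allocate slots to each context proportionally to its share.
        k = round(target_size * len(images) / total)
        sample.extend(random.sample(images, min(k, len(images))))
    return sample
```

For example, if 60% of your collected dog photos have plain backgrounds and 40% are busy scenes, a 50-image sample keeps roughly that 30/20 split instead of over-representing the easy, clean shots.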

In short — tailor your sample images to the task at hand!