0
votes

Let's say we have n images of cat and dog separately and we trained an image classification model to classify a new image with a probability score saying whether it's a cat or a dog.

Now, we are getting images that contains multiple cats and dogs in same image, how can we detect and localize objects(cats and dogs here)?

If it is possible, can we also depict the focus areas considered by model for prediction so that a bounding box could be drawn?

2
i'd say with a training methodology that you have mentioned it would be rather difficult. The model takes the whole image and puts it into one bin. Multi-class classification for a single image would not be allowed by your model.learner
One way I can think of this being achievable is you train another model to extract animals from an image and then feed this to the trained model above. Suppose there are two cats and three dogs in an image, so for that image, you'll have 5 sub-images. These can then be passed through to the trained network and you can make a combined estimate laterlearner
I am trying this approach, what am i trying is to have the earlier dataset as it is with drawing a boundary box around complete image(containing single animal). Let me tell you animal is just an instance i have taken, i do have some other application. I am saving it's xml and also taking new data containing multiple animals and annotating them. I will try mix and feed approach with a object detection technique. Any suggestion here would be most welcome..Ashutosh Srivastava
Once you have the bounding box, you can use cv or any other tool box to extract pixels from the image and form sub-images. Pass these sub-images to the next network and until all the sub-images are fed to the second network, put all the predictions into one array. Finally annotate on the original image using one-one correspondence of the image and the bounding box location. Good luck. Also mention if I can write this as an answer.learner

2 Answers

0
votes

I think you have trouble understanding how a basic object detection works. I will recommend you to read this paper first:

https://arxiv.org/pdf/1807.05511.pdf

0
votes

It is possible. You can use Yolo like this example. There is a keras classification based solution.