4
votes

I was trying to create a convolution neural network for the recognition of animals, vehicles, buildings, trees, plants from a large data-set having the combination of these objects.

At the time of training I got a doubt about the way in which the network should be trained. My doubt is that whether I could train the network with the data-set of whole animals as a single attribute or train each animals separately?

Means, one group for lions, one for tigers, one for elephants etc and at the time of testing I can code it to output the result as animal if any one of its subcategory is satisfied.

I got this doubt since I have read that there should be a correct pattern in the data-set for the efficient detection and there should be a pattern only if we are training with the subcategory of objects than the vast data-set.

I have attached a figure showing the sample dataset(only logically correct). I want to know whether there should be separate data-set or single data-set.

sample image

1
The answer is completely dependent on your use case - do you plan to only identify a generic label such as "animal" or labels such as "lion"/ "tiger". This is true for any algorithm you are applying for this problem i.e using CNN does not make a difference here.shekkizh
It means, the convolution neural network can find out the similarities in the data-set (even they have minimal similarity) and could recognize new data coming for testing, doesn't it?Arun Sooraj
Yes. CNN will be able to find high level features that can identify the classes with enough data and proper training - you want to define your network such that it generalizes well.shekkizh
ok... Thank you Shekkizh...Arun Sooraj

1 Answers

2
votes

Training on a separate data-set or a single data-set will depend on a variety of factors. If you want to classify the images in your test dataset using the Convolution Neural Network into just animals and not further subdivide them, then training on a single-data should be done. However, if you plan to further sub classify the images into tigers and lions, then the training needs to be done on separate datasets of tigers and lions.

The type of the dataset that you use for training will highly depend on your requirements of classification on the test dataset.

Moreover, you have to make sure that you normalize the images before you use it for training.