1
votes

I am developing a convolutional neural network (CNN) for image classification.

The dataset available to me is relatively small (~35k images across the train and test sets). The images in the dataset vary in size: the smallest is 30 x 77 and the largest is 1575 x 5959.

I saw this post about how to deal with images that vary in size. It identifies the following methods:

  • "Squash" the images, i.e. resize them to the target dimensions without maintaining the aspect ratio.

  • Center-crop the images to a specific size.

  • Pad the images with a solid color to a square size, then resize.

  • A combination of the above.

These seem like reasonable suggestions, but I am unsure which approach is most relevant for my situation, where the images differ significantly in size. I was thinking it makes sense to resize the images while maintaining the aspect ratio (so every image has the same height), and then take a center crop of those images.
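To make that concrete, here is a sketch of the resize-then-center-crop idea. It uses a hand-rolled nearest-neighbor resize just to keep the example dependency-free; a real pipeline would more likely use PIL or tf.image, and very wide-aspect images may still need padding when the resized width comes out smaller than the crop width.

```python
import numpy as np

def resize_keep_aspect(img, target_h):
    """Nearest-neighbor resize to a fixed height, preserving aspect ratio."""
    h, w = img.shape[:2]
    new_w = max(1, round(w * target_h / h))
    rows = np.minimum((np.arange(target_h) * h / target_h).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) * w / new_w).astype(int), w - 1)
    return img[rows][:, cols]

def center_crop_width(img, target_w):
    """Keep the central target_w columns (no-op if the image is narrower)."""
    w = img.shape[1]
    left = max(0, (w - target_w) // 2)
    return img[:, left:left + target_w]

# A 30 x 77 image (the smallest in the dataset) resized to height 128,
# then center-cropped to a 128 x 128 square:
img = np.zeros((30, 77, 3), dtype=np.uint8)
out = center_crop_width(resize_keep_aspect(img, 128), 128)
```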

Does anyone else have any thoughts?

This is something you need to answer by experimentation, not something we can answer. – Dr. Snoopy
Not sure why this comment was appropriate. I wasn't expecting someone to do the work for me; I was just hoping to understand what people have tried in the past, or whether I was missing anything worth trying. – Zak Ray Sick
Please check what is on-topic for this site at stackoverflow.com/help/on-topic ; you'll see that your question is off-topic for Stack Overflow. And in the end what I said stands: you want to know which approach is relevant, and you can only get that through experimentation. – Dr. Snoopy

1 Answer

6
votes

The first important question is: will resizing degrade the images?

Are your desired elements in the image all reasonably in the same scale despite the image size?

  • If yes, you should not resize; use a model with variable input sizes (there is a minimum size, though).
  • If no, will resizing bring your desired elements to a similar scale?
    • If yes: resize!
    • If no: better to consider the other solutions.

Of course you can have a model that identifies your elements at many different scales, but the bigger the differences, the more powerful the model must be (I believe this statement is pretty reasonable).

Keras offers you the possibility of working with different image sizes (you don't really need them to have all the same size).

For that, you just need to specify the input_shape=(None,None,input_channels).
Notice that you will need to take care of compatibilities if you're going to create and merge branches.

With varying shapes, you will not be able to use Flatten layers, though. You will need GlobalMaxPooling2D or GlobalAveragePooling2D. Some other layers are also limited to fixed sizes, but convolutional, pooling and upsampling layers are ok.
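A minimal sketch of such a model, assuming 3-channel input and an illustrative 10 classes (both are assumptions, not from the question): the GlobalAveragePooling2D layer collapses the variable spatial dimensions into a fixed-length vector, so the same model accepts inputs of different sizes.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

def build_model(num_classes=10, channels=3):
    # None, None spatial dims = variable height and width
    inputs = tf.keras.Input(shape=(None, None, channels))
    x = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(x)
    # Collapses variable H x W feature maps to a fixed-size vector,
    # which a Flatten layer could not do here:
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

model = build_model()
# Two images of different sizes pass through the same model:
a = model.predict(np.zeros((1, 64, 100, 3), dtype="float32"), verbose=0)
b = model.predict(np.zeros((1, 200, 48, 3), dtype="float32"), verbose=0)
```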

The hard part is that you can't put images of different sizes in a single numpy array. Then you can:

  • resize into groups of the same size (avoiding huge variations within a group) to make training easier;
  • simply not resize, and train on the images one by one (batch size 1);
  • keep the aspect ratio and pad the sides.

But the best answer depends on your tests.