Keras - Binary Classification

Question

I am currently trying to use satellite imagery to recognize Apples orchards. And I am facing a small problem in the number of representative data for each class.

In fact my question is :

Is it possible to take randomly some different images in my "not-apples" class at each epoch because I have much more of theses (compared to the "apples" one) and I want to increase the probability my network will classify out an image unrepresentative.

Thanks in advance for your help

dhinckley dhinckley · Accepted Answer · 2017-05-01T13:57:59

That is not possible in Keras. Keras will, by default, shuffle your training data and then train on it in a mini-batch fashion. However, there are still ways to re-balance your dataset.

The imbalanced training data problem that you are facing is pretty common. You have many options available to you; I list a few below:

You can adjust the relative weights of your classes using class_weight keyword of the model.fit() function.
You can "up-sample" your "apples" class or "down-sample" your "non-apples" class to have equal numbers of both classes during training.
You can generate synthetic images of your "apples" class to augment your data set. To this end, the ImageDataGenerator class in Keras can be particularly useful. This Keras tutorial is a good introduction to its usage.

In my experience, I've found #2 and #3 to be most useful. #1 is limited by the fact that the convergence of stochastic gradient descent suffers when using class weights differing by a couple orders of magnitude and smaller batch sizes.

Jason Brownlee has put together a list of tactics for dealing with imbalanced classes that might also be useful to you.

Keras - Binary Classification

1 Answers