I've got a 40k image dataset of images from four different countries. The images contain diverse subjects: outdoor scenes, city scenes, menus, etc. I wanted to use deep learning to geotag images.
I started with a small network of 3 conv->relu->pool layers and then added 3 more to deepen the network since the learning task is not straightforward.
My loss is doing this (with both the 3 and 6 layer networks)::
The loss actually starts kind of smooth and declines for a few hundred steps, but then starts creeping up.
What are the possible explanations for my loss increasing like this?
My initial learning rate is set very low: 1e-6, but I've tried 1e-3|4|5 as well. I have sanity-checked the network design on a tiny-dataset of two classes with class-distinct subject matter and the loss continually declines as desired. Train accuracy hovers at ~40%