Object detection in 1080p with SSD Mobilenet (Tensorflow API)

Question

Hello everybody,

My objective is to detect people and cars (day and night) in 1920x1080 images. For this I use the TensorFlow API with an SSD MobileNet model. I annotated 1000 images (900 for training, 100 for evaluation) from 7 different cameras, and I launch the training with an image size of 960x540. My model does not converge. I do not know what to do. Should I make different classes for day and night objects?

In a tutorial for face detection with the TensorFlow API, they train on a dataset of images containing only faces, then use the model on complex scenes. Is this a good idea, given that a model like SSD also learns from negative examples?

Thank you

(sources: https://blog.usejournal.com/face-detection-for-cctv-surveillance-6b8851ca3751)

Carlo Carlo · Accepted Answer · 2019-11-04T09:36:34

What do you mean by "not converge"? Are you referring to the train/validation loss?
In this case, the first thing that comes to my mind is to reduce the learning rate (I had a similar problem). You can do it by modifying your configuration file: in the "train_config" section you'll find the value "initial_learning_rate".
Try setting it to a lower value (say, an order of magnitude lower) and see if it helps.
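
As a rough sketch of that change: the pipeline configuration is a plain-text file, so the value can be edited by hand or with a small script like the one below. The helper name `scale_initial_learning_rate` is hypothetical, and the sample optimizer block is only an assumption about what the config looks like; the nesting and the starting value in your own file may differ.

```python
import re

# Matches entries such as "initial_learning_rate: 0.004" in a pipeline config.
_LR_PATTERN = re.compile(r"(initial_learning_rate\s*:\s*)([0-9.eE+-]+)")


def scale_initial_learning_rate(config_text, factor=0.1):
    """Return config_text with every initial_learning_rate multiplied by factor."""

    def _replace(match):
        new_value = float(match.group(2)) * factor
        # .6g keeps the number short and hides floating-point noise.
        return f"{match.group(1)}{new_value:.6g}"

    return _LR_PATTERN.sub(_replace, config_text)


# Illustrative fragment only (assumed layout, not taken from the question).
sample = """
train_config {
  optimizer {
    rms_prop_optimizer {
      learning_rate {
        exponential_decay_learning_rate {
          initial_learning_rate: 0.004
        }
      }
    }
  }
}
"""

updated = scale_initial_learning_rate(sample)
print(updated)
```

To apply it to a real file, read the config text, pass it through the function, and write the result to a new file so the original stays available for comparison.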
