mAP decreasing with training tensorflow object detection SSD

Question

I'm trying to train an SSD MobileNet detector to detect cell nuclei in microscopic images. I'm using the TensorFlow Object Detection API on Ubuntu 16.04 with the GPU build of TensorFlow (version 1.4). My input images are 256x256 RGB jpg tiles with annotated cell nuclei.

When I start training I see a nice increase in mAP, and at about 6k global steps (batch size 12) I can detect most cell nuclei, although some nuclei are detected more than once.

Weirdly, after this point mAP starts decreasing and the model detects fewer and fewer cell nuclei even though the TotalLoss continues to decrease. At 100k steps, almost no nuclei are detected.

I use the standard config file for SSD except that I've decreased the cutoff for matched/unmatched boxes. If I don't use this modification the model has difficulties detecting any cell nuclei because they are smallish objects and too few boxes overlap them.

matcher {
  argmax_matcher {
    matched_threshold: 0.3
    unmatched_threshold: 0.3
    ignore_thresholds: false
    negatives_lower_than_unmatched: true
    force_match_for_each_row: true
  }
}

Why do mAP and detection accuracy decrease over time even though TotalLoss keeps improving? My reading of the results is that the detection model is becoming more and more precise (never a false positive) but less and less sensitive (lots of false negatives).

Any suggestions greatly appreciated!

(here are some example images from tensorboard)

frostell frostell · Accepted Answer · 2018-02-17T22:20:56

OK, so after some experimentation (= blind guessing) with the config file I think I found the answer to my question. I'm putting it here hoping that someone else can benefit.

First, the reason for mAP decreasing was probably the setting:

matched_threshold: 0.3
unmatched_threshold: 0.3

From my experimentation, lowering this setting (like I did) below 0.5 seems to destabilise the model and make it break during training (with decreasing mAP over time).
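
To make the role of these two thresholds a bit more concrete, here is a minimal sketch of how an anchor box could be labelled from its IoU with a ground-truth box. The helper names (`iou`, `centered_box`, `label_anchor`) and the 20 px nucleus are made up for illustration, and the labelling rule is a simplified reading of the argmax matcher rather than the API's actual code.

```python
def iou(box_a, box_b):
    """Intersection over union of two (ymin, xmin, ymax, xmax) boxes."""
    ymin = max(box_a[0], box_b[0])
    xmin = max(box_a[1], box_b[1])
    ymax = min(box_a[2], box_b[2])
    xmax = min(box_a[3], box_b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def centered_box(cy, cx, size):
    """Square box of the given side length centred on (cy, cx)."""
    half = size / 2.0
    return (cy - half, cx - half, cy + half, cx + half)


def label_anchor(overlap, matched_threshold, unmatched_threshold):
    """Simplified labelling rule (assumption, not the API's code)."""
    if overlap >= matched_threshold:
        return "positive"
    if overlap < unmatched_threshold:
        return "negative"
    return "ignored"


# Hypothetical 20 px nucleus in a 256 px tile, and a perfectly centred
# square anchor at min_scale 0.2 (0.2 * 256 = 51.2 px).
nucleus = centered_box(128, 128, 20)
anchor = centered_box(128, 128, 0.2 * 256)
overlap = iou(nucleus, anchor)

print("IoU: %.3f" % overlap)
print("thresholds 0.5/0.5:", label_anchor(overlap, 0.5, 0.5))
print("thresholds 0.3/0.3:", label_anchor(overlap, 0.3, 0.3))
```

In this made-up case the smallest anchor is so much larger than the nucleus that it stays a negative at either threshold, which fits the second point below: the anchor sizes matter more than the cutoff.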

Second, when trying to detect cell nuclei in a microscopic image (this probably applies to other small objects with a known size as well), the SSD seems to be VERY sensitive to the min/max-setting in the anchor generator.

anchor_generator {
  ssd_anchor_generator {
    num_layers: 6
    min_scale: 0.2
    max_scale: 0.95
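
As a rough illustration of what these three numbers mean, the sketch below spreads the anchor scales linearly between min_scale and max_scale, one per feature-map layer, as described for SSD. The function name `ssd_layer_scales` is made up, and the sketch ignores aspect ratios and any special handling of the lowest layer, so treat the output as an approximation of what the generator produces.

```python
def ssd_layer_scales(min_scale, max_scale, num_layers):
    """Linearly spaced anchor scales, one per layer (simplified assumption)."""
    if num_layers == 1:
        return [min_scale]
    step = (max_scale - min_scale) / (num_layers - 1)
    return [min_scale + step * i for i in range(num_layers)]


image_size = 256  # tile width in pixels, as in the question
scales = ssd_layer_scales(0.2, 0.95, 6)

for layer, scale in enumerate(scales):
    # Base anchor side in pixels for a square (aspect ratio 1) anchor.
    print("layer %d: scale %.2f -> about %.0f px" % (layer, scale, scale * image_size))
```

With the values above, the smallest base anchor is about 51 px on a 256 px tile, so objects much smaller than that have little chance of a good overlap.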

When I started out (and constantly failed) I used a ballpark estimate for this setting. While playing around with different image sizes and such, the model suddenly got really good at 128x128 pixels, with mAP 0.9 and detecting more or less every cell. To figure out why it suddenly worked, I printed histograms of the relative sizes of the annotated objects in the images and realised that I had got lucky with the 128x128 model's config file and hit the range precisely.
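
The histogram check can be sketched roughly as follows. This is not the script I actually used: the boxes are made-up pixel coordinates, the helper names are hypothetical, and it assumes the anchor scale is comparable to the box side divided by the image side.

```python
import math
from collections import Counter


def relative_sizes(boxes, image_size):
    """Side of each (xmin, ymin, xmax, ymax) box relative to the image side.

    The geometric mean of width and height is used as the box "side".
    """
    sizes = []
    for xmin, ymin, xmax, ymax in boxes:
        side = math.sqrt((xmax - xmin) * (ymax - ymin))
        sizes.append(side / image_size)
    return sizes


def text_histogram(values, bin_width=0.05):
    """One text line per non-empty bin, e.g. '0.15-0.20 ##'."""
    bins = Counter(int(v / bin_width) for v in values)
    lines = []
    for b in sorted(bins):
        lines.append("%.2f-%.2f %s" % (b * bin_width, (b + 1) * bin_width, "#" * bins[b]))
    return lines


# Made-up annotations for a 128x128 tile, in pixels.
boxes = [(10, 10, 30, 30), (50, 40, 74, 64), (100, 100, 116, 116), (5, 60, 33, 88)]
sizes = relative_sizes(boxes, 128)

for line in text_histogram(sizes):
    print(line)

# Candidate values for min_scale / max_scale under the assumption above.
print("suggested min_scale: %.3f" % min(sizes))
print("suggested max_scale: %.3f" % max(sizes))
```

In practice the boxes would come from the annotation files, and it is probably worth trimming outliers before taking the minimum and maximum.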

I then went back to all the other models and sizes. When I use the exact range of sizes of the cell nuclei for a given image size, the model performs perfectly, even at larger image sizes (e.g. 512px) where the nuclei only take up 3-15% of the image width. Even at 1024px with downsampling to 512, where the nuclei only cover 1-7% of the image width, the model performs OK as long as the size range is precisely specified.

For my application this is actually not a problem, since I know beforehand what feature sizes to expect, but for a more general problem I'm guessing it's a weakness.
