2 votes

I have a problem with the Object Detection API: my loss is very low from the start of training:

INFO:tensorflow:global step 3: loss = 1.6555 (1.949 sec/step)

INFO:tensorflow:global step 4: loss = 1.1560 (2.021 sec/step)

INFO:tensorflow:global step 5: loss = 1.7363 (2.201 sec/step)

The mAP after a few thousand steps is around 0.25... Does TensorFlow save some additional "checkpoint" anywhere else besides the training folder? It seems like the model is using weights from a previously trained network, even though the paths in my config point to a fresh checkpoint and everything is in different folders :(
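For reference, the checkpoint-related part of my faster_rcnn_resnet101 pipeline config looks roughly like this (paths are placeholders, not my real ones):

    train_config: {
      batch_size: 1
      # should point to the downloaded COCO pre-trained checkpoint,
      # not to any previous training run of mine
      fine_tune_checkpoint: "/path/to/faster_rcnn_resnet101_coco/model.ckpt"
      from_detection_checkpoint: true
      num_steps: 200000
    }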

I was following this blog tutorial [https://medium.com/towards-data-science/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9] on how to train a model with the Object Detection API. I used my own dataset with one class (just a few images, to test that it works) and my own config (faster_rcnn_resnet101). It did pretty well; the model was able to detect the class in new images. Then I tried to train with 4 classes and a few more images, and that worked nicely too.
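In case it matters, I launch training the same way the tutorial does, roughly like this (paths shortened):

    python object_detection/train.py \
        --logtostderr \
        --pipeline_config_path=/path/to/faster_rcnn_resnet101.config \
        --train_dir=/path/to/training_dir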

But now I want to train on a larger set of 400 images. First I tried 4 classes and the low-loss problem appeared, so I tried with just one class (I even relabeled my dataset to have only one class) and got the same result. Now it is not working even with the previous config on the few images I was training on from the beginning.

Can someone help, please? Since even the 5-image config that worked before is no longer working, I don't think it's related to a bad dataset or anything like that.. :(

1 Answer

1 vote

While this is not a definitive answer, I can comfort you by saying you're not alone. I'm using the same model and also experiencing the low loss at the start. My initial training was a total failure, so I decided to sort out my dataset properly and retrain the model using a higher-accuracy preset (previously I used MobileNet), and that's when I ran into the low initial loss problem.

For now I'll just roll with it and let you know whether it works as intended; maybe it is supposed to be that way (though I doubt it very much).

EDIT - As someone mentioned, I didn't actually answer the question, so here is my update on the matter. I trained quite a few models with a very low starting loss, and as training went on the loss reached something like 0.002 on average. What I found out (at least in my case) is that the longer you train such a model, the worse the results it yields. My best performer was frozen at around 2.5k steps, while the ones that ran for 10k+ steps were extremely prone to false positives. I guess if you encounter a very low starting loss, you shouldn't let it run for a long time. I can't explain why this happens since I'm quite new to machine learning, but this is my personal experience. Hope this helps at least a little bit.
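By "frozen at around 2.5k steps" I mean exporting the checkpoint from that step with the API's export script, roughly like this (paths are placeholders):

    python object_detection/export_inference_graph.py \
        --input_type image_tensor \
        --pipeline_config_path /path/to/faster_rcnn_resnet101.config \
        --trained_checkpoint_prefix /path/to/training_dir/model.ckpt-2500 \
        --output_directory /path/to/exported_model_2500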