I have been trying to train an object detection model for the past two months and have finally succeeded by following this tutorial.
Here is my Colab, which contains all my work.
The problem is that the training loss is shown, and it is decreasing on average, but the validation loss is never shown at all.
In the pipeline.config file, I did point the evaluation input at a TFRecord file (which I assumed to be the validation data input), like this:
eval_config {
  metrics_set: "coco_detection_metrics"
  use_moving_averages: false
}
eval_input_reader {
  label_map_path: "annotations/label_map.pbtxt"
  shuffle: false
  num_epochs: 1
  tf_record_input_reader {
    input_path: "annotations/test.record"
  }
}
I also read through model_main_tf2.py, which does not seem to evaluate while training; it only runs evaluation when checkpoint_dir is passed.
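If I am reading the flags correctly, evaluation is meant to run as a second process that watches the training directory. Something like the cells below is what I would expect (the model_dir and pipeline paths are placeholders from my own setup, not the tutorial's exact ones):

    # Training job (what I am already running):
    !python model_main_tf2.py \
        --pipeline_config_path=models/my_model/pipeline.config \
        --model_dir=models/my_model

    # Separate evaluation job: passing --checkpoint_dir makes the script
    # poll model_dir for new checkpoints and write the eval metrics
    # (validation loss and COCO mAP) as TensorBoard event files, as far
    # as I can tell under an eval subfolder of model_dir.
    !python model_main_tf2.py \
        --pipeline_config_path=models/my_model/pipeline.config \
        --model_dir=models/my_model \
        --checkpoint_dir=models/my_model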
So far, I have only been able to monitor the loss on the training set, not the loss on the validation set. As a result, I have no idea whether the model is overfitting or underfitting.
Have any of you managed to use model_main_tf2.py successfully to view validation loss?
Also, it would be nice to see the mAP score during training.
I know Keras training lets you see all of these things in TensorBoard, but the OD API seems to be much harder to work with.
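For reference, this is how I have been viewing the training events inside the Colab, so I would hope any eval events would show up in the same dashboard (the logdir is again a placeholder from my setup):

    %load_ext tensorboard
    %tensorboard --logdir=models/my_model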
Thank you for your time. If anything is still unclear, please let me know.