I have followed all the steps and code formats (cross-checked multiple times) and prepared the required data for training a custom object detector with the TensorFlow Object Detection API. I tried the ssd_mobilenet_v1_coco, faster_rcnn_resnet101_coco, and faster_rcnn_inception_v2_coco models and still haven't gotten a good result with any of them. All I get is misclassified objects or no bounding boxes at all.
I am training to detect a single object class with around 250 training images and 63 validation images; the images vary in size, mostly around 300 x 300 pixels or smaller. I train each model until it roughly converges (not fully). I judge this from the eval performance: past 15000 steps, the loss gradually decreases (to < 0.04) but still fluctuates. I then stop training and export the graph. My question is:
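Since the loss fluctuates, one way I could judge the trend more reliably is to smooth it with an exponential moving average, similar in spirit to the smoothing slider in TensorBoard. This is only a minimal sketch: the `smooth` helper and the loss values are hypothetical and are not taken from my actual run.

```python
def smooth(values, weight=0.9):
    """Return the exponential moving average of a sequence of loss values.

    weight is the fraction of the previous smoothed value that is kept;
    higher values give a smoother but slower-reacting curve.
    """
    if not values:
        return []
    smoothed = []
    running = values[0]  # seed the average with the first observation
    for v in values:
        running = weight * running + (1.0 - weight) * v
        smoothed.append(running)
    return smoothed


# Hypothetical loss readings for illustration only.
losses = [0.09, 0.05, 0.08, 0.04, 0.06, 0.03, 0.05, 0.035]
for raw, avg in zip(losses, smooth(losses)):
    print(f"raw={raw:.3f}  smoothed={avg:.3f}")
```

If the smoothed curve is still clearly going down, the model has probably not converged yet; if it is flat, more steps alone are unlikely to help.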
I have a strong suspicion about the test video I have been given for object detection. The video frames are quite large, 1370 x 786 pixels, and the object I need to detect is quite small relative to the frame. Is this causing the problem, since my training images are small (300 x 300 and smaller) while my test video frames are much larger? I tried training several times with each model and failed each time, and I am at the point where I want to give up.
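To make the size mismatch concrete, here is a rough sketch of the arithmetic. It assumes the SSD model resizes every input to a fixed 300 x 300 (as the stock ssd_mobilenet_v1_coco config does with its fixed_shape_resizer; my excerpt below does not show that part), and it uses a hypothetical 60 x 60 pixel object, since I have not measured the real object size in the video.

```python
def scaled_size(obj_w, obj_h, frame_w, frame_h, net_w=300, net_h=300):
    """Size of an object after the whole frame is resized to net_w x net_h.

    A fixed-shape resize ignores the aspect ratio, so width and height
    are scaled by different factors.
    """
    scale_x = net_w / frame_w
    scale_y = net_h / frame_h
    return obj_w * scale_x, obj_h * scale_y


# Test video frame size from my question; the object size is a made-up example.
FRAME_W, FRAME_H = 1370, 786
OBJ_W, OBJ_H = 60, 60  # hypothetical

w, h = scaled_size(OBJ_W, OBJ_H, FRAME_W, FRAME_H)
print(f"object seen by the network: about {w:.1f} x {h:.1f} px")
```

Under these assumptions the object shrinks to roughly 13 x 23 pixels and is also distorted, whereas in my training images the object fills a much larger share of a 300 x 300 image.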
Can somebody shed some light on what is happening here? Should I train for more steps? Or should I train on images with dimensions similar to the test frames? Would that help?
Below are the relevant parts of the config file and the labelmap.pbtxt I used.
Config File:
fine_tune_checkpoint: ".../ssd_mobilenet_v1_coco_2017_11_17/model.ckpt"
from_detection_checkpoint: true
num_steps: 200000
data_augmentation_options {
random_horizontal_flip {
data_augmentation_options {
ssd_random_crop {
train_input_reader: {
tf_record_input_reader {
input_path: ".../train.record"
label_map_path: ".../labelmap.pbtxt"
eval_config: {
num_examples: 63
labelmap.pbtxt:
item {
id: 1
name: 'tomato'
– Pirate X