9
votes

I am trying to train the TensorFlow Object Detection API on my dataset containing apples and capsicum. For that, I generated the required files (TFRecords and images with annotations) and placed them in the models/research/object_detection directory. Then I forked the Object Detection API from GitHub and pushed my files to my forked repo. Next, I cloned this repo inside Google Colaboratory and ran the train.py file, but I got a DuplicateFlagError for the 'master' flag.

---------------------------------------------------------------------------

DuplicateFlagError               Traceback (most recent call last)
/content/models/research/object_detection/train.py in <module>()
     56 
     57 flags = tf.app.flags
---> 58 flags.DEFINE_string('master', '', 'Name of the TensorFlow master to use.')
     59 flags.DEFINE_integer('task', 0, 'task id')
     60 flags.DEFINE_integer('num_clones', 1, 'Number of clones to deploy per worker.')

/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/flags.py in wrapper(*args, **kwargs)
     56           'Use of the keyword argument names (flag_name, default_value, '
     57           'docstring) is deprecated, please use (name, default, help) instead.')
---> 58     return original_function(*args, **kwargs)
     59 
     60   return tf_decorator.make_decorator(original_function, wrapper)

/usr/local/lib/python3.6/dist-packages/absl/flags/_defines.py in DEFINE_string(name, default, help, flag_values, **args)
    239   parser = _argument_parser.ArgumentParser()
    240   serializer = _argument_parser.ArgumentSerializer()
--> 241   DEFINE(parser, name, default, help, flag_values, serializer, **args)
    242 
    243 

/usr/local/lib/python3.6/dist-packages/absl/flags/_defines.py in DEFINE(parser, name, default, help, flag_values, serializer, module_name, **args)
     80   """
     81   DEFINE_flag(_flag.Flag(parser, serializer, name, default, help, **args),
---> 82               flag_values, module_name)
     83 
     84 

/usr/local/lib/python3.6/dist-packages/absl/flags/_defines.py in DEFINE_flag(flag, flag_values, module_name)
    102   # Copying the reference to flag_values prevents pychecker warnings.
    103   fv = flag_values
--> 104   fv[flag.name] = flag
    105   # Tell flag_values who's defining the flag.
    106   if module_name:

/usr/local/lib/python3.6/dist-packages/absl/flags/_flagvalues.py in __setitem__(self, name, flag)
    425         # module is simply being imported a subsequent time.
    426         return
--> 427       raise _exceptions.DuplicateFlagError.from_flag(name, self)
    428     short_name = flag.short_name
    429     # If a new flag overrides an old one, we need to cleanup the old flag's

DuplicateFlagError: The flag 'master' is defined twice. First from object_detection/train.py, Second from object_detection/train.py.  Description from first occurrence: Name of the TensorFlow master to use.

To solve it, I tried commenting out that line, but then I got a DuplicateFlagError on the next flag, i.e. on the next line. So I commented out all the lines in train.py that declare those flags, i.e. lines 58 to 82. But then I got the error NotFoundError: ;

---------------------------------------------------------------------------
NotFoundError                             Traceback (most recent call last)
/content/models/research/object_detection/train.py in <module>()
    165 
    166 if __name__ == '__main__':
--> 167   tf.app.run()

/usr/local/lib/python3.6/dist-packages/tensorflow/python/platform/app.py in run(main, argv)
    124   # Call the main function, passing through any arguments
    125   # to the final program.
--> 126   _sys.exit(main(argv))
    127 
    128 

/content/models/research/object_detection/train.py in main(_)
    105                            ('input.config', FLAGS.input_config_path)]:
    106         tf.gfile.Copy(config, os.path.join(FLAGS.train_dir, name),
--> 107                       overwrite=True)
    108 
    109   model_config = configs['model']

/usr/local/lib/python3.6/dist-packages/tensorflow/python/lib/io/file_io.py in copy(oldpath, newpath, overwrite)
    390   with errors.raise_exception_on_not_ok_status() as status:
    391     pywrap_tensorflow.CopyFile(
--> 392         compat.as_bytes(oldpath), compat.as_bytes(newpath), overwrite, status)
    393 
    394 

/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/errors_impl.py in __exit__(self, type_arg, value_arg, traceback_arg)
    514             None, None,
    515             compat.as_text(c_api.TF_Message(self.status.status)),
--> 516             c_api.TF_GetCode(self.status.status))
    517     # Delete the underlying status object from memory otherwise it stays alive
    518     # as there is a reference to status from this from the traceback due to

NotFoundError: ; No such file or directory

How should I solve it? This is my Colab notebook - https://drive.google.com/file/d/1mZGOKX3JZXyG4XYkI6WHIXoNbRSpkE_F/view?usp=sharing

2 Answers

27
votes
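# Delete all flags before declaring them again

import tensorflow as tf

def del_all_flags(FLAGS):
    # _flags() is a private absl method that returns the dict of registered flags.
    flags_dict = FLAGS._flags()
    # Copy the names first so the dict is not modified while iterating over it.
    keys_list = list(flags_dict)
    for key in keys_list:
        delattr(FLAGS, key)

del_all_flags(tf.flags.FLAGS)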
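
To see why clearing the flags helps, here is a toy sketch of the mechanism. It assumes the error comes from train.py being executed more than once in the same notebook kernel, which is what the "First from object_detection/train.py, Second from object_detection/train.py" message suggests. ToyFlagRegistry and define_training_flags are made-up stand-ins for illustration, not absl's real implementation.

```python
class DuplicateFlagError(Exception):
    """Raised when a flag name is registered a second time."""


class ToyFlagRegistry:
    """Toy stand-in for a process-wide flag registry (NOT absl itself)."""

    def __init__(self):
        self._flags = {}

    def define_string(self, name, default, help_text):
        # Same idea as the real check: a name may only be registered once.
        if name in self._flags:
            raise DuplicateFlagError("The flag %r is defined twice." % name)
        self._flags[name] = (default, help_text)

    def names(self):
        # Return a copy so callers can delete while looping over it.
        return list(self._flags)

    def delete(self, name):
        del self._flags[name]


def del_all_flags(registry):
    # Remove every registered flag so they can be declared again.
    for name in registry.names():
        registry.delete(name)


def define_training_flags(registry):
    # Plays the role of the module-level DEFINE_* calls in a script.
    registry.define_string('master', '', 'Name of the TensorFlow master to use.')


FLAGS = ToyFlagRegistry()         # lives as long as the notebook kernel
define_training_flags(FLAGS)      # first run: fine

try:
    define_training_flags(FLAGS)  # second run in the same kernel
    second_run_failed = False
except DuplicateFlagError:
    second_run_failed = True

del_all_flags(FLAGS)              # clear the registry ...
define_training_flags(FLAGS)      # ... and the definitions succeed again
```

Commenting out the DEFINE lines instead removes the definitions the rest of the script relies on, which is why clearing the registry is the safer of the two workarounds.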
0
votes

After going through your Colab notebook and your modified fork of the tensorflow/models GitHub repository, here's how I got it working on my local machine.

I installed the latest TensorFlow version, 1.6, which is the same as the one on Google Colab.

  1. The path you specified in ssd_mobilenet_v1_coco.config is the relative path data/object-detection.pbtxt, so execute train.py from the models/research/object_detection directory so that it resolves correctly.

  2. train.py expects --pipeline_config_path as the parameter, but you specified --pipeline_config. If you go through the train.py code, you will see that when --pipeline_config_path is not specified it falls back to separate config files (model.config, train.config, input.config) whose source paths are empty by default. Copying from an empty path is what produces NotFoundError: ; No such file or directory.
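
To make point 2 concrete, here is a simplified paraphrase of that fallback, pieced together from the traceback in the question. The function name config_files_to_copy and the 'pipeline.config' destination name are my own assumptions for illustration; this is not the actual train.py source.

```python
def config_files_to_copy(pipeline_config_path='', model_config_path='',
                         train_config_path='', input_config_path=''):
    """Return (destination_name, source_path) pairs to copy into train_dir.

    Simplified sketch of the branch in train.py, not the real code.
    """
    if pipeline_config_path:
        # Assumed destination name for the single-file case.
        return [('pipeline.config', pipeline_config_path)]
    # Fallback: three separate config files, each with an empty default path.
    return [('model.config', model_config_path),
            ('train.config', train_config_path),
            ('input.config', input_config_path)]


# A misspelled flag such as --pipeline_config leaves pipeline_config_path
# empty, so every source path in the fallback is '' and the copy has
# nothing to read.
pairs = config_files_to_copy()
empty_sources = [name for name, src in pairs if not src]
```

The empty source path is also why the error message shows nothing before the semicolon.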

So the final command should be like this:

ubuntu@Himanshu:~/Desktop/models/research/object_detection$ python train.py --logtostderr --train_dir=training --pipeline_config_path=training/ssd_mobilenet_v1_coco.config
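
If you want to launch the same command from Python (for example from a notebook cell), a minimal sketch could look like this. build_train_command is a hypothetical helper of mine, not part of the Object Detection API; it only assembles the flags shown above and checks that the config file exists before you start.

```python
import os
import sys


def build_train_command(object_detection_dir, train_dir, pipeline_config):
    """Return (argv, cwd) for running train.py from object_detection_dir.

    Hypothetical helper. train_dir and pipeline_config are paths relative
    to object_detection_dir, as in the shell command above.
    """
    config_file = os.path.join(object_detection_dir, pipeline_config)
    if not os.path.isfile(config_file):
        # Fail early with a readable message instead of a bare NotFoundError.
        raise FileNotFoundError("pipeline config not found: " + config_file)
    argv = [
        sys.executable, 'train.py',
        '--logtostderr',
        '--train_dir=' + train_dir,
        '--pipeline_config_path=' + pipeline_config,
    ]
    # cwd matters because the config refers to data/... by relative path.
    return argv, object_detection_dir
```

You would then run it with something like subprocess.run(argv, cwd=cwd, check=True), which starts a fresh interpreter and so also avoids defining the flags twice in one process.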

  3. Since I had installed TensorFlow 1.6, I then ran into the same error as mentioned here: __init__() got an unexpected keyword argument 'dct_method'

As the comment in the above link suggests: Remove dct_method=dct_method in object_detection/data_decoders/tf_example_decoder.py around line 109.

Hope this helps.