2
votes

I cloned tensorflow object detection model on githug: github link
And I want to train this model with my own data (331 samoyed dog's images) following by this blog tutorial click here

enter image description here

My steps:

  1. Created PASCAL VOC format dataset;
  2. download retrained model(ssd_mobilenet_v1_coco_11_06_2017.tar.gz)
  3. change the config file(ssd_mobilenet_v1_pets.config)
  4. initial the training process by this codes:

    python object_detection/train.py \ --logtostderr \ --pipeline_config_path=./samoyed_test_and_train/training/ssd_mobilenet_v1_pets.config \ --train_dir=./samoyed_test_and_train/data/train.record


but I receive errors, my os is MacOS,and I tried on AWS,same problem occurs, can you figured out my mistakes ?errors:

INFO:tensorflow:Summary name Learning Rate is illegal; using Learning_Rate instead.
WARNING:tensorflow:From /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/meta_architectures/ssd_meta_arch.py:579: all_variables (from tensorflow.python.ops.variables) is deprecated and will be removed after 2017-03-02.
Instructions for updating:
Please use tf.global_variables instead.
INFO:tensorflow:Summary name /clone_loss is illegal; using clone_loss instead.
2017-08-01 10:34:42.992224: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:34:42.992254: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-08-01 10:35:00.359032: I tensorflow/core/common_runtime/simple_placer.cc:675] Ignoring device specification /device:GPU:0 for node 'prefetch_queue_Dequeue' because the input edge from 'prefetch_queue' is a reference connection and already has a device field set to /device:CPU:0
INFO:tensorflow:Restoring parameters from /Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/samoyed_test_and_train/training/model.ckpt
INFO:tensorflow:Error reported to Coordinator: <class 'tensorflow.python.framework.errors_impl.FailedPreconditionError'>, ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f
2017-08-01 10:35:29.556458: E tensorflow/core/util/events_writer.cc:62] Could not open events file: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local: Failed precondition: ./samoyed_test_and_train/data/train.record/events.out.tfevents.1501554929.MacBook-Pro.local
2017-08-01 10:35:29.556480: E tensorflow/core/util/events_writer.cc:95] Write failed because file could not be opened.
Traceback (most recent call last):
  File "object_detection/train.py", line 198, in <module>
    tf.app.run()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "object_detection/train.py", line 194, in main
    worker_job_name, is_chief, FLAGS.train_dir)
  File "/Users/zhaoenpei/Desktop/dabai-robot-arm/experiments/models/object_detection/trainer.py", line 290, in train
    saver=saver)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/contrib/slim/python/slim/learning.py", line 732, in train
    master, start_standard_services=False, config=session_config) as sess:
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 964, in managed_session
    self.stop(close_summary_writer=close_summary_writer)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 792, in stop
    stop_grace_period_secs=self._stop_grace_secs)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/coordinator.py", line 389, in join
    six.reraise(*self._exc_info_to_raise)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 953, in managed_session
    start_standard_services=start_standard_services)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 709, in prepare_or_wait_for_session
    self._write_graph()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/training/supervisor.py", line 612, in _write_graph
    self._logdir, "graph.pbtxt")
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/graph_io.py", line 67, in write_graph
    file_io.atomic_write_string_to_file(path, str(graph_def))
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 418, in atomic_write_string_to_file
    write_string_to_file(temp_pathname, contents)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 305, in write_string_to_file
    f.write(file_content)
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 101, in write
    self._prewrite_check()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/lib/io/file_io.py", line 87, in _prewrite_check
    compat.as_bytes(self.__name), compat.as_bytes(self.__mode), status)
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/Users/zhaoenpei/.virtualenvs/python_virtual_1/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.FailedPreconditionError: ./samoyed_test_and_train/data/train.record/graph.pbtxt.tmpf4587d1958df43cbaa9a0d7a04199f6f
1

1 Answers

2
votes

the train_dir flag is meant to point at some (typically empty) directory where your training logs and checkpoints will be written during training. For example it could be something like train_dir=/tmp/training_directory. It looks like you are trying to point it at your dataset --- which the config file should already be pointing at.