I'm trying to train my own detector model based on the TensorFlow sample and this post. I succeeded in training locally on a MacBook Pro. The problem is that I don't have a GPU, and training on the CPU is too slow (about 25 s per iteration).
So I'm trying to run the training on Google Cloud ML Engine following the tutorial, but I can't get it to run properly.
My folder structure is described below:
+ data
  - train.record
  - test.record
+ models
  + train
  + eval
+ training
  - ssd_mobilenet_v1_coco
My steps to change from local training to Google Cloud training were:
- Create a bucket in Google Cloud storage and copy my local folder structure with files;
- Edit my pipeline.config file and change all paths from Users/dev/detector/ to gcc://bucketname/;
- Create a YAML file with the default configuration provided in the tutorial;
- Run:

    gcloud ml-engine jobs submit training object_detection_`date +%s` \
        --job-dir=gs://bucketname/models/train \
        --packages dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz \
        --module-name object_detection.train \
        --region us-east1 \
        --config /Users/dev/detector/training/cloud.yml \
        -- \
        --train_dir=gs://bucketname/models/train \
        --pipeline_config_path=gs://bucketname/data/pipeline.config
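
For reference, the path rewrite and the job submission from the steps above could be sketched in Python roughly as follows. This is only an illustration, not what the tutorial prescribes: the helper names `rewrite_paths` and `build_submit_command` are hypothetical, the sample `input_path` line is made up for the demo, and the bucket prefix is assumed to use the `gs://` scheme, as in the command. The flags are copied from the command above, and `int(time.time())` stands in for `date +%s`.

```python
import time


def rewrite_paths(config_text, local_prefix, bucket_prefix):
    """Replace every occurrence of local_prefix with bucket_prefix.

    Hypothetical helper: a plain string substitution over the text of
    pipeline.config, nothing specific to the protobuf text format.
    """
    return config_text.replace(local_prefix, bucket_prefix)


def build_submit_command(bucket, config_path, now=None):
    """Return the gcloud argument list for the training job described above.

    Hypothetical helper: it only assembles arguments and does not run gcloud.
    """
    # int(time.time()) plays the role of `date +%s` in the shell command.
    timestamp = int(time.time()) if now is None else int(now)
    train_dir = "gs://{}/models/train".format(bucket)
    return [
        "gcloud", "ml-engine", "jobs", "submit", "training",
        "object_detection_{}".format(timestamp),
        "--job-dir=" + train_dir,
        "--packages",
        "dist/object_detection-0.1.tar.gz,slim/dist/slim-0.1.tar.gz",
        "--module-name", "object_detection.train",
        "--region", "us-east1",
        "--config", config_path,
        "--",
        "--train_dir=" + train_dir,
        "--pipeline_config_path=gs://{}/data/pipeline.config".format(bucket),
    ]


if __name__ == "__main__":
    # Illustrative config line; the real pipeline.config has many more fields.
    sample = 'input_path: "/Users/dev/detector/data/train.record"'
    print(rewrite_paths(sample, "/Users/dev/detector/", "gs://bucketname/"))
    print(" ".join(build_submit_command(
        "bucketname", "/Users/dev/detector/training/cloud.yml")))
```

The list returned by `build_submit_command` could be passed to `subprocess.call` from the directory that contains the `dist/` and `slim/dist/` packages; printing it first makes it easy to compare against the shell command.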
Doing so gives me the following error message from the MLUnits:
    The replica ps 0 exited with a non-zero status of 1. Termination reason: Error.
    Traceback (most recent call last):
      File "/usr/lib/python2.7/runpy.py", line 162, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
        exec code in run_globals
      File "/root/.local/lib/python2.7/site-packages/object_detection/train.py", line 49, in <module>
        from object_detection import trainer
      File "/root/.local/lib/python2.7/site-packages/object_detection/trainer.py", line 27, in <module>
        from object_detection.builders import preprocessor_builder
      File "/root/.local/lib/python2.7/site-packages/object_detection/builders/preprocessor_builder.py", line 21, in <module>
        from object_detection.protos import preprocessor_pb2
      File "/root/.local/lib/python2.7/site-packages/object_detection/protos/preprocessor_pb2.py", line 71, in <module>
        options=None, file=DESCRIPTOR),
    TypeError: __new__() got an unexpected keyword argument 'file'
Thanks in advance.