I'm trying to train a Keras model on Google Cloud ML Engine. I've followed every instruction from here: https://github.com/clintonreece/keras-cloud-ml-engine When I try to run it locally, I get ImportErrors for scikit-learn, and when I try to run it on the cloud, the job fails. I don't think the setup.py file is getting executed. Here are the contents of the setup.py file:
'''Cloud ML Engine package configuration.'''
from setuptools import setup, find_packages

REQUIRED_PACKAGES = ['keras',
                     'pandas',
                     'sklearn',
                     'numpy',
                     'h5py']

setup(name='iris_classifier',
      version='1.0',
      packages=find_packages(),
      include_package_data=True,
      description='IRIS classifier keras model on Cloud ML Engine',
      author='Loonycorn',
      author_email='[email protected]',
      license='MIT',
      install_requires=[REQUIRED_PACKAGES],
      zip_safe=False)
Why are my packages not getting installed?
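One detail that may be worth ruling out (an assumption, not a confirmed cause): install_requires=[REQUIRED_PACKAGES] wraps the list in a second list, so setuptools receives a single element that is itself a list rather than five requirement strings. The setuptools documentation describes install_requires as a string or a list of strings; whether a nested list is tolerated may depend on the setuptools version. A minimal sketch of the difference in shape, where the names nested and flat are only illustrative:

```python
# Same list as in the setup.py above.
REQUIRED_PACKAGES = ['keras', 'pandas', 'sklearn', 'numpy', 'h5py']

# What the setup.py currently passes: a list wrapped in another list.
nested = [REQUIRED_PACKAGES]

# The flat form: a plain list of requirement strings.
flat = list(REQUIRED_PACKAGES)

print(len(nested), type(nested[0]).__name__)  # 1 list
print(len(flat), type(flat[0]).__name__)      # 5 str
```

With the flat form, the call in setup.py would read install_requires=REQUIRED_PACKAGES.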
Here's the command for training:
gcloud ml-engine jobs submit training $JOB_NAME \
    --job-dir $JOB_DIR \
    --runtime-version 1.0 \
    --module-name trainer.iris_classifier \
    --package-path ./trainer \
    --region $REGION \
    -- \
    --train-file gs://$BUCKET_NAME/data/iris.csv
setup.py resides in the root directory, alongside the data folder (which contains the CSV) and the trainer folder (which contains the iris_classifier.py and __init__.py files).
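The layout described above can be checked with a short script. This is only a sketch: the helper name missing_files is hypothetical, and it assumes the package marker inside trainer is named exactly `__init__.py` (double underscores on both sides), which find_packages() needs in order to treat trainer as a package.

```python
import os

# Paths taken from the description above, relative to the project root.
EXPECTED = [
    'setup.py',
    os.path.join('trainer', '__init__.py'),
    os.path.join('trainer', 'iris_classifier.py'),
]


def missing_files(root):
    """Return the expected paths that do not exist as files under root."""
    return [p for p in EXPECTED if not os.path.isfile(os.path.join(root, p))]


if __name__ == '__main__':
    # Run from the project root; an empty list means the layout matches.
    print(missing_files('.'))
```

An empty list only confirms that the files exist where the description says they are; it does not show that setup.py is actually executed on the cloud side.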
Here's the error when the job fails:
{
insertId: "2sbguefffpjr1"
logName: "projects/loonycorn-kerasdeployment/logs/ml.googleapis.com%2Firis_classifier_train_220180511_200703"
receiveTimestamp: "2018-05-11T14:38:18.976968299Z"
resource: {…}
severity: "ERROR"
textPayload: "The replica master 0 exited with a non-zero status of 1."
timestamp: "2018-05-11T14:38:18.976968299Z"
}
I've given the Logs Writer permission to the Cloud ML service account, but this is still all the log output I get.
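Because the cloud log reports only the exit status, the local ImportErrors may be the quicker lead. The sketch below lists which top-level modules cannot be located in the current Python environment. The helper name missing_modules is hypothetical. Note that scikit-learn is imported as sklearn, and that importlib.util.find_spec only checks whether a module can be found, not whether importing it succeeds.

```python
import importlib.util

# Import names (not PyPI names) of the packages listed in setup.py.
MODULES = ['keras', 'pandas', 'sklearn', 'numpy', 'h5py']


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]


if __name__ == '__main__':
    # An empty list means every module was found in this environment.
    print(missing_modules(MODULES))
```

Running this with the same interpreter used for the local training run would show whether the dependencies are absent from that environment or whether the ImportError comes from somewhere else.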