I am able to train my model using the SageMaker TensorFlow container.
Below is the code:
import sagemaker
from sagemaker.tensorflow import TensorFlow

model_dir = '/opt/ml/model'
train_instance_type = 'ml.c4.xlarge'
hyperparameters = {'epochs': 10, 'batch_size': 256, 'learning_rate': 0.001}

# Script-mode estimator that runs model.py in the TensorFlow 1.12 container
script_mode_estimator = TensorFlow(
    entry_point='model.py',
    train_instance_type=train_instance_type,
    train_instance_count=1,
    model_dir=model_dir,
    hyperparameters=hyperparameters,
    role=sagemaker.get_execution_role(),
    base_job_name='tf-fashion-mnist',
    framework_version='1.12.0',
    py_version='py3',
    output_path='s3://my_bucket/testing',
    script_mode=True
)
Model Fitting:
script_mode_estimator.fit(inputs)
But when I try to deploy the model, I get the error below.
Deploy code is:
script_mode_d = script_mode_estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m4.xlarge",
)
Error is:
UnexpectedStatusException: Error hosting endpoint tf-fashion-mnist-2020-09-23-09-05-25-791: Failed. Reason: The role 'xyz' does not have BatchGetImage permission for the image: '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow-serving:1.12-cpu'.
Please help me to resolve this issue.
xyz and add BatchGetImage permission to it. – Marcin
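As a rough illustration of adding that permission, here is a minimal sketch using boto3. It assumes you have IAM rights to edit the role, that the role name really is `xyz` (the placeholder from the question), and that an inline policy is acceptable in your account. The policy name `ecr-pull-tf-serving` and both helper functions are hypothetical names chosen for this example. The repository ARN is derived from the image URI in the error message. Image pulls generally also need `ecr:GetDownloadUrlForLayer`, `ecr:BatchCheckLayerAvailability`, and `ecr:GetAuthorizationToken`, so the sketch includes them alongside `ecr:BatchGetImage`. Whether this alone resolves the deployment error depends on the rest of the role's setup.

```python
import json


def build_ecr_pull_policy(repository_arn):
    """Return an IAM policy document that allows pulling images from one ECR repository."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Image pull actions, scoped to the repository named in the error.
                "Effect": "Allow",
                "Action": [
                    "ecr:BatchGetImage",
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchCheckLayerAvailability",
                ],
                "Resource": repository_arn,
            },
            {
                # GetAuthorizationToken does not support resource-level scoping.
                "Effect": "Allow",
                "Action": "ecr:GetAuthorizationToken",
                "Resource": "*",
            },
        ],
    }


def attach_ecr_pull_policy(role_name, repository_arn, policy_name="ecr-pull-tf-serving"):
    """Attach the policy inline to the role (hypothetical helper; needs IAM write access)."""
    import boto3  # imported lazily so the policy builder works without boto3 installed

    iam = boto3.client("iam")
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_ecr_pull_policy(repository_arn)),
    )


# Repository ARN derived from the image URI in the error message.
REPOSITORY_ARN = "arn:aws:ecr:us-east-1:520713654638:repository/sagemaker-tensorflow-serving"

# Uncomment to apply; 'xyz' is the placeholder role name from the question.
# attach_ecr_pull_policy("xyz", REPOSITORY_ARN)
```

After the policy is attached, re-running the `deploy` call from the question should show whether the permission error is gone. Attaching a broader managed policy to the role in the IAM console is another route if inline policies are not preferred.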