I have to deploy a custom Keras model in AWS SageMaker. I have created a notebook instance and I have the following files:
AmazonSagemaker-Codeset16
-ann
-nginx.conf
-predictor.py
-serve
-train.py
-wsgi.py
-Dockerfile
I then open the AWS terminal, build the Docker image, and push the image to the ECR repository. Next I open a new Jupyter Python notebook and try to fit and deploy the model. The training completes correctly, but while deploying I get the following error:
"Error hosting endpoint sagemaker-example-2019-10-25-06-11-22-366: Failed. >Reason: The primary container for production variant AllTraffic did not pass >the ping health check. Please check CloudWatch logs for this endpoint..."
When I check the logs, I find the following:
2019/11/11 11:53:32 [crit] 19#19: *3 connect() to unix:/tmp/gunicorn.sock failed (2: No such file or directory) while connecting to upstream, client: 10.32.0.4, server: , request: "GET /ping HTTP/1.1", upstream: "http://unix:/tmp/gunicorn.sock:/ping", host: "model.aws.local:8080"
and
Traceback (most recent call last):
  File "/usr/local/bin/serve", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python2.7/dist-packages/sagemaker_containers/cli/serve.py", line 19, in main
    server.start(env.ServingEnv().framework_module)
  File "/usr/local/lib/python2.7/dist-packages/sagemaker_containers/_server.py", line 107, in start
    module_app,
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
When I build and deploy the same model with these same files from my local computer, the deployment succeeds; it is only when I do everything from inside AWS that I face this problem.
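For reference, this is roughly how I create and fit the estimator in the notebook (the image URI, IAM role, and S3 path shown here are placeholders, not my real values):

import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.predictor import csv_serializer

role = sagemaker.get_execution_role()

# Placeholder image URI: substitute the account id, region, and tag of
# the image pushed to ECR.
image = '<account-id>.dkr.ecr.<region>.amazonaws.com/amazonsagemaker-codeset16:latest'

classifier = Estimator(image,
                       role,
                       train_instance_count=1,
                       train_instance_type='ml.m4.xlarge')

# This step completes without errors.
classifier.fit('s3://<bucket>/<training-data-prefix>')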
Here is my serve file code:
from __future__ import print_function

import multiprocessing
import os
import signal
import subprocess
import sys

cpu_count = multiprocessing.cpu_count()

model_server_timeout = os.environ.get('MODEL_SERVER_TIMEOUT', 60)
model_server_workers = int(os.environ.get('MODEL_SERVER_WORKERS', cpu_count))


def sigterm_handler(nginx_pid, gunicorn_pid):
    try:
        os.kill(nginx_pid, signal.SIGQUIT)
    except OSError:
        pass
    try:
        os.kill(gunicorn_pid, signal.SIGTERM)
    except OSError:
        pass

    sys.exit(0)


def start_server():
    print('Starting the inference server with {} workers.'.format(model_server_workers))

    # link the log streams to stdout/err so they will be logged to the container logs
    subprocess.check_call(['ln', '-sf', '/dev/stdout', '/var/log/nginx/access.log'])
    subprocess.check_call(['ln', '-sf', '/dev/stderr', '/var/log/nginx/error.log'])

    nginx = subprocess.Popen(['nginx', '-c', '/opt/ml/code/nginx.conf'])
    gunicorn = subprocess.Popen(['gunicorn',
                                 '--timeout', str(model_server_timeout),
                                 '-b', 'unix:/tmp/gunicorn.sock',
                                 '-w', str(model_server_workers),
                                 'wsgi:app'])

    signal.signal(signal.SIGTERM, lambda a, b: sigterm_handler(nginx.pid, gunicorn.pid))

    # If either subprocess exits, so do we.
    pids = set([nginx.pid, gunicorn.pid])
    while True:
        pid, _ = os.wait()
        if pid in pids:
            break

    sigterm_handler(nginx.pid, gunicorn.pid)
    print('Inference server exiting')


# The main routine just invokes the start function.
if __name__ == '__main__':
    start_server()
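gunicorn is pointed at wsgi:app, and my wsgi.py follows the AWS bring-your-own-container sample: it only re-exports the Flask app defined in predictor.py (a minimal sketch):

import predictor as myapp

# gunicorn loads this module as "wsgi:app", so it just needs to expose
# the Flask app object from predictor.py under the name "app".
app = myapp.app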
I deploy the model using the following:
predictor = classifier.deploy(1, 'ml.t2.medium', serializer=csv_serializer)
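Since it is the /ping health check that fails, for completeness here is the shape of that route in my predictor.py (a sketch following the AWS sample; the actual Keras model loading is omitted and replaced with a placeholder):

import flask

app = flask.Flask(__name__)

# Placeholder for the Keras model, which in the real predictor.py is
# loaded at import time.
model = None

@app.route('/ping', methods=['GET'])
def ping():
    # SageMaker calls GET /ping through nginx on port 8080; returning
    # 200 marks the container as healthy.
    health = model is not None
    status = 200 if health else 404
    return flask.Response(response='\n', status=status, mimetype='application/json')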
Kindly let me know what mistake I am making while deploying.