2
votes

I have been looking at various posts about deploying SageMaker models locally, but they all tie prediction/serving to an AWS notebook instance (via the AWS SageMaker Python SDK). This defeats the intent of running a SageMaker-trained model fully offline. Others have tried unpickling the tar.gz file on S3 and wrapping the contents for local deployment, but that process seems restricted to certain model types such as XGBoost and MXNet. Is there any way to deploy a SageMaker-trained model offline, without a dependency on a SageMaker notebook instance? Any advice would be appreciated. Thank you.

2
Are you asking about SageMaker built-in algorithms, or about using your own framework (TF, PyTorch, ...) in SageMaker? - Gili Nachum
Hi Gili Nachum. I am referring to both SageMaker built-in algorithms and my own framework in SageMaker. Can we deploy/serve a model trained on SageMaker completely offline, without any dependency on AWS after training? If not, what are the limitations? - Zach
@Zach did you find any relevant solution/blog for this? - Srini
@Srini not at the moment, still searching for the answer as well - Zach

2 Answers

1
votes

I've deployed PyTorch models locally via Amazon SageMaker Local Mode. I believe the same process works for other ML frameworks that have official SageMaker containers. You can run the same Docker containers locally that SageMaker uses when deploying your model on AWS infrastructure.

The docs for deploying a SageMaker endpoint locally for inference are a bit scattered. A summary:

  1. Use local versions of API clients: normally, you use botocore.client.SageMaker and botocore.client.SageMakerRuntime classes to use SageMaker from Python. To use SageMaker locally, use sagemaker.local.LocalSagemakerClient() and sagemaker.local.LocalSagemakerRuntimeClient() instead.
  2. You can use a local tar.gz model file if you wish.
  3. Set the instance_type to "local" when deploying the model, as sketched after this list.
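
A minimal sketch of what that looks like for a PyTorch model; the artifact path, entry-point script, role ARN, and framework versions below are assumptions you would adapt to your own setup:

```python
# Minimal Local Mode sketch, assuming a model artifact at ./model.tar.gz and an
# inference script ./inference.py (both placeholder paths).
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="file://./model.tar.gz",                 # a local tar.gz instead of an S3 URI
    role="arn:aws:iam::111111111111:role/dummy-role",   # placeholder; Local Mode does not call AWS
    entry_point="inference.py",
    framework_version="1.13",                           # pick versions matching your training setup
    py_version="py39",
)

# instance_type="local" makes the SDK run the official SageMaker PyTorch
# inference container in Docker on this machine instead of on AWS.
predictor = model.deploy(initial_instance_count=1, instance_type="local")

import numpy as np
result = predictor.predict(np.zeros((1, 3, 224, 224), dtype="float32"))  # example input; adjust to your model
predictor.delete_endpoint()   # stops the local container
```
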

I wrote How to setup a local AWS SageMaker environment for PyTorch, which goes into detail on how this works.

0
votes

Once you have trained a model using Amazon SageMaker, you'll have a Model entry that points to a model artifact in S3. This tar.gz file holds the model weights; its format depends on the framework (TensorFlow/PyTorch/MXNet/...) you used to train the model. If you used SageMaker built-in algorithms, most of them are implemented with MXNet or XGBoost, so you can use the relevant model-serving software to run the model.
If you need serving software, you can run the SageMaker deep learning containers in inference mode on your own local server, use open-source serving software such as TensorFlow Serving, or simply load the model in memory, as sketched below.
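
For the load-it-in-memory route, a minimal sketch assuming a PyTorch state_dict inside the artifact; the bucket name, object key, and file names are placeholders:

```python
# Download the model artifact from S3, unpack it, and load the weights locally.
import tarfile
import boto3
import torch

s3 = boto3.client("s3")
s3.download_file("my-bucket", "my-training-job/output/model.tar.gz", "model.tar.gz")

with tarfile.open("model.tar.gz") as tar:
    tar.extractall("model_dir")   # unpack the weights and any auxiliary files

# Rebuild the network architecture you trained (placeholder below), then load the
# weights; from this point on there is no SageMaker or AWS dependency at all.
net = torch.nn.Sequential(torch.nn.Linear(10, 1))   # replace with your real model class
net.load_state_dict(torch.load("model_dir/model.pth", map_location="cpu"))
net.eval()
```
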