2
votes

I have been looking at various posts about deploying SageMaker models locally, but they all tie prediction/serving to an AWS notebook instance (via the AWS SageMaker Python SDK). This defeats the intent of running a SageMaker-trained model fully offline. Others have tried unpickling the tar.gz file on S3 and wrapping the contents for local deployment, but that process seems restricted to certain model types such as XGBoost and MXNet. Is there any way to deploy a SageMaker-trained model offline, without a dependency on a SageMaker notebook instance? Any advice would be appreciated. Thank you.

2
Are you asking about SageMaker built-in algorithms, or about using your own framework (TF, PyTorch, ...) in SageMaker? - Gili Nachum
Hi Gili Nachum. I am referring to both SageMaker built-in algorithms and my own framework in SageMaker. Can we deploy/serve a model trained on SageMaker completely offline, without any dependency on AWS after training? If not, what are the limitations? - Zach
@Zach did you find any relevant solution/blog for this? - Srini
@Srini not at the moment, still searching for the answer as well - Zach

2 Answers

1
votes

I've deployed PyTorch models locally via Amazon SageMaker Local Mode. I believe the same process works for other ML frameworks that have official SageMaker containers. You can run the same Docker containers locally that SageMaker uses when deploying your model on AWS infrastructure.

The docs for deploying a SageMaker endpoint locally for inference are a bit scattered. A summary:

  1. Use local versions of API clients: normally, you use botocore.client.SageMaker and botocore.client.SageMakerRuntime classes to use SageMaker from Python. To use SageMaker locally, use sagemaker.local.LocalSagemakerClient() and sagemaker.local.LocalSagemakerRuntimeClient() instead.
  2. You can use a local tar.gz model file if you wish.
  3. Set the instance_type to "local" when deploying the model, as sketched after this list.
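
A minimal sketch of what that looks like for a PyTorch model; the artifact path, entry-point script, role ARN, and framework versions below are assumptions you would adapt to your own setup:

```python
# Minimal Local Mode sketch, assuming a model artifact at ./model.tar.gz and an
# inference script ./inference.py (both placeholder paths).
from sagemaker.pytorch import PyTorchModel

model = PyTorchModel(
    model_data="file://./model.tar.gz",                 # a local tar.gz instead of an S3 URI
    role="arn:aws:iam::111111111111:role/dummy-role",   # placeholder; Local Mode does not call AWS
    entry_point="inference.py",
    framework_version="1.13",                           # pick versions matching your training setup
    py_version="py39",
)

# instance_type="local" makes the SDK run the official SageMaker PyTorch
# inference container in Docker on this machine instead of on AWS.
predictor = model.deploy(initial_instance_count=1, instance_type="local")

import numpy as np
result = predictor.predict(np.zeros((1, 3, 224, 224), dtype="float32"))  # example input; adjust to your model
predictor.delete_endpoint()   # stops the local container
```
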

I wrote How to setup a local AWS SageMaker environment for PyTorch, which goes into detail on how this works.

0
votes

Once you have trained a model using Amazon SageMaker, you'll have a Model entry that points to a model artifact in S3. This tar.gz file holds the model weights; its format depends on the framework (TensorFlow/PyTorch/MXNet/...) you used to train the model. If you used SageMaker built-in algorithms, most of them are implemented with MXNet or XGBoost, so you can use the relevant model-serving software to run the model.
If you need serving software, you can run the SageMaker deep learning containers in inference mode on your own local server, use open-source serving software such as TensorFlow Serving, or simply load the model in memory, as sketched below.
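
For the load-it-in-memory route, a minimal sketch assuming a PyTorch state_dict inside the artifact; the bucket name, object key, and file names are placeholders:

```python
# Download the model artifact from S3, unpack it, and load the weights locally.
import tarfile
import boto3
import torch

s3 = boto3.client("s3")
s3.download_file("my-bucket", "my-training-job/output/model.tar.gz", "model.tar.gz")

with tarfile.open("model.tar.gz") as tar:
    tar.extractall("model_dir")   # unpack the weights and any auxiliary files

# Rebuild the network architecture you trained (placeholder below), then load the
# weights; from this point on there is no SageMaker or AWS dependency at all.
net = torch.nn.Sequential(torch.nn.Linear(10, 1))   # replace with your real model class
net.load_state_dict(torch.load("model_dir/model.pth", map_location="cpu"))
net.eval()
```
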