I am building a time series usecase to automate the preprocess and retrain tasks.At first the data is preprocessed using numpy, pandas, statsmodels etc & later a machine learning algorithm is applied to make predictions. The reason for using inference pipeline is that it reuses the same preprocess code for training and inference. I have checked the examples given by AWS sagemaker team with spark and sci-kit learn. In both the examples they use a sci-kit learn container to fit & transform their preprocess code. Should I also have to create a container which is not needed in my use case as I am not using any sci-kit-learn code?
Can someone give me a custom example of using these pipelines? Any help is appreciated!
Sources looked into:
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/sagemaker-python-sdk/scikit_learn_inference_pipeline https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/inference_pipeline_sparkml_blazingtext_dbpedia