I am working through the AWS SageMaker notebook example "Inference Pipeline with Scikit-learn and Linear Learner", and I run into an issue when it comes to fitting the SKLearn model.
The code in the example is:
from sagemaker.sklearn.estimator import SKLearn

script_path = 'sklearn_abalone_featurizer.py'

sklearn_preprocessor = SKLearn(
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session)

sklearn_preprocessor.fit({'train': train_input})
When I run this, I get an error:
ClientError: An error occurred (AccessDenied) when calling the CreateBucket operation: Access Denied
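My suspicion is that the error comes from the SDK trying to create its default bucket rather than from the training job itself. A quick way to check which bucket the session would use (a sketch, assuming a plain sagemaker.Session; I believe calling default_bucket() may also attempt to create the bucket if it does not exist):

import sagemaker

# The SDK derives a default bucket name of the form
# sagemaker-<region>-<account-id> for the session.
sagemaker_session = sagemaker.Session()
print(sagemaker_session.default_bucket())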
So I changed the sklearn_preprocessor to:
sklearn_preprocessor = SKLearn(
    output_path='s3://{}/{}/model'.format(s3_bucket, prefix),
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session)
where s3_bucket is the name of my bucket and prefix is a path within it.
But SKLearn still tries to create a bucket, even though mine already exists. When I fit one of AWS's built-in models using the same output_path, it works fine. Is there a way to solve this problem without changing the authorization policy?
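One workaround I am considering (a sketch, assuming my SDK version accepts the default_bucket argument on sagemaker.Session) is to build the session around my existing bucket, so the SDK never needs to create one:

import sagemaker

# Hypothetical workaround: point the session at an existing bucket so the
# SDK does not fall back to creating sagemaker-<region>-<account-id>.
sagemaker_session = sagemaker.Session(default_bucket=s3_bucket)

sklearn_preprocessor = SKLearn(
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session,
    output_path='s3://{}/{}/model'.format(s3_bucket, prefix))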
EDIT: I edited the role of my notebook instance so the training could run, but it still created a bucket ("INFO:sagemaker:Created S3 bucket: sagemaker-eu-west-1-************") in which it saved the model artifact. How can I force it to save the artifact in a given bucket?
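For what it's worth, my current understanding is that output_path only controls where the model artifact (model.tar.gz) is written, while the entry_point script is uploaded to the session's default bucket unless code_location is set. A sketch of pinning both to my own bucket (assuming the SKLearn estimator accepts code_location, as other Framework estimators do):

sklearn_preprocessor = SKLearn(
    entry_point=script_path,
    role=role,
    train_instance_type="ml.c4.xlarge",
    sagemaker_session=sagemaker_session,
    # where the trained model artifact should be written
    output_path='s3://{}/{}/model'.format(s3_bucket, prefix),
    # where the source tarball is uploaded; without this, I believe the
    # SDK falls back to the session's default bucket
    code_location='s3://{}/{}/code'.format(s3_bucket, prefix))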