I'm trying to deploy a scikit-learn model on Amazon SageMaker by working through the example in their documentation, and I get the above error when I deploy the model.

I'm following the instructions provided in this notebook, and so far have just copied and pasted the code that they have.

This is the exact code I have in my Jupyter notebook:

# S3 prefix
prefix = 'Scikit-iris'

import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

# Get a SageMaker-compatible role used by this Notebook Instance.
role = get_execution_role()

import numpy as np
import os
from sklearn import datasets

# Load Iris dataset, then join labels and features
iris = datasets.load_iris()
joined_iris = np.insert(iris.data, 0, iris.target, axis=1)

# Create directory and write csv
os.makedirs('./iris', exist_ok=True)
np.savetxt('./iris/iris.csv', joined_iris, delimiter=',', fmt='%1.1f, %1.3f, %1.3f, %1.3f, %1.3f')

WORK_DIRECTORY = 'data'

train_input = sagemaker_session.upload_data(WORK_DIRECTORY, key_prefix="{}/{}".format(prefix, WORK_DIRECTORY) )

from sagemaker.sklearn.estimator import SKLearn

script_path = 'scikit_learn_iris.py'

sklearn = SKLearn(
  entry_point=script_path,
  train_instance_type="ml.c4.xlarge",
  role=role,
  sagemaker_session=sagemaker_session,
  framework_version='0.20.0',
  hyperparameters={'max_leaf_nodes': 30})

sklearn.fit({'train': train_input})

sklearn.deploy(instance_type='ml.m4.xlarge',
               initial_instance_count=1)

And at that point I get the error message.

The contents of 'scikit_learn_iris.py' look like this:

import argparse
import pandas as pd
import os
import numpy as np

from sklearn import tree
from sklearn.externals import joblib

if __name__ == '__main__':
    parser = argparse.ArgumentParser()

    # Hyperparameters are described here. In this simple example we are just including one hyperparameter.
    parser.add_argument('--max_leaf_nodes', type=int, default=-1)

    # SageMaker specific arguments. Defaults are set in the environment variables.
    parser.add_argument('--output-data-dir', type=str, default=os.environ['SM_OUTPUT_DATA_DIR'])
    parser.add_argument('--model-dir', type=str, default=os.environ['SM_MODEL_DIR'])
    parser.add_argument('--train', type=str, default=os.environ['SM_CHANNEL_TRAIN'])

    args = parser.parse_args()

    # Take the set of files and read them all into a single pandas dataframe
    input_files = [ os.path.join(args.train, file) for file in os.listdir(args.train) ]
    if len(input_files) == 0:
        raise ValueError(('There are no files in {}.\n' +
                          'This usually indicates that the channel ({}) was incorrectly specified,\n' +
                          'the data specification in S3 was incorrectly specified or the role specified\n' +
                          'does not have permission to access the data.').format(args.train, "train"))
    raw_data = [ pd.read_csv(file, header=None, engine="python") for file in input_files ]
    train_data = pd.concat(raw_data)

    # labels are in the first column
    train_y = train_data.ix[:,0].astype(np.int)
    train_X = train_data.ix[:,1:]

    # We determine the number of leaf nodes using the hyper-parameter above.
    max_leaf_nodes = args.max_leaf_nodes

    # Now use scikit-learn's decision tree classifier to train the model.
    clf = tree.DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes)
    clf = clf.fit(train_X, train_y)

    # Save the decision tree model.
    joblib.dump(clf, os.path.join(args.model_dir, "model.joblib"))

My CloudWatch logs look like this:

[screenshot of the CloudWatch logs; image not available]

1 Answer

Based on the error from the CloudWatch logs, the script is missing the model_fn definition as provided in the notebook. I've repeated the function here for convenience:

def model_fn(model_dir):
    return joblib.load(os.path.join(model_dir, "model.joblib"))

Try appending this to the bottom of your script and re-running your notebook.
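If you want to sanity-check the script before kicking off another training job, here is a minimal sketch of a local check. It assumes the serving container imports your entry-point script as a module and looks up model_fn on it, so the function has to sit at the top level of the file rather than inside the `if __name__ == '__main__':` block. The helper name `defines_model_fn` and the two stand-in scripts are my own, for illustration only; they are not part of the SageMaker SDK.

```python
import ast


def defines_model_fn(script_source: str) -> bool:
    """Return True if the source defines a top-level function named model_fn.

    Hypothetical helper for illustration; not part of any SageMaker API.
    """
    tree = ast.parse(script_source)
    # tree.body holds only module-level statements, so a def nested inside
    # an `if __name__ == '__main__':` block is deliberately not counted.
    return any(
        isinstance(node, ast.FunctionDef) and node.name == "model_fn"
        for node in tree.body
    )


# Two tiny stand-in scripts (not the real training script).
without_hook = "import os\n\nif __name__ == '__main__':\n    pass\n"
with_hook = without_hook + (
    "\n\ndef model_fn(model_dir):\n"
    "    return joblib.load(os.path.join(model_dir, 'model.joblib'))\n"
)

print(defines_model_fn(without_hook))  # False
print(defines_model_fn(with_hook))     # True
```

To run it against your own file, pass in `open('scikit_learn_iris.py').read()`. The check only parses the source, so it does not need sklearn or the SageMaker environment variables to be present.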