I'm trying to run a training job on SageMaker using Random Forest, and it's giving me this validation error. I'm not sure whether I need to adjust the hyperparameters. I tried changing them, but I still get an error. Here is the full text of the error.

"Failure reason ClientError: Unable to initialize the algorithm. Failed to validate input data configuration. (caused by ValidationError) Caused by: u'FullyReplicated' is not one of [u'ShardedByS3Key'] Failed validating u'enum' in schema[u'properties'][u'train'][u'properties'][u'S3DistributionType']: {u'enum': [u'ShardedByS3Key'], u'type': u'string'} On instance[u'train'][u'S3DistributionType']: u'FullyReplicated'"

I have tried different parameter values, but I still get the same error.

1 Answer

The problem is your input data configuration, not the hyperparameters. The error says the train channel was submitted with S3DistributionType 'FullyReplicated', but the algorithm accepts only 'ShardedByS3Key' on that channel. See the SageMaker RCF input documentation here https://docs.aws.amazon.com/sagemaker/latest/dg/randomcutforest.html#rcf-input_output and an example of Random Cut Forest training instantiation with the Python SDK https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_amazon_algorithms/random_cut_forest/random_cut_forest.ipynb. Also, note that SageMaker does not have a built-in random forest. The SageMaker Random Cut Forest (RCF) is a different algorithm from a random forest: it performs unsupervised anomaly detection.
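
To make the fix concrete, here is a minimal sketch of a train channel entry in the shape that the low-level CreateTrainingJob request expects under InputDataConfig. The function name, the bucket path, and the CSV content type are assumptions for illustration; adapt them to your own job. The only part that comes from the error message is the distribution type, which must be ShardedByS3Key for the train channel.

```python
def rcf_train_channel(s3_uri):
    """Build a 'train' channel entry for CreateTrainingJob's InputDataConfig.

    Hypothetical helper: it only assembles a plain dict and does not call AWS.
    """
    return {
        "ChannelName": "train",
        # Assumed content type for unlabeled CSV input; change it if your
        # data is in RecordIO-protobuf or another supported format.
        "ContentType": "text/csv;label_size=0",
        "DataSource": {
            "S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": s3_uri,
                # The validation error requires this value on the train
                # channel; 'FullyReplicated' is rejected.
                "S3DataDistributionType": "ShardedByS3Key",
            }
        },
    }


# Placeholder S3 location, not a real bucket.
channel = rcf_train_channel("s3://my-bucket/rcf/train/")
print(channel["DataSource"]["S3DataSource"]["S3DataDistributionType"])
```

If you use the SageMaker Python SDK instead of the low-level API, the corresponding setting is the distribution argument of the training input object (TrainingInput in SDK v2), which you pass to fit() under the "train" key.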