After creating the Amazon S3 bucket my_bucket, I created an Elastic MapReduce (EMR) cluster via the CLI:
aws emr create-cluster --name "Hive testing" --ami-version 3.3 --applications Name=Hive --use-default-roles --instance-type m3.xlarge --instance-count 3 --steps Type=Hive,Name="Hive Program",Args=[-d,INPUT=s3://my_bucket/input,-d,OUTPUT=s3://my_bucket/input,-d,LIBS=s3://my_bucket/serde_libs]
Note that I did not specify a Hive *.q file. Once the S3 bucket and the EMR cluster exist, I plan to log onto the EMR box and run hive interactively.
Note: I'm assuming there's an EMR box that I can log onto.
However, when I ran aws emr describe-cluster --cluster-id XYZ, I saw this error in the output:
"State": "TERMINATED_WITH_ERRORS",
"StateChangeReason": {
    "Message": "EMR service role arn:aws:iam::xyz:role/EMR_DefaultRole is invalid",
    "Code": "VALIDATION_ERROR"
}
What would cause this error? Do I need to open permissions on the S3 bucket for the EMR cluster to access it?
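For anyone trying to reproduce this, here is a minimal sketch of how the failure reason and the role named in the message could be inspected from the CLI. The helper names emr_state_reason and emr_role_exists are hypothetical, XYZ is the placeholder cluster ID used above, and the role name is taken from the error message. It assumes the AWS CLI is installed and configured with credentials that may call emr:DescribeCluster and iam:GetRole.

```shell
# Hypothetical helper: print only the state-change reason for a cluster,
# instead of scanning the full describe-cluster output.
emr_state_reason() {
  aws emr describe-cluster --cluster-id "$1" \
    --query 'Cluster.Status.StateChangeReason' --output json
}

# Hypothetical helper: succeed (exit 0) if the named IAM role can be fetched.
# A non-zero exit means the role is missing or not readable with these credentials.
emr_role_exists() {
  aws iam get-role --role-name "$1" >/dev/null 2>&1
}
```

Usage would be along the lines of `emr_state_reason XYZ` followed by `emr_role_exists EMR_DefaultRole || echo "role not found"`. This only narrows down whether the role exists in the account; it does not by itself say anything about the S3 bucket permissions.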