2
votes

I am using an AWS EMR cluster to run sentiment analysis on reviews. The cluster's status has been stuck at "STARTING" for hours.

I have done the following steps:

  1. Created an IAM user and attached the AdministratorAccess policy.
  2. Created S3 buckets to hold the input, logs, and output.
  3. Created a cluster with the AWS CLI using the following command:

    aws emr create-cluster --release-label emr-4.1.0 --service-role="EMR_DefaultRole" --ec2-attributes AvailabilityZone=us-west-1a,InstanceProfile="EMR_EC2_DefaultRole" --instance-groups InstanceGroupType=MASTER,InstanceCount=1,InstanceType=m1.medium InstanceGroupType=CORE,InstanceCount=2,InstanceType=m1.medium --name "Yelp Review Sentiment Analysis Cluster" --log-uri s3://mybucket/logs/ --enable-debugging --tags Name=emr --bootstrap-actions Path=s3://mybucket/bootstrap-mrjob.sh,Name="Setup mrjob / text analytics"

The cluster is created, but its status never changes. Are there any steps I missed?

2
The emr command doesn't (and didn't in 2017) create the cluster. It "submits" a request to provision the cluster. The cluster is "created" once you get the public IP address in the dashboard. It can take a few minutes before even a Failed status shows up. - HoofarLotusX

2 Answers

0
votes

You may find hints in the "Events" tab of your cluster info page. It is also worth investigating the logs (which you hopefully enabled with --log-uri); they contain detailed per-node information on the node launch (in ./node) and the bootstrap actions (in ./steps).
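The same information can also be pulled from the command line. The following is a minimal sketch, not a definitive procedure: the function names `emr_status` and `emr_logs` are hypothetical helpers, and it assumes the AWS CLI is configured for the same account and region as the cluster and that logs are written under the `--log-uri` prefix in a folder named after the cluster ID.

```shell
# Print the cluster state and the reason for its last state change.
# $1: cluster ID returned by `aws emr create-cluster` (looks like j-...).
emr_status() {
  aws emr describe-cluster --cluster-id "$1" \
    --query 'Cluster.Status.[State, StateChangeReason.Message]' \
    --output text
}

# List the log files written so far for one cluster.
# $1: the --log-uri value (e.g. s3://mybucket/logs/), $2: cluster ID.
emr_logs() {
  # Strip a trailing slash from the log URI before appending the cluster ID.
  aws s3 ls "${1%/}/$2/" --recursive
}
```

For example, `emr_status j-XXXXXXXXXXXXX` followed by `emr_logs s3://mybucket/logs/ j-XXXXXXXXXXXXX`, where the cluster ID is a placeholder. If the state-change message mentions a bootstrap failure, the listed log files are the next place to look.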

-1
votes

I used the AWS Management Console to create an AWS EMR cluster, following the steps in: http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/EMR_SetUp_KeyPair.html

Check this out; it worked for me. Once the status of the cluster changes from 'STARTING' to 'WAITING', you can SSH to the master node of the cluster and do your work.
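For a cluster created from the CLI, the wait-then-connect step might look like the sketch below. The function name `emr_wait_and_ssh` is a hypothetical helper, and the sketch assumes the cluster was launched with an EC2 key pair (for example via `KeyName=...` in `--ec2-attributes`, which the command in the question does not set); without one, SSH to the master node will not work.

```shell
# Block until the cluster is running, then open an SSH session to the master node.
# $1: cluster ID (looks like j-...)
# $2: path to the private key (.pem) of the key pair the cluster was launched with
emr_wait_and_ssh() {
  # The waiter polls the cluster state and returns non-zero if the cluster
  # fails to come up or the waiter gives up.
  if aws emr wait cluster-running --cluster-id "$1"; then
    aws emr ssh --cluster-id "$1" --key-pair-file "$2"
  else
    echo "cluster $1 did not reach a running state" >&2
    return 1
  fi
}
```

If the waiter returns an error, checking the cluster's state-change reason and logs is more useful than retrying the SSH step.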