
I am trying to launch a Spark cluster on an EC2 instance that I created in a development AWS account. I was able to successfully connect to the EC2 instance as ec2-user using the AWS CLI. I used an existing VPC and AMI to create this instance, unzipped the Spark files on it, and tried to start the cluster with the private key as follows:

export AWS_SECRET_ACCESS_KEY=xxx

export AWS_ACCESS_KEY_ID=xxx

/home/ec2-user/spark-1.2.0/ec2$ ./spark-ec2 -k test -i /home/ec2-user/identity_files/test.pem launch test-spark-cluster

Got the error:

boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
InvalidKeyPair.NotFound: The key pair 'test' does not exist xxx

I thought this might have been a region issue, so I passed the region and zone parameters when launching Spark:

/home/ec2-user/spark-1.2.0/ec2$ ./spark-ec2 -k test -i /home/ec2-user/identity_files/test.pem -r us-west-2 -z us-west-2a launch test-spark-cluster

However, when I run this, I encounter a different error:

boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
VPCIdNotSpecified: No default VPC for this user xxx

How can I resolve this issue?


1 Answer


I am no expert in this area, but I would recommend setting more parameters on your script call, something like:

./spark-ec2 -k test \
            -i /home/ec2-user/identity_files/test.pem \
            -s 5 \
            --instance-type=m3.medium \
            --region=eu-west-1 \
            --spark-version=1.2.0 \
            launch myCluster

The -s flag sets the number of slave instances to create. Furthermore, you might want to check the following, paying special attention to the last one:

  • The key pair test exists in your AWS account
  • The key pair test is visible in the EC2 console, and test.pem is its private key
  • The key pair and the instances are in the same region
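A quick way to verify these points is with the AWS CLI (a sketch, assuming the CLI is configured with the same credentials; the key-pair name test and region us-west-2 are taken from the question):

```shell
# Does the key pair 'test' exist in the target region?
# An InvalidKeyPair.NotFound error here means it was created elsewhere.
aws ec2 describe-key-pairs --key-name test --region us-west-2

# Does this region have a default VPC? An empty result would explain
# the 'VPCIdNotSpecified: No default VPC for this user' error.
aws ec2 describe-vpcs \
    --filters Name=isDefault,Values=true \
    --region us-west-2
```

If the second command returns no VPCs, the account has no default VPC in that region; depending on your account and CLI version you may be able to create one (newer AWS CLI versions have aws ec2 create-default-vpc) or you will have to launch into an explicitly chosen VPC instead.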

From what I have found searching the web, most errors about key pairs not being found are caused by a region mismatch.
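To track down such a mismatch, you can scan your likely regions for the key pair (a hypothetical helper loop; adjust the region list and key name to yours):

```shell
# Report every region in which a key pair named 'test' exists.
for r in us-east-1 us-west-1 us-west-2 eu-west-1 eu-central-1; do
    found=$(aws ec2 describe-key-pairs --key-name test --region "$r" \
        --query 'KeyPairs[0].KeyName' --output text 2>/dev/null)
    [ "$found" = "test" ] && echo "key pair 'test' found in $r"
done
```

Whatever region this reports is the one you should pass to spark-ec2 via -r (and the zone via -z must belong to that region).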