
I'm running emr-5.3.1 with spark 2.1.0 on AWS.

When I submit a Spark job with my JAR (a fat JAR), I get the following error:

user class threw exception: java.lang.NoSuchMethodError: com.amazonaws.auth.DefaultAWSCredentialsProviderChain.getInstance()Lcom/amazonaws/auth/DefaultAWSCredentialsProviderChain;

I can only guess it is because I built my JAR using a different AWS SDK version than the one installed with Spark 2.1.0 on EMR.

  1. What is the correct AWS SDK version installed alongside Spark 2.1.0 on EMR?
  2. Is there a way to force my submitted Spark job to run with my own jars?

2 Answers


I'm running Spark 2.1.0 on the newest EMR image with this dependency in the POM:

        <dependency>
            <groupId>com.amazonaws</groupId>
            <artifactId>aws-java-sdk</artifactId>
            <version>1.10.75</version>
            <scope>compile</scope>
            <exclusions>
                <exclusion>
                    <artifactId>jackson-databind</artifactId>
                    <groupId>com.fasterxml.jackson.core</groupId>
                </exclusion>
                <exclusion>
                    <artifactId>jackson-dataformat-cbor</artifactId>
                    <groupId>com.fasterxml.jackson.dataformat</groupId>
                </exclusion>
            </exclusions>
        </dependency>

The way to force Spark to run with your jars is to use scope "compile" and not "provided", as I did above; with "compile" the AWS SDK classes are bundled into your fat JAR instead of being taken from the cluster.
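Even with the SDK bundled, the cluster's own copy can still be loaded first. A common workaround (a sketch, not the only option) is to relocate the SDK packages inside the fat JAR with the maven-shade-plugin, so your classes can never collide with the server's. The relocation prefix `shaded.com.amazonaws` below is an arbitrary example name:

```xml
<!-- Sketch: relocate the bundled AWS SDK so it cannot clash with the
     SDK jars already on the EMR cluster. Plugin version is illustrative. -->
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>2.4.3</version>
    <executions>
        <execution>
            <phase>package</phase>
            <goals><goal>shade</goal></goals>
            <configuration>
                <relocations>
                    <relocation>
                        <pattern>com.amazonaws</pattern>
                        <shadedPattern>shaded.com.amazonaws</shadedPattern>
                    </relocation>
                </relocations>
            </configuration>
        </execution>
    </executions>
</plugin>
```

After shading, your code calls the relocated classes while Spark and Hadoop keep using the cluster's SDK, which avoids the `NoSuchMethodError` entirely.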

BTW you can SSH to the Master of the EMR and run:

 sudo find / -name "*aws-sdk*.jar"

(the pattern is quoted so the shell doesn't expand it locally). I did this just now and saw that the installed version is 1.10.77.


...Spark-submit ignores the jars submitted by the user and uses the jars under /usr/share/aws/aws-java-sdk/ which for EMR 5.4 are of version 1.10.75.1. spark-submit has a parameter which can override the server jars with the user jars, however this can cause other issues... (StayerX)
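The parameter alluded to above is most likely Spark's `userClassPathFirst` settings, which tell the driver and executors to prefer classes from the user's jar over the cluster's. A hedged sketch (the jar and class names are placeholders, and these settings are marked experimental in the Spark docs):

```shell
# Sketch: prefer the user's bundled AWS SDK over the EMR-provided jars.
# my-app.jar and com.example.MyApp are placeholder names.
spark-submit \
  --class com.example.MyApp \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  my-app.jar
```

As the quoted post warns, putting user jars first can break other libraries that expect the cluster's versions, so shading the conflicting dependency is often the safer route.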

Original post: https://github.com/aws/aws-sdk-java/issues/1094