I'm trying to run a Spark application written in Scala 2.11.8 on Spark 2.1, on an EMR cluster, release 5.3.0. I configured the cluster with the following JSON:
[
  {
    "Classification": "hadoop-env",
    "Configurations": [
      {
        "Classification": "export",
        "Configurations": [],
        "Properties": {
          "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
        }
      }
    ],
    "Properties": {}
  },
  {
    "Classification": "spark-env",
    "Configurations": [
      {
        "Classification": "export",
        "Configurations": [],
        "Properties": {
          "JAVA_HOME": "/usr/lib/jvm/java-1.8.0"
        }
      }
    ],
    "Properties": {}
  }
]
If I run in client mode, everything works just fine. When I try to run the application in cluster mode, it fails with status code 12.
Here is part of the master log where I see the status code:
17/02/01 10:08:26 INFO TaskSetManager: Finished task 79.0 in stage 0.0 (TID 79) in 293 ms on ip-10-234-174-231.us-west-2.compute.internal (executor 2) (78/11102)
17/02/01 10:08:27 INFO YarnAllocator: Driver requested a total number of 19290 executor(s).
17/02/01 10:08:27 INFO ApplicationMaster: Final app status: FAILED, exitCode: 12, (reason: Exception was thrown 1 time(s) from Reporter thread.)
17/02/01 10:08:27 INFO SparkContext: Invoking stop() from shutdown hook
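The YarnAllocator line shows the driver requesting 19290 executors right before the failure, so one thing I'm considering is capping dynamic allocation. A minimal sketch of what I mean, assuming the standard Spark 2.1 property name (the cap of 256 and the app name are arbitrary placeholders):

import org.apache.spark.{SparkConf, SparkContext}

// Cap how many executors the application master can request from YARN,
// so the allocator never tries to scale anywhere near the ~19k in the log.
// The cap value of 256 is an arbitrary guess, not a recommendation.
val conf = new SparkConf()
  .setAppName("impressions-job") // hypothetical app name
  .set("spark.dynamicAllocation.maxExecutors", "256")
val sc = new SparkContext(conf)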
UPDATE:
As part of the job I need to read some data from S3, something like this:
sc.textFile("s3n://stambucket/impressions/*/2017-01-0[1-9]/*/impression_recdate*")
If I only take one day, there are no errors, but with all nine I get the exit code 12. It's even weirder considering that the same nine days run just fine in client mode.
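For reference, a per-day workaround sketch I've been experimenting with, reading each day separately and unioning the RDDs instead of using one big glob (sc is the same SparkContext as above; the coalesce target of 1000 is an arbitrary guess):

// Read each of the nine days as its own RDD, then union them;
// coalesce caps the partition count of the combined RDD.
val days = (1 to 9).map(d =>
  s"s3n://stambucket/impressions/*/2017-01-0$d/*/impression_recdate*")
val impressions = days
  .map(path => sc.textFile(path))
  .reduce(_ union _)
  .coalesce(1000) // arbitrary cap on partitions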