How can I find out exactly what happens when an EMR step is deployed to a cluster while the application itself sets its master to local[x]?
How does command-runner.jar submit the job to the EMR master node? If I pass "--executor-cores 4" as a spark-submit argument, but create the session in the Launcher with local[8], how many cores will each executor get? How many executors will be created?
I could not find this in the AWS documentation. Example:
SomeStep:
  Type: AWS::EMR::Step
  Properties:
    ActionOnFailure: CONTINUE
    HadoopJarStep:
      Args:
        - "spark-submit"
        - "--deploy-mode"
        - "cluster"
        - "--executor-cores"
        - "4"
        - "--class"
        - "com.psyquation.batch.analytic.Driver"
        - {
            "Fn::Sub": "s3://some-bucker/my-app.jar"
          }
      Jar: command-runner.jar
      MainClass: com.somepackage.Launcher
    Name: SomeStep
    JobFlowId: !Ref SomeCluster
And now inside com.somepackage.Launcher:
SparkSession spark = SparkSession.builder()
        // some configs ...
        .master("local[8]")
        .getOrCreate();