I was going through this Apache Spark documentation, and it mentions that:
When running Spark on YARN in cluster mode, environment variables need to be set using the `spark.yarn.appMasterEnv.[EnvironmentVariableName]` property in your `conf/spark-defaults.conf` file.
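For context, my understanding is that an entry of that form in `spark-defaults.conf` would look roughly like this (`FOO` and `bar` below are just placeholders):

```
# conf/spark-defaults.conf
# Hypothetical entry: expose FOO=bar to the YARN Application Master
spark.yarn.appMasterEnv.FOO    bar
```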
I am running my EMR cluster on AWS Data Pipeline. I wanted to know where I have to edit this conf file. Also, if I create my own custom conf file and specify it as part of `--configurations` (in the spark-submit), will that solve my use case?
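For reference, this is roughly what I have in mind by a custom configuration (the property name and value are just placeholders; I'm not sure this is the right mechanism):

```
[
  {
    "Classification": "spark-defaults",
    "Properties": {
      "spark.yarn.appMasterEnv.FOO": "bar"
    }
  }
]
```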