I'm using Typesafe Config (https://github.com/typesafehub/config) to parameterize a Spark job running in yarn-cluster mode with a configuration file. By default, ConfigFactory.load() searches the classpath for resources with a set of standard names and loads them into your configuration object automatically (for our purposes, assume the file it looks for is called application.conf).
I am able to load the configuration file into the driver using --driver-class-path <directory containing configuration file>, but using --conf spark.executor.extraClassPath=<directory containing configuration file> does not put the resource on the classpath of the executors as I expected. The executors report that they cannot find a setting for a key that does exist in the configuration file I'm trying to add to their classpaths.
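To make the symptom concrete, here is a minimal sketch of how one might check whether application.conf is actually visible to a given JVM. It uses only the JDK. The class name ClasspathProbe and its methods are hypothetical, and it assumes the resource is resolved through the thread's context class loader, which is what ConfigFactory.load() uses by default.

```java
import java.net.URL;
import java.util.Optional;

/** Hypothetical helper: reports whether a named resource is visible on this JVM's classpath. */
final class ClasspathProbe {

    /** Looks the resource up via the context class loader, falling back to the system class loader. */
    static Optional<URL> find(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        // getResource returns null when the resource is not on the classpath.
        return Optional.ofNullable(loader.getResource(resourceName));
    }

    /** Builds a one-line report that can be logged on the driver or from inside a task. */
    static String describe(String resourceName) {
        return find(resourceName)
                .map(url -> resourceName + " found at " + url)
                .orElse(resourceName + " NOT found; java.class.path="
                        + System.getProperty("java.class.path"));
    }

    public static void main(String[] args) {
        System.out.println(describe("application.conf"));
    }
}
```

Calling describe("application.conf") once on the driver and once from inside a task (so that it runs in an executor JVM) and comparing the two log lines should show whether the directory passed to spark.executor.extraClassPath really ends up on the executor classpath.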
What is the correct way to add a file to the classpaths of all executor JVMs using Spark?