I am running Spark 1.6.1 and have set
export HADOOP_CONF_DIR=/folder/location
When I start the Spark shell with $ ./spark-shell --master yarn --deploy-mode client, I get the following error (relevant part):
16/09/18 15:49:18 INFO impl.TimelineClientImpl: Timeline service address: http://URL:PORT/ws/v1/timeline/
16/09/18 15:49:18 INFO client.RMProxy: Connecting to ResourceManager at URL/IP:PORT
16/09/18 15:49:18 INFO yarn.Client: Requesting a new application from cluster with 9 NodeManagers
16/09/18 15:49:19 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (14336 MB per container)
16/09/18 15:49:19 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
16/09/18 15:49:19 INFO yarn.Client: Setting up container launch context for our AM
16/09/18 15:49:19 INFO yarn.Client: Setting up the launch environment for our AM container
16/09/18 15:49:19 INFO yarn.Client: Preparing resources for our AM container
16/09/18 15:49:19 WARN shortcircuit.DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
16/09/18 15:49:19 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.hadoop.security.AccessControlException: Permission denied: user=Menmosyne, access=WRITE, inode="/user/Mnemosyne/.sparkStaging/application_1464874056768_0040":hdfs:hdfs:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213)
However, when I simply run
$ ./spark-shell
(without specifying a master), I see far more configuration output on screen than usual, which suggests it is picking up the configuration in the Hadoop folder. So if I don't specify yarn as the master, do my Spark jobs still get submitted to the YARN cluster or not?
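
One rough way to check this instead of guessing, sketched under some assumptions: as far as I understand, when no --master is passed, spark-shell falls back to a spark.master entry in conf/spark-defaults.conf and otherwise to local[*], and setting HADOOP_CONF_DIR on its own does not select YARN. The helper below is hypothetical (effective_master is a made-up name, not part of Spark); it only handles whitespace-separated "spark.master <value>" lines and ignores the MASTER environment variable and any --conf overrides.

```shell
# Hypothetical helper, not part of Spark: print the spark.master value found
# in a spark-defaults.conf-style file, or "local[*]" when none is set.
# Assumption: entries are written as "spark.master <value>" (whitespace-separated).
effective_master() {
  conf_file="$1"
  master=""
  if [ -f "$conf_file" ]; then
    # Commented-out lines ("# spark.master ...") have "#" as the first field,
    # so they are skipped; the last matching entry is used.
    master=$(awk '$1 == "spark.master" { print $2 }' "$conf_file" | tail -n 1)
  fi
  # Fall back to local[*] if no entry was found.
  printf '%s\n' "${master:-local[*]}"
}
```

It could be called as effective_master "$SPARK_HOME/conf/spark-defaults.conf", assuming SPARK_HOME points at the Spark install. Inside the running shell, evaluating sc.master should report the master that is actually in use.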