0 votes

I am running a Spark job in standalone mode. I have configured my worker node to connect to the master node, and they connect successfully, but when I run the job on the Spark master the job is not getting distributed. I keep getting the following message:

WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

I have tried running the job as local on the worker node and it runs fine, which means resources are available. The Spark master UI also shows that the worker has accepted the job. Passwordless SSH is enabled in both directions between the master and the worker node. I feel it might be a firewall issue, or maybe the Spark driver port is not open. My worker node logs show:

16/03/21 10:05:40 INFO ExecutorRunner: Launch command: "/usr/lib/jvm/java-7-oracle/bin/java" "-cp" "/mnt/pd1/spark/spark-1.5.0-bin-hadoop2.6/sbin/../conf/:/mnt/pd1/spark/spark-1.5.0-bin-hadoop2.6/lib/spark-assembly-1.5.0-hadoop2.6.0.jar:/mnt/pd1/spark/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-rdbms-3.2.9.jar:/mnt/pd1/spark/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-api-jdo-3.2.6.jar:/mnt/pd1/spark/spark-1.5.0-bin-hadoop2.6/lib/datanucleus-core-3.2.10.jar" "-Xms8192M" "-Xmx8192M" "-Dspark.driver.port=51810" "-Dspark.cassandra.connection.port=9042" "-XX:MaxPermSize=256m" "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" "akka.tcp://[email protected]:51810/user/CoarseGrainedScheduler" "--executor-id" "2" "--hostname" "10.0.1.194" "--cores" "4" "--app-id" "app-20160321100135-0001" "--worker-url" "akka.tcp://[email protected]:39423/user/Worker"

The executor on the worker node shows the following log in stderr:

16/03/21 10:13:52 INFO Slf4jLogger: Slf4jLogger started
16/03/21 10:13:52 INFO Remoting: Starting remoting
16/03/21 10:13:52 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:59715]
16/03/21 10:13:52 INFO Utils: Successfully started service 'driverPropsFetcher' on port 59715.
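One way to test the driver-port suspicion is a plain TCP connect from the worker to the driver address shown in the launch command above (10.0.1.192:51810). The sketch below is only a connectivity check, not part of the Spark job itself:

import java.net.{InetSocketAddress, Socket}

// Try to open a TCP connection from the worker to the driver port
// taken from the launch command above (10.0.1.192:51810).
val socket = new Socket()
try {
  socket.connect(new InetSocketAddress("10.0.1.192", 51810), 5000) // 5 s timeout
  println("Driver port is reachable from this worker")
} catch {
  case e: Exception => println(s"Driver port is NOT reachable: ${e.getMessage}")
} finally {
  socket.close()
}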

You need a resource manager; if you run in standalone mode alone, the job won't be distributed. – eliasah
Spark standalone mode is a cluster manager. I am running my job on another cluster with 3 workers and 1 master node and it works fine. I feel it might be a firewall issue. How can I figure out which Spark driver port is being used? – Y0gesh Gupta
Another way for this to happen is asking for an executor memory size bigger than the RAM on the machine. – Randall Whitman

2 Answers

0 votes

You can specify a fixed driver port in the Spark context:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().set("spark.driver.port", "51810") // pins the spark.driver.port property
val sc = new SparkContext(conf)

PS: When you start the Spark worker manually on the worker machine and connect it to the master, you don't need any passwordless authentication or similar between master and worker. That would only be necessary if you used the master to start all slaves (start-slaves.sh). So this shouldn't be the problem.

0 votes

Many people hit this issue when setting up a new cluster. If you can see the Spark slaves in the web UI but they are not accepting jobs, there is a high chance that a firewall is blocking the communication. Take a look at my other answer: Apache Spark on Mesos: Initial job has not accepted any resources:

While most of the other answers focus on resource allocation (cores, memory) on the Spark slaves, I would like to highlight that a firewall can cause exactly the same issue, especially when you are running Spark on cloud platforms.

If you can see the Spark slaves in the web UI, you have probably opened the standard ports 8080, 8081, 7077, 4040. Nonetheless, when you actually run a job, it uses SPARK_WORKER_PORT, spark.driver.port and spark.blockManager.port, which by default are randomly assigned. If your firewall is blocking these ports, the master cannot retrieve any job-specific responses from the slaves and returns the error.

You can run a quick test by opening all the ports and checking whether the slaves accept jobs.
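If that test confirms the firewall is the cause, one option is to pin the otherwise random ports to fixed values and open only those. The sketch below uses arbitrary example port numbers and a hypothetical application name; note that SPARK_WORKER_PORT is an environment variable for the worker daemon (set in conf/spark-env.sh), not a SparkConf property:

import org.apache.spark.{SparkConf, SparkContext}

// Pin the normally random ports so fixed firewall rules can be written
// for them. The port numbers below are arbitrary examples, not defaults.
val conf = new SparkConf()
  .setAppName("fixed-ports-example")        // hypothetical application name
  .set("spark.driver.port", "51810")        // driver RPC port
  .set("spark.blockManager.port", "51811")  // block manager port on driver and executors

val sc = new SparkContext(conf)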