6
votes

I want to create a Spark standalone cluster. I can run the master and a slave on the same node, but a slave on a different node neither shows the master URL nor connects to the master.

I am running the command:

start-slave.sh spark://spark-server:7077

where spark-server is the hostname of my master.

I can ping the master from the worker, but the master's web UI shows no workers except the one running on the same machine. The client node is running a worker, but it is independent and not connected to the master.

4 Answers

10
votes

Please check the configuration file "spark-env.sh" on your master node. Have you set the SPARK_MASTER_HOST variable to the IP address of the master node? If not, set it and restart the master and slaves. For example, if your master node's IP is 192.168.0.1, the file should contain SPARK_MASTER_HOST=192.168.0.1. Note that you don't need to set this variable on your slaves.
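As a rough sketch, the relevant part of conf/spark-env.sh on the master could look like the fragment below. The IP is the example address from this answer, and the SPARK_MASTER_PORT line is an assumption that only matters if you changed the default port.

```shell
# conf/spark-env.sh on the master node (sourced by the Spark launch scripts)

# Address the master binds to and advertises to workers.
# 192.168.0.1 is the example IP from this answer; substitute your own.
SPARK_MASTER_HOST=192.168.0.1

# Assumption: the default master port is in use; change only if yours differs.
SPARK_MASTER_PORT=7077
```

After restarting the master, it is probably safest to start the worker with the exact URL shown at the top of the master's web UI, which with these values would be spark://192.168.0.1:7077.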

4
votes

1) Make sure passwordless SSH is set up between the nodes

Please refer to the link below to set up passwordless SSH between nodes

http://www.tecmint.com/ssh-passwordless-login-using-ssh-keygen-in-5-easy-steps/

2) Specify the slaves' IP addresses in the slaves file in the $SPARK_HOME/conf directory on the master node

[$SPARK_HOME is the Spark folder containing the conf directory]

3) Once the IP addresses are in the slaves file, start the Spark cluster

[Execute the start-all.sh script in the $SPARK_HOME/sbin directory on the master node]
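Steps 2 and 3 could be sketched roughly as follows. The two worker IPs are placeholders, and the fallback to a scratch directory is only there so the snippet can be dry-run without touching a real installation. Newer Spark releases name this file conf/workers instead of conf/slaves, so check which one your version expects.

```shell
# Assumption: SPARK_HOME is the Spark install directory on the master node.
# The scratch-directory fallback exists only so this sketch can be dry-run safely.
SPARK_HOME="${SPARK_HOME:-$(mktemp -d)}"
mkdir -p "$SPARK_HOME/conf"

# One worker address per line. Placeholder IPs; this overwrites an existing slaves file.
printf '%s\n' 192.168.0.2 192.168.0.3 > "$SPARK_HOME/conf/slaves"

# Start the master and every listed worker (relies on the passwordless SSH from step 1).
if [ -x "$SPARK_HOME/sbin/start-all.sh" ]; then
    "$SPARK_HOME/sbin/start-all.sh"
else
    echo "start-all.sh not found under $SPARK_HOME/sbin" >&2
fi
```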

Hope this helps.

2
votes

If you can ping the master node from the worker, then network connectivity is in place. The new worker node still needs to be added to the Spark master, which means updating a few settings in spark-env.sh. Please check the official document "Spark Cluster launch" and update the required fields.

Here is another blog that can help you: Spark Cluster mode Blog

0
votes

This solved my problem:

The idea is to use the loopback address when both client and server are on the same machine.

Steps:

  • go to the conf folder in your spark-hadoop directory and check whether spark-env.sh is present; if not, copy spark-env.sh.template, name the copy spark-env.sh, then add SPARK_MASTER_HOST=127.0.0.1
  • then run the command to start the master from the Spark directory (not the conf folder)
  • ./sbin/start-master.sh (this will start the master; view it at localhost:8080)
  • bin/spark-class org.apache.spark.deploy.worker.Worker spark://127.0.0.1:7077 (this will start the worker, and you can see it listed under the worker tab in the same web UI, i.e., localhost:8080)
  • you can add multiple workers with the above command

This worked for me; hopefully it will work for you too.