
I'm trying to set up a standalone Spark 2.0 server to process an analytics function in parallel. To do this I want to run 8 workers, with a single core per worker. However, the Spark Master/Worker UI doesn't seem to reflect my configuration.

I'm using:

  • Standalone Spark 2.0
  • 8 cores, 24 GB RAM
  • Windows Server 2008
  • pyspark

The spark-env.sh file is configured as follows:

SPARK_WORKER_INSTANCES=8
SPARK_WORKER_CORES=1
SPARK_WORKER_MEMORY=2g

spark-defaults.conf is configured as follows:

spark.cores.max = 8
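
Since spark.cores.max is an application-level setting, the same cap can also be passed on the command line when starting the pyspark shell against the master (just a sketch, using the master URL from my setup):

pyspark --master spark://10.0.0.10:7077 --conf spark.cores.max=8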

I start the master:

spark-class org.apache.spark.deploy.master.Master

I start the workers by running this command 8 times within a batch file:

spark-class org.apache.spark.deploy.worker.Worker spark://10.0.0.10:7077
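
The batch file is roughly the following (a sketch, assuming spark-class is on the PATH and the master is already running):

REM sketch: launch 8 identical workers against the master
for /L %%i in (1,1,8) do (
  start spark-class org.apache.spark.deploy.worker.Worker spark://10.0.0.10:7077
)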

The problem is that the UI shows up as follows:

(Screenshot of the Spark Master UI: each worker is listed with 8 cores and the full machine memory.)

As you can see, each worker shows 8 cores instead of the 1 core I assigned it via the SPARK_WORKER_CORES setting, and the memory reflects the entire machine's memory rather than the 2g assigned to each worker. How can I configure Spark to run with 1 core/2g per worker in standalone mode?

1 Answer


I fixed this by adding the cores and memory arguments to the worker command itself:

start spark-class org.apache.spark.deploy.worker.Worker --cores 1 --memory 2g spark://10.0.0.10:7077
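
So the worker launch loop in my batch file now looks roughly like this (same sketch as in the question, just with the per-worker limits added):

REM sketch: 8 workers, each capped at 1 core / 2g via worker arguments
for /L %%i in (1,1,8) do (
  start spark-class org.apache.spark.deploy.worker.Worker --cores 1 --memory 2g spark://10.0.0.10:7077
)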