I'm setting up a somewhat ad-hoc cluster of Spark workers: namely, a couple of lab machines that I have sitting around. However, when I attempt to start the cluster with `start-all.sh`, I run into a problem: Spark is installed in a different directory on each worker, but the master invokes the start scripts under `$SPARK_HOME/sbin` on every machine using the master's own definition of `$SPARK_HOME`, even though the path is different for each worker.
Assuming I can't install Spark at identical paths on the workers and the master, how can I get the master to recognize the different worker paths?
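For concreteness, here is a rough sketch of what I understand `start-all.sh` to be doing (hostnames and install paths below are hypothetical, and the ssh commands are paraphrased from the standalone `sbin` scripts):

```
# On the master, start-all.sh first starts the master daemon:
"$SPARK_HOME/sbin/start-master.sh"

# ...then, for each host listed in conf/slaves, it sshes in and runs the
# worker start script, with the path expanded from the *master's*
# $SPARK_HOME (here /opt/spark):
ssh worker1 "cd /opt/spark; /opt/spark/sbin/start-slave.sh spark://master:7077"
ssh worker2 "cd /opt/spark; /opt/spark/sbin/start-slave.sh spark://master:7077"
# ^ fails on worker2 if its Spark install lives in, say, /usr/local/spark
```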
EDIT #1: Hmm, I found this thread on the Spark mailing list, strongly suggesting that this is the current implementation: `$SPARK_HOME` is assumed to be the same for all workers.
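In the meantime, the workaround seems to be to skip `start-all.sh` and start each daemon from its own local install, e.g. (hostnames, install paths, and the master URL are again hypothetical; `start-slave.sh` accepts the master URL in Spark 1.4+):

```
# On the master:
/opt/spark/sbin/start-master.sh

# On each worker, invoke the start script from that machine's own
# install path, pointing it at the running master:
ssh worker1 "/usr/local/spark/sbin/start-slave.sh spark://master:7077"
ssh worker2 "/home/lab/spark/sbin/start-slave.sh spark://master:7077"
```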