1 vote

I use Spark 1.6.1, Hadoop 2.6.4 and Mesos 0.28 on Debian 8.

While trying to submit a job via spark-submit to a Mesos cluster, a slave fails with the following in its stderr log:

I0427 22:35:39.626055 48258 fetcher.cpp:424] Fetcher Info: {"cache_directory":"\/tmp\/mesos\/fetch\/slaves\/ad642fcf-9951-42ad-8f86-cc4f5a5cb408-S0\/hduser","items":[{"action":"BYP$
I0427 22:35:39.628031 48258 fetcher.cpp:379] Fetching URI 'hdfs://xxxxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
I0427 22:35:39.628057 48258 fetcher.cpp:250] Fetching directly into the sandbox directory
I0427 22:35:39.628078 48258 fetcher.cpp:187] Fetching URI 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar'
E0427 22:35:39.629243 48258 shell.hpp:93] Command 'hadoop version 2>&1' failed; this is the output:
sh: 1: hadoop: not found
Failed to fetch 'hdfs://xxxxxxx:54310/sources/spark/SimpleEventCounter.jar': Failed to create HDFS client: Failed to execute 'hadoop version 2>&1'; the command was e$
Failed to synchronize with slave (it's probably exited)
  1. My jar file contains the Hadoop 2.6 binaries
  2. The path to the Spark executor/binary is an hdfs:// link
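
The 'hadoop: not found' line is the key symptom: per the log, the Mesos fetcher shells out to 'hadoop version 2>&1' to create an HDFS client, and that command is not on the slave's PATH. A quick sanity check (a sketch; run it on each slave, e.g. over ssh) is to invoke the same command yourself:

  # On each Mesos slave: if this prints 'hadoop: not found', it
  # reproduces exactly what the fetcher reports in the log above.
  which hadoop && hadoop version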

My jobs don't appear in the Frameworks tab, but they do appear in the driver with the status 'queued', and they just sit there until I shut down the spark-mesos-dispatcher.sh service.

Have you configured hadoop_home on the Mesos slaves? It seems it is unable to find the Hadoop home on the Mesos slaves. - avr
There's a similar issue on Mesos' JIRA. Check whether curl is installed on your machine(s). - Tobi
How do you spark-submit, i.e. can you show the entire command line? - Jacek Laskowski
For the time being I moved to YARN to get the jobs running; I'll come back to this later in the week. Apologies for the delay. FYI, curl is installed on all the machines and hadoop_home is also configured. ./spark-submit --class EventCounter --master mesos://xxxxx:7077 --deploy-mode client --supervise --executor-memory 500m hdfs://xxxxx:54310/sources/spark/SimpleEventCounter.jar - Yasir K

1 Answer

0 votes

I was seeing a very similar error, and I figured out that my problem was that hadoop_home wasn't set in the Mesos agent. On each Mesos slave I added the following line to /etc/default/mesos-slave (the path may be different on your install): MESOS_hadoop_home="/path/to/my/hadoop/install/folder/"

EDIT: Hadoop has to be installed on each slave; the /path/to/my/hadoop/install/folder is a local path.
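
For reference, a minimal sketch of applying this on a Debian-style install, assuming the slave reads its environment from /etc/default/mesos-slave and runs as a service named mesos-slave (adjust to your packaging; /usr/local/hadoop is only a placeholder for your local Hadoop install path):

  # On each Mesos slave: point Mesos at the local Hadoop install so the
  # fetcher can run 'hadoop' when resolving hdfs:// URIs.
  echo 'MESOS_hadoop_home="/usr/local/hadoop"' | sudo tee -a /etc/default/mesos-slave
  # Restart the slave so the new environment variable is picked up.
  sudo service mesos-slave restart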