35
votes

I am trying to set up a single-node Hadoop 2.6.0 cluster on my PC.

On visiting http://localhost:8088/cluster, I find that my node is listed as an "unhealthy node".

In the health report, it provides the error:

1/1 local-dirs are bad: /tmp/hadoop-hduser/nm-local-dir; 
1/1 log-dirs are bad: /usr/local/hadoop/logs/userlogs

What's wrong?

8
This won't fix the root cause, but it will get you going for the time being: add the property 'yarn.nodemanager.disk-health-checker.min-healthy-disks' in yarn-site.xml and set its value to 0. – Tushar Sudake

8 Answers

64
votes

The most common cause of "local-dirs are bad" is the used disk space on the node exceeding YARN's max-disk-utilization-per-disk-percentage default value of 90.0%.

Either clean up the disk that the unhealthy node is running on, or increase the threshold in yarn-site.xml:

<property>
  <name>yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage</name>
  <value>98.5</value>
</property>
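
If you want to confirm that disk usage is really the problem on this node (the nm-local-dir path below is taken from the question), check the filesystem the directory lives on:

$ df -h /tmp/hadoop-hduser/nm-local-dir

If the Use% column is at or above the configured threshold, that is why YARN marks the directory as bad.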

Avoid disabling the disk check, because your jobs may fail when the disk eventually runs out of space, or if there are permission issues. Refer to the yarn-site.xml Disk Checker section for more details.

FSCK

If you suspect there is a filesystem error on the directory, you can check by running:

hdfs fsck /tmp/hadoop-hduser/nm-local-dir
7
votes

Try adding this config to yarn-site.xml:

<property>
   <name>yarn.nodemanager.disk-health-checker.enable</name>
   <value>false</value>
</property>

It works on my site.

Then rm /usr/local/hadoop/logs and recreate it, e.g.:

rm -rf /usr/local/hadoop/logs
mkdir -p /usr/local/hadoop/logs
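
If the NodeManager runs as a different user than the one recreating the directory, you may also need to fix ownership; hduser:hadoop is only a guess based on the paths in the question:

chown -R hduser:hadoop /usr/local/hadoop/logs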
3
votes

It can also be caused by a wrong log directory location configured via yarn.nodemanager.log-dirs in yarn-site.xml, either because the directory does not exist or because it has the wrong permissions set.
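
For example, if yarn.nodemanager.log-dirs points to /usr/local/hadoop/logs/userlogs as in the question, something along these lines should ensure the directory exists and is writable by the user running the NodeManager (hduser:hadoop is an assumption based on the question's paths):

sudo mkdir -p /usr/local/hadoop/logs/userlogs
sudo chown -R hduser:hadoop /usr/local/hadoop/logs/userlogs
ls -ld /usr/local/hadoop/logs/userlogs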

3
votes

I had a similar issue at first.

Then I also found another problem: when I ran the jps command, some processes like NameNode and DataNode were missing.

$ jps
13696 Jps
12949 ResourceManager
13116 NodeManager

Once I got those daemons running again, the unhealthy node issue was automatically fixed.
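
For reference, a minimal way to bring the missing HDFS daemons back up, assuming the scripts in $HADOOP_HOME/sbin are on your PATH:

$ start-dfs.sh
$ jps

start-dfs.sh starts the NameNode, SecondaryNameNode and DataNode; running jps again should now list them.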

1
vote

On macOS, with Hadoop installed using brew, I had to change /usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/yarn-site.xml to include the following:

<property>
  <name>yarn.nodemanager.disk-health-checker.min-healthy-disks</name>
  <value>0</value>
</property>

This setting basically turns the disk health check off completely.

I found the file using brew list hadoop.

$ brew list hadoop | grep yarn-site.xml
/usr/local/Cellar/hadoop/2.8.1/libexec/etc/hadoop/yarn-site.xml
/usr/local/Cellar/hadoop/2.8.1/libexec/share/hadoop/tools/sls/sample-conf/yarn-site.xml
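
After editing yarn-site.xml, restart YARN so the new value is picked up (assuming the Hadoop sbin scripts are on your PATH):

$ stop-yarn.sh
$ start-yarn.sh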
0
votes

I had a similar problem: a Sqoop upload just hung when HDFS reached 90% usage. After I changed the threshold for max-disk-utilization-per-disk-percentage and the alarm threshold definitions, the upload worked again. Thanks

0
votes

I experienced this when the disk was 90% full (checked with df). I removed unnecessary files so it dropped to 85%, and the problem was solved (the default for yarn.nodemanager.disk-health-checker.max-disk-utilization-per-disk-percentage is 90% of the available disk if you do not specify it in yarn-site.xml).

The effect is similar to raising the utilization threshold above 90% just to squeeze out extra space (in my case the disk was 90% full). However, it is good practice not to go above 90% anyway.

0
votes

Had the same issue; listing my causes, FYR:

  1. dirs did not exist; mkdir them first,
  2. yarn.nodemanager.resource.memory-mb set much larger than the memory actually available (see the check after the configs below)
    <property>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/tmp/yarn/nm</value>
    </property>
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/tmp/yarn/container-logs</value>
    </property>

    <property>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>364000</value>
    </property>
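
For the second cause, a quick sanity check is to compare the configured value with the memory the node actually has and size it down accordingly; the 8192 below is only an illustrative value, not a recommendation:

$ free -m

<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>    <!-- keep this below the node's physical RAM -->
</property>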