55
votes

I was using Hadoop in pseudo-distributed mode and everything was working fine. Then I had to restart my computer for some reason, and now when I try to start the NameNode and DataNode, only the DataNode is running. Could anyone tell me the possible reason for this problem? Or am I doing something wrong?

I tried both bin/start-all.sh and bin/start-dfs.sh.

22
I reformatted the HDFS and now I am able to start both the NameNode and the DataNode, but once I am using Hadoop for my project I can't keep reformatting the HDFS. I need a permanent solution. – user886908

22 Answers

102
votes

I was facing the issue of the namenode not starting. I found a solution using the following steps:

  1. First delete all contents from the temporary folder: rm -Rf <tmp dir> (mine was /usr/local/hadoop/tmp)
  2. Format the namenode: bin/hadoop namenode -format
  3. Start all processes again: bin/start-all.sh

You may also consider rolling back using a checkpoint (if you had checkpointing enabled).

37
votes

hadoop.tmp.dir in core-site.xml defaults to /tmp/hadoop-${user.name}, which is cleaned after every reboot. Change this to some other directory that doesn't get cleaned on reboot.
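For example, a sketch of that override in core-site.xml (the path below is only an example; pick any directory that survives reboots and that your Hadoop user can write to):

```xml
<!-- core-site.xml: keep Hadoop's working data out of /tmp.
     /home/hduser/hadoop/tmp is an example path, not a required one. -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hduser/hadoop/tmp</value>
</property>
```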

26
votes

The following steps worked for me with Hadoop 2.2.0.

STEP 1 stop hadoop

hduser@prayagupd$ /usr/local/hadoop-2.2.0/sbin/stop-dfs.sh

STEP 2 remove tmp folder

hduser@prayagupd$ sudo rm -rf /app/hadoop/tmp/

STEP 3 create /app/hadoop/tmp/

hduser@prayagupd$ sudo mkdir -p /app/hadoop/tmp
hduser@prayagupd$ sudo chown hduser:hadoop /app/hadoop/tmp
hduser@prayagupd$ sudo chmod 750 /app/hadoop/tmp

STEP 4 format namenode

hduser@prayagupd$ hdfs namenode -format

STEP 5 start dfs

hduser@prayagupd$ /usr/local/hadoop-2.2.0/sbin/start-dfs.sh

STEP 6 check jps

hduser@prayagupd$ jps
11342 Jps
10804 DataNode
11110 SecondaryNameNode
10558 NameNode
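As a sketch, the STEP 6 check can be scripted; the jps output above is inlined here as sample data so the grep logic can be tried anywhere. Note that a SecondaryNameNode line alone is not enough; you want a plain NameNode entry:

```shell
# Check a jps listing for a running NameNode. The sample output is
# copied from STEP 6 above; in practice use: jps_output=$(jps)
jps_output='11342 Jps
10804 DataNode
11110 SecondaryNameNode
10558 NameNode'

# " NameNode$" deliberately excludes "SecondaryNameNode", which also
# ends in "NameNode" but is a different daemon.
if printf '%s\n' "$jps_output" | grep -q ' NameNode$'; then
  echo "NameNode is up"
else
  echo "NameNode missing - check the NameNode log"
fi
```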
4
votes

In conf/hdfs-site.xml, you should have a property like

<property>
    <name>dfs.name.dir</name>
    <value>/home/user/hadoop/name/data</value>
</property>

The dfs.name.dir property controls where Hadoop writes NameNode metadata. Pointing it at a directory other than /tmp ensures the NameNode data isn't deleted when you reboot.
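Note that dfs.name.dir is the Hadoop 1.x name; on Hadoop 2.x and later the equivalent hdfs-site.xml setting is dfs.namenode.name.dir (the path here is just an example):

```xml
<!-- hdfs-site.xml (Hadoop 2.x+): persistent NameNode metadata location.
     The path is an example; any reboot-safe directory works. -->
<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/user/hadoop/name/data</value>
</property>
```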

3
votes

Open a new terminal and start the namenode using path-to-your-hadoop-install/bin/hadoop namenode

Then check with jps; the namenode should be running.

2
votes

Why do most answers here assume that all data needs to be deleted, reformatted, and then Hadoop restarted? How do we know the namenode is actually stuck, rather than just taking a long time? It will take a long time when there is a large amount of data in HDFS. Check progress in the logs before assuming anything is hung or stuck.

[kadmin@hadoop-node-0 logs]$ tail hadoop-kadmin-namenode-hadoop-node-0.log

...
2016-05-13 18:16:44,405 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 117/141 transactions completed. (83%)
2016-05-13 18:16:56,968 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 121/141 transactions completed. (86%)
2016-05-13 18:17:06,122 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 122/141 transactions completed. (87%)
2016-05-13 18:17:38,321 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 123/141 transactions completed. (87%)
2016-05-13 18:17:56,562 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 124/141 transactions completed. (88%)
2016-05-13 18:17:57,690 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 127/141 transactions completed. (90%)

This was after nearly an hour of waiting on one particular system, and it was still progressing each time I looked. Have patience with Hadoop when bringing up the system, and check the logs before assuming something is hung or not progressing.
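As a sketch, the replay percentage can be pulled out of such a log line with sed; the sample line below is copied from the excerpt above, so the command can be tried without a running cluster:

```shell
# Extract the "(NN%)" progress figure from an FSEditLogLoader log line.
line='2016-05-13 18:17:57,690 INFO org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader: replaying edit log: 127/141 transactions completed. (90%)'
progress=$(printf '%s\n' "$line" | sed -n 's/.*(\([0-9][0-9]*\)%).*/\1/p')
echo "replay progress: ${progress}%"
```

Against a live log you would feed the last line of the namenode log into the same sed expression instead of the sample string.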

2
votes

In core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/yourusername/hadoop/tmp/hadoop-${user.name}</value>
  </property>
</configuration>

and format the namenode with:

hdfs namenode -format

This worked for Hadoop 2.8.1.

1
votes

If anyone is using Hadoop 1.2.1 and cannot run the namenode, go to core-site.xml and change dfs.default.name to fs.default.name.

Then format the namenode using $ hadoop namenode -format.

Finally run HDFS using start-dfs.sh and check the services with jps.

0
votes

Did you change dfs.name.dir in conf/hdfs-site.xml?

Format namenode after you change it.

$ bin/hadoop namenode -format
$ bin/start-all.sh
0
votes

If you are facing this issue after rebooting the system, the steps below will work.

As a workaround:

1) Format the namenode: bin/hadoop namenode -format

2) Start all processes again: bin/start-all.sh

For a permanent fix:

1) Go to conf/core-site.xml and change fs.default.name to your custom value.

2) Format the namenode: bin/hadoop namenode -format

3) Start all processes again: bin/start-all.sh

0
votes

I faced the same problem.

(1) Always check for typing mistakes in the .xml configuration files, especially the XML tags.

(2) Go to the bin directory and run ./start-all.sh

(3) Then run jps to check whether the processes are running.

0
votes

Add the hadoop.tmp.dir property in core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/yourname/hadoop/tmp/hadoop-${user.name}</value>
  </property>
</configuration>

and format hdfs (hadoop 2.7.1):

$ hdfs namenode -format

The default value in core-default.xml is /tmp/hadoop-${user.name}, which will be deleted after reboot.

0
votes

Try this,

1) Stop all hadoop processes : stop-all.sh

2) Remove the tmp folder manually

3) Format namenode : hadoop namenode -format

4) Start all processes : start-all.sh

0
votes

If you kept the default configuration when running Hadoop, the port for the namenode would be 50070. You will need to find any processes running on this port and kill them first.

  • Stop all running Hadoop with: bin/stop-all.sh

  • sudo netstat -tulpn | grep :50070 # check for processes listening on port 50070; if there are any, the owning PID/program appears at the right-hand side of the output

  • sudo kill -9 <process_id> # kill the process

  • sudo rm -r /app/hadoop/tmp # delete the temp folder

  • sudo mkdir /app/hadoop/tmp # recreate it

  • sudo chmod -R 777 /app/hadoop/tmp (777 is given for this example's purpose only)

  • bin/hadoop namenode -format # format the hadoop namenode

  • bin/start-all.sh # start all hadoop services


0
votes

For me the following worked after I changed the directories of the namenode and datanode in hdfs-site.xml.

Before executing the following steps, stop all services with stop-all.sh (in my case I used stop-dfs.sh to stop the DFS):

  1. In the newly configured directory, for every node (namenode and datanode), delete every folder/file inside it (in my case a 'current' directory).
  2. Delete the Hadoop temporary directory: $ rm -rf /tmp/hadoop-$USER
  3. Format the namenode: hadoop/bin/hdfs namenode -format
  4. start-dfs.sh

After I followed those steps my namenode and datanodes were alive using the newly configured directory.

0
votes

I ran $ hadoop namenode to start the namenode manually in the foreground.

From the logs I figured out that port 50070 was occupied, which is the default used by dfs.namenode.http-address. After configuring dfs.namenode.http-address to a different port in hdfs-site.xml, everything went well.
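For reference, a sketch of that override in hdfs-site.xml (0.0.0.0:50071 is just an example of a free port, not a required value):

```xml
<!-- hdfs-site.xml: move the NameNode HTTP UI off the occupied port. -->
<property>
  <name>dfs.namenode.http-address</name>
  <value>0.0.0.0:50071</value>
</property>
```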

0
votes
Here is the solution that worked for me:

1. First check hdfs-site.xml under /home/hadoop/etc/hadoop and verify the paths of the namenode and datanode:

<property>
  <name>dfs.name.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/namenode</value>
</property>

<property>
  <name>dfs.data.dir</name>
    <value>file:///home/hadoop/hadoopdata/hdfs/datanode</value>
</property>

2. Check the permissions, group, and user of the namenode and datanode paths (for example /home/hadoop/hadoopdata/hdfs/datanode) and correct any mismatch. For example, to change the user and group:

chown -R hadoop:hadoop in_use.lock

and to change the permissions:

chmod -R 755 <file_name>
0
votes

After deleting the resource manager's data folder, the problem was gone; formatting alone could not solve it.

0
votes

If your namenode is stuck in safe mode, you can ssh to the namenode host, su to the hdfs user, and run the following command to turn off safe mode:

hdfs dfsadmin -fs hdfs://server.com:8020 -safemode leave
0
votes

Instead of formatting the namenode, maybe you can restart it with the command below. It worked for me:

sudo service hadoop-master restart

If it is still in safe mode afterwards, leave safe mode:

  1. hadoop dfsadmin -safemode leave
0
votes

I was facing the same issue of the namenode not starting, with Hadoop 3.2.1. I did the following steps to resolve the issue:

  1. Delete the contents of the temporary folder in the namenode directory (in my case, the "current" directory created by the root user): rm -rf (dir name)

  2. Format the namenode: hdfs namenode -format

  3. Start the processes again: start-dfs.sh

The directory in point 1 comes from the dfs.namenode.name.dir setting in hdfs-site.xml:

<property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///opt/hadoop/node-data/hdfs/namenode</value>
</property>
-1
votes

I ran into the same thing after a restart.

For hadoop-2.7.3, all I had to do was format the namenode:

<HadoopRootDir>/bin/hdfs namenode -format

Then a jps command shows

6097 DataNode
755 RemoteMavenServer
5925 NameNode
6293 SecondaryNameNode
6361 Jps