7
votes

I have successfully set up a Hadoop cluster with 6 nodes (master, slave<1-5>):

  • Formatted the namenode -> done
  • Starting up and shutting down cluster -> works fine
  • Executing "hadoop dfs -ls /" fails -> Error: INFO ipc.Client: Retrying connect to server: localhost (see the quick check after this list)
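
A quick way to narrow this down (my suggestion, not from the original post) is to bypass fs.default.name and pass the NameNode URI explicitly; if this works while the bare command fails, the client is reading a different configuration than the one shown further down:

hadoop dfs -ls hdfs://master:54310/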

I tried to see which services are running using:

hduser@ubuntu:~$ sudo netstat -plten | grep java

tcp 0 0 0.0.0.0:50070 0.0.0.0:* LISTEN 1000 93307 11384/java
tcp 0 0 0.0.0.0:44440 0.0.0.0:* LISTEN 1000 92491 11571/java
tcp 0 0 0.0.0.0:40633 0.0.0.0:* LISTEN 1000 92909 11758/java
tcp 0 0 0.0.0.0:50010 0.0.0.0:* LISTEN 1000 93449 11571/java
tcp 0 0 0.0.0.0:50075 0.0.0.0:* LISTEN 1000 93673 11571/java
tcp 0 0 0.0.0.0:50020 0.0.0.0:* LISTEN 1000 93692 11571/java
tcp 0 0 127.0.0.1:40485 0.0.0.0:* LISTEN 1000 93666 12039/java
tcp 0 0 0.0.0.0:44582 0.0.0.0:* LISTEN 1000 93013 11852/java
tcp 0 0 10.42.43.1:54310 0.0.0.0:* LISTEN 1000 92471 11384/java
tcp 0 0 10.42.43.1:54311 0.0.0.0:* LISTEN 1000 93290 11852/java
tcp 0 0 0.0.0.0:50090 0.0.0.0:* LISTEN 1000 93460 11758/java
tcp 0 0 0.0.0.0:34154 0.0.0.0:* LISTEN 1000 92179 11384/java
tcp 0 0 0.0.0.0:50060 0.0.0.0:* LISTEN 1000 94200 12039/java
tcp 0 0 0.0.0.0:50030 0.0.0.0:* LISTEN 1000 93550 11852/java
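
To map those PIDs to the Hadoop daemons that own them (a diagnostic I am adding here, not part of the original output), jps prints each Java process with its main class, and ps can confirm an individual PID such as 11384 from the list above:

jps
ps -p 11384 -o pid,cmd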

It is the master's IP that is bound to ports 54310 and 54311, not localhost (the loopback address).

The core-site.xml has been properly configured:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hduser/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
  </property>
</configuration>
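
For hdfs://master:54310 to resolve to the address seen in the netstat output above (10.42.43.1), every node needs a consistent hostname mapping. A minimal /etc/hosts sketch follows; only the master IP comes from the output above, the slave IPs are placeholders I am assuming:

10.42.43.1   master
10.42.43.2   slave1
10.42.43.3   slave2
# ... slave3 to slave5 in the same pattern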

Why is it expecting localhost to be bound to 54310 rather than the master which I have configured here? How do I resolve this? Help appreciated.

Cheers

3 Answers

1
votes

Apparently, someone had added the bin directory of the older Hadoop (1.0.3) to the PATH variable before I added the bin directory of the new Hadoop (1.0.4). So whenever I ran "hadoop" from the CLI, it executed the binaries of the older installation rather than the new one.
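
A quick way to confirm which installation the shell actually resolves (these diagnostic commands are my addition, not part of the original answer):

which hadoop      # path of the hadoop binary the shell finds first
hadoop version    # version string of that installation
echo $PATH        # look for an older hadoop bin directory earlier in the list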

Solution:

  • Remove the entire bin path of the older Hadoop (see the PATH sketch after this list)

  • Shut down the cluster and exit the terminal

  • Log in to a new terminal session

  • Start up the nodes

  • Tried "hadoop dfs -ls /" -> works fine! Good lesson learnt.
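
A minimal sketch of the PATH cleanup in ~/.bashrc, assuming installation directories /usr/local/hadoop-1.0.3 and /usr/local/hadoop-1.0.4 (both paths are placeholders; the answer does not name them):

# remove (or comment out) the stale entry for the old release
# export PATH=$PATH:/usr/local/hadoop-1.0.3/bin

# keep only the current release on the PATH
export HADOOP_HOME=/usr/local/hadoop-1.0.4
export PATH=$PATH:$HADOOP_HOME/bin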

0
votes

Looks like many people ran into this problem.

There might be no need to change /etc/hosts. Make sure you can reach the master and the slaves from each other, and that core-site.xml is identical on every node and points to the right master node and port number.
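
For example, from any node (the hostname and port come from the question above; the commands themselves are my suggestion):

ping -c 3 master      # the hostname resolves and the host is reachable
nc -zv master 54310   # the NameNode RPC port accepts connections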

Then run $HADOOP/bin/stop-all.sh and $HADOOP/bin/start-all.sh on the master node ONLY (running them on a slave might lead to problems). Use jps to check whether all services are there, as follows.

On master node:

4353 DataNode
4640 JobTracker
4498 SecondaryNameNode
4788 TaskTracker
4989 Jps
4216 NameNode

On slave node:

3143 Jps
2827 DataNode
2960 TaskTracker
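
Put together, a restart-and-verify sequence on the master might look like this ($HADOOP is assumed to point at the Hadoop installation directory, as in the text above):

$HADOOP/bin/stop-all.sh     # stop the HDFS and MapReduce daemons on all nodes
$HADOOP/bin/start-all.sh    # start them again from the master
jps                         # list the Java daemons running on this node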

0
votes

In addition, check the firewall rules between the NameNode and the DataNodes.
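
On Ubuntu, a quick way to see whether a firewall is in the way (the ports mentioned come from the netstat output in the question; the commands are my suggestion):

sudo iptables -L -n   # list the current firewall rules
sudo ufw status       # if ufw is enabled, check that ports such as 54310 and 50010 are allowed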