
I first installed Hadoop 2.2 on my machine (called Abhishek-PC) and everything worked fine; I am able to run the entire system successfully (both the NameNode and the DataNode).

Now I have created a VM called hdclient1, and I want to add this VM as a data node.

Here are the steps I have followed:

  1. I set up SSH successfully: I can ssh into hdclient1 without a password, and I can log in from hdclient1 into my main machine without a password.
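
For reference, here is roughly how the passwordless SSH was set up (the default key path and the user name on the VM are assumptions):

ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa   # generate a key pair with no passphrase
ssh-copy-id abhishek@hdclient1             # install the public key on the VM
ssh abhishek@hdclient1 hostname            # should print "hdclient1" with no password prompt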

  2. I set up Hadoop 2.2 on this VM and modified the configuration files as per many tutorials on the web. Here are my configuration files:

Name Node configuration

https://drive.google.com/file/d/0B0dV2NMSGYPXdEM1WmRqVG5uYlU/edit?usp=sharing

Data Node configuration

https://drive.google.com/file/d/0B0dV2NMSGYPXRnh3YUo1X2Frams/edit?usp=sharing
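
In case those links become unreachable: the setting that matters most for a multi-node setup is fs.defaultFS in core-site.xml, which must point at the master's hostname rather than localhost. A minimal sketch (port 9000 is only the value most tutorials use, an assumption here):

<!-- core-site.xml, identical on Abhishek-PC and hdclient1 -->
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://Abhishek-PC:9000</value>
  </property>
</configuration>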

  3. Now when I run start-dfs.sh on my first machine, I can see that the DataNode starts successfully on hdclient1. Here is a screenshot from my Hadoop console.

https://drive.google.com/file/d/0B0dV2NMSGYPXOEJ3UV9SV1d5bjQ/edit?usp=sharing

As you can see, both machines appear in my cluster (the main machine and the data node), although both are called "localhost" for some strange reason.
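
One place where "localhost" can sneak in is the slaves file on the master, which should list real hostnames, one per line. A sketch of what mine is meant to contain:

# $HADOOP_HOME/etc/hadoop/slaves on Abhishek-PC
# a "localhost" entry here would start (and report) a datanode under that name
Abhishek-PC
hdclient1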

  4. I can see that the logs are being created on hdclient1, and in those logs there are no exceptions.

Here are the logs from the name node

https://drive.google.com/file/d/0B0dV2NMSGYPXM0dZTWVRUWlGaDg/edit?usp=sharing

Here are the logs from the data node

https://drive.google.com/file/d/0B0dV2NMSGYPXNV9wVmZEcUtKVXc/edit?usp=sharing

  5. I can log in to the NameNode UI successfully at http://Abhishek-PC:50070, but there, under live nodes, the UI says there is only 1 live node, and there is no mention of hdclient1.

https://drive.google.com/file/d/0B0dV2NMSGYPXZmMwM09YQlI4RzQ/edit?usp=sharing
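
To cross-check what the UI reports, the dfsadmin report lists every data node the NameNode actually knows about:

hadoop dfsadmin -report    # deprecated spelling, still works in 2.2
hdfs dfsadmin -report      # the 2.x form; hdclient1 should appear here if it registered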

  6. I can create a directory in HDFS successfully: hadoop fs -mkdir /small

  7. From the data node I can see that this directory has been created, using this command: hadoop fs -ls /

  8. Now when I try to add a file to my HDFS with

hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small

I get an error message:

abhishek@Abhishek-PC:~$ hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small
14/01/04 20:07:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/04 20:07:41 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /small/war_and_peace.txt.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
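
A quick sanity check from the VM side (port 9000 matches the core-site.xml sketch above and is an assumption; the log file name pattern depends on the install):

# on hdclient1: can this box reach the namenode's RPC port?
telnet Abhishek-PC 9000
# what did the datanode itself log about registering with the namenode?
tail -n 50 $HADOOP_HOME/logs/hadoop-*-datanode-*.log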

So my question is: what am I doing wrong here? Why do I get this exception when I try to copy the file into HDFS?

What does 'jps' show on each machine/VM? Can you ping all boxes from all boxes? What's the IP of the master/namenode on each box? It helps to have the same /etc/hosts on each machine. We have had a 3-node cluster functioning for a while and can provide settings tomorrow. - Vishal
Also, please try starting the services one by one ('hadoop-daemon.sh start namenode', 'hadoop-daemons.sh start datanode', 'yarn-daemon.sh start resourcemanager' and 'yarn-daemon.sh start nodemanager') and run 'jps' after each to check the status. - Vishal
On the first machine (namenode + datanode), jps shows: 20731 SecondaryNameNode, 21198 NodeManager, 20389 NameNode, 21057 ResourceManager, 21302 Jps, 20525 DataNode. On hdclient1 and hdclient2, jps shows: 4825 DataNode, 4960 NodeManager, 5115 Jps. Yes, I can ping all nodes from all places, and I can also ssh from all nodes to all other nodes. - Knows Not Much
Does your namenode machine have a firewall preventing the other datanodes from connecting to it? Do your data node logs from the newly added VMs show successful connection to the name node? - Chris White
There is no firewall. On my main machine, if I do hadoop fs -mkdir /foo, and then on my VMs (hdclient1, hdclient2) I do hadoop fs -ls /, I can see /foo being listed. So of course not only is the datanode being started there... Hadoop is working across machines. But now if I do hadoop dfsadmin -printTopology, it says there is only 1 data node, not 3 data nodes. - Knows Not Much
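
Vishal's step-by-step start, spelled out with the script names that ship in Hadoop 2.2's sbin directory (run jps after each command to check the status):

hadoop-daemon.sh start namenode         # on the master; expect NameNode in jps
hadoop-daemons.sh start datanode        # starts a DataNode on every host in the slaves file
yarn-daemon.sh start resourcemanager    # on the master
yarn-daemon.sh start nodemanager        # on each slave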

1 Answer


We have a 3-node cluster (all physical boxes) that has been working great for a couple of months. This article helped me the most when setting it up.
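
Given that your nodes show up as "localhost", one thing worth checking is that /etc/hosts is identical on every box and that no machine maps its own hostname to a loopback address. A sketch (the IP addresses are made up):

# /etc/hosts, the same on Abhishek-PC, hdclient1 and hdclient2
192.168.1.10   Abhishek-PC
192.168.1.11   hdclient1
192.168.1.12   hdclient2
# Ubuntu's default "127.0.1.1 <hostname>" line is a common reason a
# datanode registers with the namenode as "localhost"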