I first installed Hadoop 2.2 on my machine (called Abhishek-PC) and everything worked fine. I am able to run the entire system successfully (both namenode and datanode).
Now I have created one VM, hdclient1, and I want to add this VM as a data node.
Here are the steps I have followed:
I set up SSH successfully: I can ssh into hdclient1 without a password, and I can also log in from hdclient1 to my main machine without a password.
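For reference, the key exchange was done roughly like this (these are the usual commands; exact flags may differ on your distribution):

ssh-keygen -t rsa -P ""            # generate a key pair if one does not already exist
ssh-copy-id abhishek@hdclient1     # copy the public key to the VM
ssh abhishek@hdclient1 hostname    # sanity check: should print the VM's hostname without a password prompt
# the same steps were repeated in the other direction, from hdclient1 to Abhishek-PC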
I set up Hadoop 2.2 on this VM and modified the configuration files as per various tutorials on the web. Here are my configuration files:
Name Node configuration
https://drive.google.com/file/d/0B0dV2NMSGYPXdEM1WmRqVG5uYlU/edit?usp=sharing
Data Node configuration
https://drive.google.com/file/d/0B0dV2NMSGYPXRnh3YUo1X2Frams/edit?usp=sharing
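In case the links above are unreachable, the relevant parts of the configuration are roughly as follows (these values are illustrative and may not match the linked files exactly):

core-site.xml (on both machines):

<property>
  <name>fs.defaultFS</name>
  <value>hdfs://Abhishek-PC:9000</value>
</property>

slaves file on the name node (etc/hadoop/slaves):

Abhishek-PC
hdclient1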
- Now when I run start-dfs.sh on my first machine, I can see that the DataNode starts successfully on hdclient1. Here is a screenshot from my Hadoop console:
https://drive.google.com/file/d/0B0dV2NMSGYPXOEJ3UV9SV1d5bjQ/edit?usp=sharing
As you can see, both machines appear in my cluster (the main machine and the data node), although both are called "localhost" for some strange reason.
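I have not verified whether this is related, but for context, name resolution on the machines looks roughly like this (the IP addresses below are placeholders, not my real ones):

hostname           # prints Abhishek-PC on the main machine and hdclient1 on the VM
cat /etc/hosts
# 127.0.0.1      localhost
# 192.168.56.1   Abhishek-PC
# 192.168.56.2   hdclient1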
- I can see that the logs are being created on hdclient1, and in those logs there are no exceptions.
Here are the logs from the name node:
https://drive.google.com/file/d/0B0dV2NMSGYPXM0dZTWVRUWlGaDg/edit?usp=sharing
Here are the logs from the data node
https://drive.google.com/file/d/0B0dV2NMSGYPXNV9wVmZEcUtKVXc/edit?usp=sharing
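For completeness, this is roughly how I checked the data node log on hdclient1 (the exact file name depends on the user and hostname):

tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log
grep -i exception $HADOOP_HOME/logs/hadoop-*-datanode-*.log   # prints nothing, i.e. no exceptions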
- I can log in to the namenode UI successfully:
http://Abhishek-PC:50070
but here, under Live Nodes, the UI says there is only 1 live node, and there is no mention of hdclient1.
https://drive.google.com/file/d/0B0dV2NMSGYPXZmMwM09YQlI4RzQ/edit?usp=sharing
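To double-check what the web UI shows, the same information can also be queried from the command line; something like this should list the live and dead data nodes (I am only pasting the command, not the full output):

hdfs dfsadmin -report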
I can create a directory in HDFS successfully:

hadoop fs -mkdir /small

From the datanode I can see that this directory has been created by using this command:

hadoop fs -ls /

Now when I try to add a file to my HDFS with

hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small

I get an error message:
abhishek@Abhishek-PC:~$ hadoop fs -copyFromLocal ~/Downloads/book/war_and_peace.txt /small
14/01/04 20:07:41 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/04 20:07:41 WARN hdfs.DFSClient: DataStreamer Exception
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /small/war_and_peace.txt.COPYING could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1384)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2477)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:555)
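Since the exception mentions minReplication (=1), for reference this is how the configured replication factor can be checked (what it prints depends on hdfs-site.xml, so your value may differ):

hdfs getconf -confKey dfs.replication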
So my question is: what am I doing wrong here? Why do I get this exception when I try to copy the file into HDFS?
Update (from the comments): On my main machine, jps shows

20731 SecondaryNameNode
21198 NodeManager
20389 NameNode
21057 ResourceManager
21302 Jps
20525 DataNode

and on hdclient1 and hdclient2, jps results in

4825 DataNode
4960 NodeManager
5115 Jps

Yes, I can ping all nodes from all places, and I can also ssh from all nodes to all other nodes.

If I do

hadoop fs -mkdir /foo

and then on my VMs (hdclient1, hdclient2) I do a

hadoop fs -ls /

I can see that foo being listed. So of course, not only is the datanode being started there; Hadoop is working across machines. But now if I do a

hadoop dfsadmin -printTopology

it says there is only 1 data node and not 3 data nodes.