Hope you all had a wonderful vacation. I am trying to set up a Hadoop cluster on Amazon EC2. While copying a data file from the local disk to HDFS with the command hadoop fs -copyFromLocal d.txt /user/ubuntu/data, I am getting a data replication error. The error from the log is as follows:
15/01/06 07:40:36 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null
15/01/06 07:40:36 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/ubuntu/data/d.txt" - Aborting...
copyFromLocal: java.io.IOException: File /user/ubuntu/data/d.txt could only be replicated to 0 nodes, instead of 1
15/01/06 07:40:36 ERROR hdfs.DFSClient: Failed to close file /user/ubuntu/data/d.txt
Now, I have been checking Stack Overflow and other forums about this problem, and most of them point to the DataNode or TaskTracker not running as the probable cause, along with the corresponding fixes. But those daemons are running fine in my setup. Here is a screenshot of the jps output: http://i.imgur.com/vS6kRPP.png
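Since jps only shows that the processes are alive, not that the NameNode has actually registered any DataNodes, I can also check what the NameNode itself sees (Hadoop 1.x command; as I understand it, "Datanodes available: 0" in the report would explain the replication error):

# Ask the NameNode for its view of the cluster; the report lists
# the number of live DataNodes and their capacity.
hadoop dfsadmin -report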
From the Hadoop Wiki, the other possible causes are the DataNode not being able to talk to the NameNode due to networking or Hadoop configuration problems, or some configuration issue preventing effective two-way communication.
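To rule out the networking cause on EC2, I can test from each slave whether the NameNode's RPC port is reachable. This assumes my fs.default.name uses port 9000, and the hostname below is just a placeholder for my master's private DNS name:

# Run on a DataNode machine: check TCP connectivity to the NameNode RPC port.
# A timeout here usually means the EC2 security group is blocking the port.
nc -zv ip-10-0-0-1.ec2.internal 9000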
I have configured hadoop-env.sh, core-site.xml, hdfs-site.xml and mapred-site.xml following the tutorial http://tinyurl.com/l2wv6y9 ; I have pasted the relevant parts of my configuration below. Could anyone please tell me where I am going wrong? I would be immensely grateful if anyone could help me resolve the problem.
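For reference, here is roughly the shape of my core-site.xml and hdfs-site.xml after following the tutorial. The hostname is a placeholder for my master's private DNS name and dfs.replication is set to 1 as the tutorial suggests, so please treat the exact values as assumptions:

<!-- core-site.xml: every node must point at the same NameNode address -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://ip-10-0-0-1.ec2.internal:9000</value>
  </property>
</configuration>

<!-- hdfs-site.xml: replication factor 1 for a single-DataNode test cluster -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>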
Thanks,