Hadoop warning and error while copying to HDFS on Amazon Aws EC2

Question

I am trying to setup Hadoop cluster on Amazon EC2. I was able to format namenode. The output of 'jps' command is following in the namenode:

5826 NameNode 6312 JobTracker 6192 SecondaryNameNode 8266 Jps

and in the slave node:

5452 DataNode 5699 Jps 5609 TaskTracker

So, I assume hadoop is running. While I am trying to copy data from local directory to HDFS with the command

hadoop fs -mkdir /user/ubuntu/clusters

hadoop fs -copyFromLocal clusters /user/ubuntu/clusters

I am getting data replication warning and other errors. The log is following

14/11/18 18:41:12 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/clusters/clusters could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

    at org.apache.hadoop.ipc.Client.call(Client.java:1113)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)

14/11/18 18:41:12 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null 14/11/18 18:41:12 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/ubuntu/clusters/clusters" - Aborting... copyFromLocal: java.io.IOException: File /user/ubuntu/clusters/clusters could only be replicated to 0 nodes, instead of 1 14/11/18 18:41:12 ERROR hdfs.DFSClient: Failed to close file /user/ubuntu/clusters/clusters org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/ubuntu/clusters/clusters could only be replicated to 0 nodes, instead of 1 at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1920) at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:783) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:587) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1432) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1428) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:415) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1426)

    at org.apache.hadoop.ipc.Client.call(Client.java:1113)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
    at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3720)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:3580)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2600(DFSClient.java:2783)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:3023)

Any help will be highly appreciated. Thanks.

Seems like your data node is running, but isn't talking to the namenode. Make sure that fs.default.name is set to your namenode address on your datanodes. — xinit
Thanks for your reply. Could you please tell me a little more about how can I set fs.default.name given I have multi-instance hadoop setup. I have done the password-less SSH setup and I can access one instance from another. And using $start-all.sh command from Namenode, I was able to run the processes in Slavenode. — Rajiur Rahman
Have you read wiki.apache.org/hadoop/GettingStartedWithHadoop ? Even though the page says it's valid for Hadoop 1.0, it still should work for HDFS 2. Make sure that the settings in hdfs-site.xml are correct. — xinit

Mr.Chowdary Mr.Chowdary · Accepted Answer · 2014-11-20T01:55:31

One of the best step by step guide to create multinode cluster in Amazon EC2 is Here

It explains each and every step. You are already done with first part seems, Go through the second part which will help you..
Hope it helps you..

Hadoop warning and error while copying to HDFS on Amazon Aws EC2

1 Answers