I think I must have some misunderstanding about the DataNodes in a Hadoop cluster. I have a virtual Hadoop cluster composed of master, slave1, slave2, and slave3. Master and slave1 run on one physical machine, while slave2 and slave3 run on another physical machine. When I start the cluster, the HDFS web UI shows only three live DataNodes: master, slave1, and slave2. But sometimes the three live DataNodes are master, slave1, and slave3, which is strange. When I ssh to the node whose DataNode did not start, jps shows no DataNode process, yet I can still copy and delete files on HDFS from that node. So I believe I must not understand DataNodes correctly. I have three questions:

1. Is there one DataNode per node?
2. Why can a node that is not running a DataNode still read and write on HDFS?
3. Can we decide the number of DataNodes?
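To make question 2 concrete, this is the kind of access that still works from that node. It is just a minimal sketch using the standard org.apache.hadoop.fs.FileSystem client API; the NameNode address matches the master:54310 seen in the log below, and the file paths are made up for illustration:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsAccessSketch {
    public static void main(String[] args) throws Exception {
        // The HDFS client only needs to reach the NameNode (and whatever
        // DataNodes hold the blocks) over the network; it does not need
        // a DataNode process running on the local machine.
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://master:54310"); // same NameNode address as in the log below

        FileSystem fs = FileSystem.get(conf);

        // Copy a local file into HDFS and then delete it again (illustrative paths).
        fs.copyFromLocalFile(new Path("/tmp/sample.txt"), new Path("/user/hadoop/sample.txt"));
        fs.delete(new Path("/user/hadoop/sample.txt"), false);

        fs.close();
    }
}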
Here is the log of the DataNode that did not start:
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = slave11/192.168.111.31
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.3
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012
************************************************************/
2012-08-03 17:47:07,578 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-08-03 17:47:07,595 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-08-03 17:47:07,596 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-08-03 17:47:07,596 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-08-03 17:47:07,911 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-08-03 17:47:07,915 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-08-03 17:47:09,457 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.111.21:54310. Already tried 0 time(s).
2012-08-03 17:47:10,460 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.111.21:54310. Already tried 1 time(s).
2012-08-03 17:47:11,464 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: master/192.168.111.21:54310. Already tried 2 time(s).
2012-08-03 17:47:19,565 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Registered FSDatasetStatusMBean
2012-08-03 17:47:19,601 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Opened info server at 50010
2012-08-03 17:47:19,620 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Balancing bandwith is 1048576 bytes/s
2012-08-03 17:47:24,721 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2012-08-03 17:47:24,854 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2012-08-03 17:47:24,952 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dfs.webhdfs.enabled = false
2012-08-03 17:47:24,953 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50075
2012-08-03 17:47:24,953 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50075 webServer.getConnectors()[0].getLocalPort() returned 50075
2012-08-03 17:47:24,953 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50075
2012-08-03 17:47:24,953 INFO org.mortbay.log: jetty-6.1.26
2012-08-03 17:47:25,665 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:50075
2012-08-03 17:47:25,688 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2012-08-03 17:47:25,690 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source DataNode registered.
2012-08-03 17:47:30,717 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2012-08-03 17:47:30,718 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort50020 registered.
2012-08-03 17:47:30,718 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort50020 registered.
2012-08-03 17:47:30,721 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: dnRegistration = DatanodeRegistration(slave11:50010, storageID=DS-1062340636-127.0.0.1-50010-1339803955209, infoPort=50075, ipcPort=50020)
2012-08-03 17:47:30,764 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting asynchronous block report scan
2012-08-03 17:47:30,766 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.111.31:50010, storageID=DS-1062340636-127.0.0.1-50010-1339803955209, infoPort=50075, ipcPort=50020)In DataNode.run, data = FSDataset{dirpath='/app/hadoop/tmp/dfs/data/current'}
2012-08-03 17:47:30,774 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: using BLOCKREPORT_INTERVAL of 3600000msec Initial delay: 0msec
2012-08-03 17:47:30,778 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: starting
2012-08-03 17:47:30,772 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2012-08-03 17:47:30,773 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 50020: starting
2012-08-03 17:47:30,773 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: starting
2012-08-03 17:47:30,773 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: starting
2012-08-03 17:47:30,795 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Starting Periodic block scanner.
2012-08-03 17:47:30,816 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Finished asynchronous block report scan in 52ms
2012-08-03 17:47:30,838 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Generated rough (lockless) block report in 32 ms
2012-08-03 17:47:30,840 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reconciled asynchronous block report against current state in 2 ms
2012-08-03 17:47:31,158 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Verification succeeded for blk_-6072482390929551157_78209
2012-08-03 17:47:33,775 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Reconciled asynchronous block report against current state in 1 ms
2012-08-03 17:47:33,793 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DataNode is shutting down: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.hdfs.protocol.UnregisteredDatanodeException: Data node 192.168.111.31:50010 is attempting to report storage ID DS-1062340636-127.0.0.1-50010-1339803955209. Node 192.168.111.32:50010 is expected to serve this storage.
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getDatanode(FSNamesystem.java:4608)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.processReport(FSNamesystem.java:3460)
at org.apache.hadoop.hdfs.server.namenode.NameNode.blockReport(NameNode.java:1001)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)
at org.apache.hadoop.ipc.Client.call(Client.java:1070)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy5.blockReport(Unknown Source)
at org.apache.hadoop.hdfs.server.datanode.DataNode.offerService(DataNode.java:958)
at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1458)
at java.lang.Thread.run(Thread.java:636)
2012-08-03 17:47:33,873 INFO org.mortbay.log: Stopped SelectChannelConnector@0.0.0.0:50075
2012-08-03 17:47:33,980 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 50020: exiting
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 50020: exiting
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 50020: exiting
2012-08-03 17:47:33,981 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2012-08-03 17:47:33,982 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.111.31:50010, storageID=DS-1062340636-127.0.0.1-50010-1339803955209, infoPort=50075, ipcPort=50020):DataXceiveServer:java.nio.channels.AsynchronousCloseException
at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:170)
at sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:102)
at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:131)
at java.lang.Thread.run(Thread.java:636)
2012-08-03 17:47:33,982 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 50020
2012-08-03 17:47:33,982 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting DataXceiveServer
2012-08-03 17:47:33,983 INFO org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2012-08-03 17:47:33,982 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 1
2012-08-03 17:47:33,984 INFO org.apache.hadoop.hdfs.server.datanode.DataBlockScanner: Exiting DataBlockScanner thread.
2012-08-03 17:47:33,985 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: Shutting down all async disk service threads...
2012-08-03 17:47:33,985 INFO org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: All async disk service threads have been shut down.
2012-08-03 17:47:33,985 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: DatanodeRegistration(192.168.111.31:50010, storageID=DS-1062340636-127.0.0.1-50010-1339803955209, infoPort=50075, ipcPort=50020):Finishing DataNode in: FSDataset{dirpath='/app/hadoop/tmp/dfs/data/current'}
2012-08-03 17:47:33,987 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=DataNodeInfo
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=DataNodeInfo
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1118)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:433)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:421)
at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:540)
at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
at org.apache.hadoop.hdfs.server.datanode.DataNode.unRegisterMXBean(DataNode.java:522)
at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:737)
at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1471)
at java.lang.Thread.run(Thread.java:636)
2012-08-03 17:47:33,988 INFO org.apache.hadoop.ipc.Server: Stopping server on 50020
2012-08-03 17:47:33,988 INFO org.apache.hadoop.ipc.metrics.RpcInstrumentation: shut down
2012-08-03 17:47:33,988 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Waiting for threadgroup to exit, active threads is 0
2012-08-03 17:47:33,988 WARN org.apache.hadoop.metrics2.util.MBeans: Hadoop:service=DataNode,name=FSDatasetState-DS-1062340636-127.0.0.1-50010-1339803955209
javax.management.InstanceNotFoundException: Hadoop:service=DataNode,name=FSDatasetState-DS-1062340636-127.0.0.1-50010-1339803955209
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1118)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.exclusiveUnregisterMBean(DefaultMBeanServerInterceptor.java:433)
at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.unregisterMBean(DefaultMBeanServerInterceptor.java:421)
at com.sun.jmx.mbeanserver.JmxMBeanServer.unregisterMBean(JmxMBeanServer.java:540)
at org.apache.hadoop.metrics2.util.MBeans.unregister(MBeans.java:71)
at org.apache.hadoop.hdfs.server.datanode.FSDataset.shutdown(FSDataset.java:2067)
at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdown(DataNode.java:799)
at org.apache.hadoop.hdfs.server.datanode.DataNode.run(DataNode.java:1471)
at java.lang.Thread.run(Thread.java:636)
2012-08-03 17:47:33,988 WARN org.apache.hadoop.hdfs.server.datanode.FSDatasetAsyncDiskService: AsyncDiskService has already shut down.
2012-08-03 17:47:33,989 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode