When our standby namenode (Hadoop 2.4.1) restarted after a failure, we found that it was too busy to respond in time after it left safemode. We dumped several stack traces, and all of them looked like this:
Thread 212 (IPC Server handler 148 on 8020):
  State: WAITING
  Blocked count: 66
  Waited count: 598
  Waiting on java.util.concurrent.locks.ReentrantLock$FairSync@60ea5634
  Stack:
    sun.misc.Unsafe.park(Native Method)
    java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:867)
    java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1197)
    java.util.concurrent.locks.ReentrantLock$FairSync.lock(ReentrantLock.java:229)
    java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
    org.apache.hadoop.hdfs.server.namenode.FSNamesystem.writeLock(FSNamesystem.java:1378)
    org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1676)
    org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:1019)
    org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.blockReport(DatanodeProtocolServerSideTranslatorPB.java:152)
    org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:28061)
    org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
    org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
    org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
    org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
    java.security.AccessController.doPrivileged(Native Method)
    javax.security.auth.Subject.doAs(Subject.java:415)
    org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1556)
    org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Almost all of the server handlers are waiting on the FSNamesystem write lock in order to process incremental/full block reports!
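To illustrate why the handlers pile up: every blockReport RPC has to take the same exclusive namesystem write lock before it can touch the block map, so with a fair lock (the ReentrantLock$FairSync in the stack above) the handlers simply queue behind whichever report is currently being processed. Below is a minimal standalone sketch of that pattern, not the actual Hadoop code; the class name, the thread and node counts, and the 750 ms sleep are just stand-ins for our numbers:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Minimal sketch of the serialization in the stack dump above: many IPC handler
// threads, one fair write lock, one block report processed at a time.
public class WriteLockQueueDemo {
    // The write lock in the stack is fair (ReentrantLock$FairSync),
    // so waiting handlers are served in FIFO order.
    private static final ReentrantLock writeLock = new ReentrantLock(true);

    public static void main(String[] args) throws InterruptedException {
        ExecutorService handlers = Executors.newFixedThreadPool(200); // "server handlers"
        for (int dn = 0; dn < 400; dn++) {                            // one report per datanode
            final int node = dn;
            handlers.submit(new Runnable() {
                public void run() {
                    long waitStart = System.currentTimeMillis();
                    writeLock.lock();                 // every report serializes here
                    try {
                        Thread.sleep(750);            // stand-in for 0.5-1 s of processing
                    } catch (InterruptedException ignored) {
                        Thread.currentThread().interrupt();
                    } finally {
                        writeLock.unlock();
                    }
                    System.out.println("report from node " + node + " waited "
                            + (System.currentTimeMillis() - waitStart) + " ms for the lock");
                }
            });
        }
        handlers.shutdown();
        handlers.awaitTermination(10, TimeUnit.MINUTES);
    }
}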
Conf:
dfs.blockreport.initialDelay: 120 (seconds)
dfs.blockreport.intervalMsec: 21600000 (6 hours)
Number of server handlers: 200
Number of datanodes: 400
The namenode takes 0.5~1s to process a block report.
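A rough back-of-the-envelope estimate (our own, assuming the initial reports are spread roughly evenly over the initialDelay window): 400 datanodes reporting within 120 s, at 0.5-1 s of exclusive write-lock time each, is 200-400 s of strictly serialized processing, roughly two to three times longer than the window in which the reports arrive, so the handler queue can only grow and datanode RPCs start timing out.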
The datanode logs show a lot of offerService IOExceptions and retries.
The NN log indicated that the block report from a single datanode storage had been processed more than twice, in some cases up to 10 times, judging by this log line from BlockManager#processReport:
blockLog.info("BLOCK* processReport: from storage " + storage.getStorageID()
+ " node " + nodeID + ", blocks: " + newReport.getNumberOfBlocks()
+ ", processing time: " + (endTime - startTime) + " msecs");
Has anyone seen the same problem? Any ideas?