I have a 3-node Hadoop cluster (Apache Hadoop 2.8.0). I have deployed 2 namenodes configured in HA mode using QJM. The 2 datanodes run on the same machines as the namenodes. The 3rd node is used for quorum purposes only.
Setup
Node1 -> {nn1, dn1, jn1, zkfc1, zkServer1}
Node2 -> {nn2, dn2, jn2, zkfc2, zkServer2}
Node3 -> {jn3, zkServer3}
I had to stop the cluster (the servers were power-cycled), and since then I have not been able to start it successfully. After examining the logs I found that both namenodes are in safe mode and neither of them is able to load the blocks into memory. Below is the namenode status from the namenode UI.
Safe mode is ON. The reported blocks 0 needs additional 6132675 blocks to reach the threshold 0.9990 of total blocks 6138814. The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached. 61,56,984 files and directories, 61,38,814 blocks = 1,22,95,798 total filesystem object(s). Heap Memory used 5.6 GB of 7.12 GB Heap Memory. Max Heap Memory is 13.33 GB. Non Heap Memory used 45.19 MB of 49.75 MB Committed Non Heap Memory. Max Non Heap Memory is 130 MB.
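As a sanity check on the numbers in that message, here is a minimal Python sketch of the arithmetic. It assumes the required block count is the total multiplied by the threshold and truncated to an integer, which is an assumption on my part that happens to reproduce the figure shown in the UI; the function name `blocks_needed` is made up for illustration. With 0 live datanodes the reported count is 0, so the whole required amount is still outstanding.

```python
def blocks_needed(total_blocks, threshold, reported_blocks):
    """Return how many more reported blocks are needed to leave safe mode.

    Assumption: required = int(total_blocks * threshold), i.e. truncation.
    """
    required = int(total_blocks * threshold)
    return max(required - reported_blocks, 0)


# Values taken from the namenode UI message above.
total = 6138814
threshold = 0.999
reported = 0  # no datanode has reported any block yet

print(blocks_needed(total, threshold, reported))  # prints 6132675
```

The result matches the "needs additional 6132675 blocks" part of the message, so the count will only drop once the datanodes register and send their block reports.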
There were numerous JVM pause messages in the namenode logs, so I tried increasing HADOOP_HEAPSIZE and the heap size in HADOOP_NAMENODE_OPTS, but with no success.
Any help would be appreciated.