I have one 3 node hbase cluster running on amazon Ec2. Which is working perfectly fine. Now, I try to insert the data from EMR to EC2 using two separate insert queries. So first insert query works perfectly fine and insert the data and after that all of my region servers become dead. So, could you please suggest me general guidelines to debug this problem and why generally region servers become dead? Moreover, even i explicitly start the region servers after sometime again they become dead.
Update question :
Earlier i was thinking it might be a problem due to HBASE_HEAPSIZE
which is by default set to 1GB. But i also increased that to 5.5 Gb still region servers are becoming dead.
Below is the logs which i am getting on every region server after they are dead.
2013-10-07 18:16:27,949 WARN org.apache.zookeeper.ClientCnxn: Session 0x141916dfbe50000 for server null, unexpected error, closing socket connection and attempting rec$
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:597)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
2013-10-07 18:16:27,990 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/10.179.42.93:50020. Already tried 1 time(s).
2013-10-07 18:16:28,049 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server master/10.179.42.93:2181
2013-10-07 18:16:28,049 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client'$
2013-10-07 18:16:28,049 WARN org.apache.zookeeper.ClientCnxn: Session 0x141916dfbe50001 for server null, unexpected error, closing socket connection and attempting rec$
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:597)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)
2013-10-07 18:16:28,177 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server slave/10.178.5.52:2181
2013-10-07 18:16:28,177 INFO org.apache.zookeeper.client.ZooKeeperSaslClient: Client will not SASL-authenticate because the default JAAS configuration section 'Client'$
2013-10-07 18:16:28,178 WARN org.apache.zookeeper.ClientCnxn: Session 0x141916dfbe50001 for server null, unexpected error, closing socket connection and attempting rec$
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:597)
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:286)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)