I have an asynchronous application that saves data to HBase using async-hbase-client.

My HBase version is 1.0.0-cdh5.6.0 and my async HBase client version is 0.9.0.

The application ran fine for a while (about 4 or 5 days), but over the weekend it started failing with the following exception:

org.apache.hadoop.hbase.NotServingRegionException: Region pageviews,,1463568860289.298bb29bbd148a0a62ec90885ef8d027. is not online on //some address here
        at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2786)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:922)
        at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:1965)
        at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32203)
        at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2034)
        at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)
        at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
        at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
        at java.lang.Thread.run(Thread.java:745)

What I did:

  • I verified that HBase itself is functioning by going into the HBase shell and working with some tables there, which worked fine.
  • I logged into the ZooKeeper shell by running ./hbase zkcli and then ran rmr /hbase/root-region-server, following this link: http://rogueleaderr.com/post/32963921889/hbase-error-region-is-not-online-root-0. This failed with the error: Node does not exist: /hbase/root-region-server. While trying to resolve that, I came across this suggested solution: HBase: /hbase/meta-region-server node does not exist. So instead I ran rmr /hbase/meta-region-server, which didn't appear to do anything (at least it didn't print anything).
  • I looked for other solutions. Some suggested that it has to do with incompatible HBase versions, which sounds plausible, but the application did work for several days without any issues, so I'm wondering what exactly the problem is.

If anyone has any ideas about what exactly the problem is here, I'd appreciate it. Currently I'm kind of in the dark.

Thanks

Were you able to fix this? Did the answer below help? - Ram Ghadiyaram
I just saw the answer :) so I haven't had time to check it out. I'll obviously update after I try it. - Gideon

1 Answer

It seems that one particular table and its region got corrupted (you are able to access other tables from the HBase shell, as you described). Please try running hbase hbck on that specific table name, which may fix this.
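A minimal sketch of what the hbck step could look like for the pageviews table from the exception. The function names and the HBCK override variable are hypothetical helpers introduced here for illustration; -details and -fixAssignments are hbck options in HBase 1.x. Note that hbck without a fix option only reports inconsistencies, so it is worth reading the report before attempting any repair.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of HBase).
# HBCK can be overridden to preview the commands without touching the
# cluster, e.g. HBCK="echo hbase hbck". It is expanded unquoted on
# purpose so that "hbase hbck" splits into a command and its argument.

# Read-only consistency report, restricted to a single table.
hbck_report() {
    ${HBCK:-hbase hbck} -details "$1"
}

# Attempt to repair region assignments for a single table.
# Only run this after the report shows unassigned or offline regions.
hbck_fix_assignments() {
    ${HBCK:-hbase hbck} -fixAssignments "$1"
}
```

For example, hbck_report pageviews prints the report for that table only, and hbck_fix_assignments pageviews can follow if the report lists the region as not deployed on any region server.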

Another option: increase the number of threads available for opening regions, so that meta regions can still be assigned even while the threads for a local index table are waiting. This removes the deadlock.

<property>
  <name>hbase.regionserver.executor.openregion.threads</name>
  <value>100</value>
</property>