
Error while creating a table in HBase: "ERROR: java.io.IOException: Table Namespace Manager not ready yet, try again later."

hbase hbck -fix shows "ERROR: hbase:meta is not found on any region."

The error appeared after a fresh start of an hbase shell session. No errors were reported in the master log during startup. The last HBase session was closed properly, but ZooKeeper was not (I suspect this is the reason for the meta table corruption).
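Since the unclean ZooKeeper shutdown is only a suspicion, one way to narrow things down is to scan the master log for namespace/meta initialization and region-in-transition messages. A minimal sketch; `scan_master_log` is a hypothetical helper, and the example log path is install-specific:

```shell
# scan_master_log LOGFILE: print the most recent lines about namespace
# setup, hbase:meta, or regions in transition from an HMaster log.
scan_master_log() {
  grep -E 'NamespaceManager|hbase:meta|transition' "$1" | tail -n 20
}

# Example (adjust the path to your installation):
# scan_master_log /usr/local/hbase/logs/hbase-hduser-master-master.log
```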

I am able to list the tables created earlier:

hbase(main):001:0> list
TABLE
IDX_STOCK_SYMBOL
Patient
STOCK_SYMBOL
STOCK_SYMBOL_BKP
SYSTEM.CATALOG
SYSTEM.FUNCTION
SYSTEM.SEQUENCE
SYSTEM.STATS
8 row(s) in 1.7930 seconds

Creating a table named custmaster fails:

    hbase(main):002:0> create 'custmaster', 'customer'

    ERROR: java.io.IOException: Table Namespace Manager not ready yet, try 
    again later
    at org.apache.hadoop.hbase.master.HMaster.getNamespaceDescriptor(HMaster.java:3179)
    at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1735)
    at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1774)
    at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:40470)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
    at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Workaround attempt: running hbck to identify inconsistencies

    [hduser@master ~]$ hbase hbck
    >Version: 0.98.4-hadoop2
    >Number of live region servers: 2
    >Number of dead region servers: 0
    >Master: master,60000,1538793456542
    >Number of backup masters: 0
    >Average load: 0.0
    >Number of requests: 11
    >Number of regions: 0
    >Number of regions in transition: 1
    >
    >ERROR: META region or some of its attributes are null.
    >ERROR: hbase:meta is not found on any region.
    >ERROR: hbase:meta table is not consistent. Run HBCK with proper fix options to fix hbase:meta inconsistency. Exiting...
    .
    .
    .
    >Summary:
    >3 inconsistencies detected.
    >Status: INCONSISTENT

Ran hbck with the -details option to identify the tables involved:

    [hduser@master ~]$ hbase hbck -details
    >ERROR: META region or some of its attributes are null.
    >ERROR: hbase:meta is not found on any region.
    >ERROR: hbase:meta table is not consistent. Run HBCK with proper fix options to fix hbase:meta inconsistency. Exiting...
    >Summary:
    >3 inconsistencies detected.
    >Status: INCONSISTENT

The output of -details clearly shows that hbase:meta is not found on any region.

I tried running hbase hbck -fixMeta, but it returned the same result as above, so I then tried hbase hbck -fix.

This command ran for some time with the prompt "Trying to fix a problem with hbase:meta.." and resulted in the error below:

    [hduser@master ~]$ hbase hbck -fix

    Version: 0.98.4-hadoop2
    Number of live region servers: 2
    Number of dead region servers: 0
    Master: master,60000,1538793456542
    Number of backup masters: 0
    Average load: 0.0
    Number of requests: 19
    Number of regions: 0
    Number of regions in transition: 1
    ERROR: META region or some of its attributes are null.
    ERROR: hbase:meta is not found on any region.
    Trying to fix a problem with hbase:meta..
    2018-10-06 09:01:03,424 INFO  [main] client.HConnectionManager$HConnectionImplementation: Closing master protocol: MasterService
    2018-10-06 09:01:03,425 INFO  [main] client.HConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x166473bbe720005
    2018-10-06 09:01:03,432 INFO  [main] zookeeper.ZooKeeper: Session: 0x166473bbe720005 closed
    2018-10-06 09:01:03,432 INFO  [main-EventThread] zookeeper.ClientCnxn: EventThread shut down
    Exception in thread "main" org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=35, exceptions:
    Sat Oct 06 08:52:13 IST 2018, org.apache.hadoop.hbase.client.RpcRetryingCaller@18920cc, org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException): org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
            at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2416)
            at org.apache.hadoop.hbase.master.HMaster.assignRegion(HMaster.java:2472)
            at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:40456)
            at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
            at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
            at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)

    Sat Oct 06 08:52:13 IST 2018, org.apache.hadoop.hbase.client.RpcRetryingCaller@18920cc, org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.PleaseHoldException): org.apache.hadoop.hbase.PleaseHoldException: Master is initializing
            at org.apache.hadoop.hbase.master.HMaster.checkInitialized(HMaster.java:2416)
            at org.apache.hadoop.hbase.master.HMaster.assignRegion(HMaster.java:2472)
            at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:40456)
            at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2027)
            at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:98)
            at org.apache.hadoop.hbase.ipc.FifoRpcScheduler$1.run(FifoRpcScheduler.java:74)
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
            at java.util.concurrent.FutureTask.run(FutureTask.java:266)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
            at java.lang.Thread.run(Thread.java:748)

How can I resolve this issue? Thanks in advance!


1 Answer


I had not checked the NameNode and DataNode logs. When I did check, the real issue turned out to be corrupt files in HDFS.

Ran hadoop fsck / to check the health of the file system:

    [hduser@master ~]$ hadoop fsck /
    DEPRECATED: Use of this script to execute hdfs command is deprecated.
    Instead use the hdfs command for it.

    18/10/06 09:52:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Connecting to namenode via http://master:50070/fsck?ugi=hduser&path=%2F
    FSCK started by hduser (auth:SIMPLE) from /192.168.1.11 for path / at Sat Oct 06 09:52:02 IST 2018
    .................................................................................
    /user/hduser/hbase/.hbck/hbase-1538798774320/data/hbase/meta/1588230740/info/359783d4cd07419598264506bac92dcf: CORRUPT blockpool BP-1664228054-192.168.1.11-1535828595216 block blk_1073744002

    /user/hduser/hbase/.hbck/hbase-1538798774320/data/hbase/meta/1588230740/info/359783d4cd07419598264506bac92dcf: MISSING 1 blocks of total size 3934 B.........
    /user/hduser/hbase/data/default/IDX_STOCK_SYMBOL/a27db76f84487a05f3e1b8b74c13fa78/0/c595bf49443f4daf952df6cdaad79181: CORRUPT blockpool BP-1664228054-192.168.1.11-1535828595216 block blk_1073744000

    /user/hduser/hbase/data/default/IDX_STOCK_SYMBOL/a27db76f84487a05f3e1b8b74c13fa78/0/c595bf49443f4daf952df6cdaad79181: MISSING 1 blocks of total size 1354 B............
    /user/hduser/hbase/data/default/SYSTEM.CATALOG/d63574fdd00e8bf3882fcb6bd53c3d83/0/dcb68bbb5e394d19b06db7f298810de0: CORRUPT blockpool BP-1664228054-192.168.1.11-1535828595216 block blk_1073744001

    /user/hduser/hbase/data/default/SYSTEM.CATALOG/d63574fdd00e8bf3882fcb6bd53c3d83/0/dcb68bbb5e394d19b06db7f298810de0: MISSING 1 blocks of total size 2283 B...........................
    Status: CORRUPT
     Total size:    4232998 B
     Total dirs:    109
     Total files:   129
     Total symlinks:                0
     Total blocks (validated):      125 (avg. block size 33863 B)
      ********************************
      UNDER MIN REPL'D BLOCKS:      3 (2.4 %)
      dfs.namenode.replication.min: 1
      CORRUPT FILES:        3
      MISSING BLOCKS:       3
      MISSING SIZE:         7571 B
      CORRUPT BLOCKS:       3
      ********************************
     Minimally replicated blocks:   122 (97.6 %)
     Over-replicated blocks:        0 (0.0 %)
     Under-replicated blocks:       0 (0.0 %)
     Mis-replicated blocks:         0 (0.0 %)
     Default replication factor:    2
     Average block replication:     1.952
     Corrupt blocks:                3
     Missing replicas:              0 (0.0 %)
     Number of data-nodes:          2
     Number of racks:               1
    FSCK ended at Sat Oct 06 09:52:02 IST 2018 in 66 milliseconds


    The filesystem under path '/' is CORRUPT
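
For scripting, the corrupt paths can be pulled out of a saved fsck report. A small sketch; `corrupt_paths` is a hypothetical helper, and it assumes any paths that were wrapped by the terminal have been re-joined onto one line:

```shell
# corrupt_paths REPORT: list each file fsck flagged as CORRUPT, once.
corrupt_paths() {
  grep 'CORRUPT blockpool' "$1" | cut -d: -f1 | sort -u
}

# Example: hadoop fsck / > fsck.out 2>&1; corrupt_paths fsck.out
```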

I then ran hdfs fsck with the -delete option to remove the corrupt files, which fixed the issue.
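The cleanup can be sketched as a script like the one below. This is a sketch rather than exactly what I ran: the stop/start of HBase is an assumption (the master needs to re-run its initialization once meta is repaired), and note that -delete permanently discards the affected store files.

```shell
# Recovery sequence as a sketch. DRY_RUN defaults to 1 (print only);
# set DRY_RUN=0 on the cluster to actually execute the commands.
DRY_RUN=${DRY_RUN:-1}
run() { if [ "$DRY_RUN" = "1" ]; then echo "+ $*"; else "$@"; fi; }

run hdfs fsck / -list-corruptfileblocks  # confirm which files lost blocks
run hdfs fsck / -delete                  # remove the files (data loss!)
run stop-hbase.sh                        # restart HBase so the master
run start-hbase.sh                       #   re-runs its initialization
run hbase hbck                           # verify: Status should be OK
```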

A detailed explanation of cleaning up an HDFS filesystem is available here: How to fix corrupt HDFS files