1 vote

I am trying to load an HBase table from a Hive table. For that I am using the following approach, which works fine if the HBase table has only a single column family; however, with multiple column families it throws an error.

Approach

Source table

CREATE EXTERNAL TABLE temp.employee_orc(id String, name String, Age int)
STORED AS ORC
LOCATION '/tmp/employee_orc/table';

Create Hive table with HBase SerDe

CREATE TABLE temp.employee_hbase(id String, name String, age int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,emp:name,emp:Age')
    TBLPROPERTIES("hbase.table.name" = "bda:employee_hbase", "hfile.family.path"="/tmp/employee_hbase/emp", "hive.hbase.generatehfiles"="true");

Generate the HFiles

SET hive.hbase.generatehfiles=true;
INSERT OVERWRITE TABLE temp.employee_hbase SELECT DISTINCT id, name, Age FROM temp.employee_orc CLUSTER BY id;

Load the HBase table

export HADOOP_CLASSPATH=`hbase classpath`
hadoop jar /usr/hdp/current/hbase-client/lib/hbase-server.jar completebulkload /tmp/employee_hbase/ 'bda:employee_hbase'

Error

I am getting the following error if I have multiple column families in the HBase table:

java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Multiple family directories found in hdfs://hadoopdev/apps/hive/warehouse/temp.db/employee_hbase/_temporary/0/_temporary/attempt_1527799542731_1180_r_000000_0

Is there another way to load the HBase table, if not this approach?

2 Answers

0 votes

A bulk load from Hive to HBase can target only a single column family.

bulk load of hbase
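
For illustration, the failure in the question shows up when the table definition's hbase.columns.mapping spans more than one family, while hfile.family.path can point at only one family directory. A hypothetical two-family mapping (the details family is made up here) that would hit the "Multiple family directories found" error looks like this:

-- Hypothetical definition: the mapping spans two families (emp and details),
-- but hfile.family.path names only one family directory, so the
-- HFile-generating insert fails with "Multiple family directories found".
CREATE TABLE temp.employee_hbase(id String, name String, age int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,emp:name,details:Age')
    TBLPROPERTIES("hbase.table.name" = "bda:employee_hbase", "hfile.family.path"="/tmp/employee_hbase/emp", "hive.hbase.generatehfiles"="true");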

0 votes
  1. You can use the HBase bulk load tool, which supports multiple column families.
  2. Or you can use a separate Hive table for each column family; a sketch of that approach follows below.
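
A minimal sketch of option 2, assuming the target table bda:employee_hbase has two families, emp and a hypothetical details family: define one HFile-generating Hive table per family, each with its own hfile.family.path, populate each one, then run completebulkload once over the parent directory.

-- One Hive table per column family, each writing HFiles under its own family path.
CREATE TABLE temp.employee_hbase_emp(id String, name String)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,emp:name')
    TBLPROPERTIES("hbase.table.name" = "bda:employee_hbase", "hfile.family.path"="/tmp/employee_hbase/emp", "hive.hbase.generatehfiles"="true");

CREATE TABLE temp.employee_hbase_details(id String, age int)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,details:Age')
    TBLPROPERTIES("hbase.table.name" = "bda:employee_hbase", "hfile.family.path"="/tmp/employee_hbase/details", "hive.hbase.generatehfiles"="true");

-- Populate each table; every insert writes HFiles for exactly one family.
SET hive.hbase.generatehfiles=true;
INSERT OVERWRITE TABLE temp.employee_hbase_emp SELECT DISTINCT id, name FROM temp.employee_orc CLUSTER BY id;
INSERT OVERWRITE TABLE temp.employee_hbase_details SELECT DISTINCT id, Age FROM temp.employee_orc CLUSTER BY id;

Then load all the generated HFiles in one pass (same command as in the question):

export HADOOP_CLASSPATH=`hbase classpath`
hadoop jar /usr/hdp/current/hbase-client/lib/hbase-server.jar completebulkload /tmp/employee_hbase/ 'bda:employee_hbase'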