0
votes

I'm using hadoop 1.2 , hbase 0.94.8 and hive 0.14 . I'am trying to insert data into a hbase table using hive. I have already created the table:

CREATE TABLE hbase_table_emp(id int, name string, role string) 
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf1:name,cf1:role")
TBLPROPERTIES ("hbase.table.name" = "emp");

and load data into another table that I will overwrite it into the hbase table :

hive> create table testemp(id int, name string, role string) row format delimited fields terminated by '\t';
hive> load data local inpath '/home/user/sample.txt' into table testemp;

Now,I'm trying to overwrite it into hbase table:

When I do :

hive> insert overwrite table hbase_table_emp select * from testemp;

I get this error:

hive> insert overwrite table hbase_table_emp select * from testemp;
Query ID = hduser_20150126005151_ebc2a36f-97c4-41da-b145-32d5732d9681
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.lang.NoClassDefFoundError: org/cliffc/high_scale_lib/Counter
    at org.apache.hadoop.hive.hbase.HBaseStorageHandler.configureJobConf(HBaseStorageHandler.java:470)
    at org.apache.hadoop.hive.ql.plan.PlanUtils.configureJobConf(PlanUtils.java:856)
    at org.apache.hadoop.hive.ql.plan.MapWork.configureJobConf(MapWork.java:544)
    at org.apache.hadoop.hive.ql.plan.MapredWork.configureJobConf(MapredWork.java:68)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:370)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:137)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1604)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1364)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1177)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:622)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.lang.ClassNotFoundException: org.cliffc.high_scale_lib.Counter
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:323)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:268)
    ... 24 more
FAILED: Execution Error, return code -101 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask. org/cliffc/high_scale_lib/Counter

Could someone help me please?

2

2 Answers

2
votes

I found a solution to this porblem! First of all I changed to hadoop2.4 /hbase0.98 /hive 0.14: in hive-env.sh I did:

export HIVE_AUX_JARS_PATH=/usr/local/hbase/lib

and in the hive shell:

hive> add jar /usr/local/hive/lib/hive-hbase-handler-0.14.0.jar;  
hive> add jar /usr/local/hbase/lib/hbase-common-0.98.0-hadoop2.jar;
hive> add jar /usr/local/hbase/lib/zookeeper-3.4.5.jar; 
hive> add jar /usr/local/hbase/lib/guava-12.0.1.jar;
hive> add jar /usr/local/hbase/lib/high-scale-lib-1.1.1.jar; 

and this worked out for me :)

0
votes

It sounds like at runtime Hive isn't able to resolve the HBase home directory (and associated libraries). Can you verify that the following environment variables are set (your actual values may vary of course, depending on where Hive and HBase are installed)?

HBASE_IDENT_STRING=hbase
HBASE_CONF_DIR=/etc/hbase/conf
HBASE_HOME=/usr/lib/hbase
HBASE_LOG_DIR=/var/log/hbase
HIVE_HOME=/usr/lib/hive
HBASE_PID_DIR=/var/run/hbase

I'm not certain whether all of the above environmental settings have to be defined for what you are doing to work, but my intuition is that at a minimum HBASE_HOME and HBASE_CONF_DIR must be set.

For reference, the above environment variables are ones that the HDP 2.1 distribution sets for you at Hive runtime. I was able to perform the Hive-to-HBase persistence you were attempting with a minimal dataset, so hopefully the solution lies with those environmental settings.