5
votes

I have created an HBase table like the following:

create 'nancy', 'cf'

Then I created a table in Hive like this:

create external table nancy( id int, name string)
stored by 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key, cf:name")
TBLPROPERTIES("hbase.table.name"="nancy");

Am I mapping it right? What does :key in "hbase.columns.mapping" signify?

Can anyone explain this mapping?


3 Answers

5
votes

Are you facing any specific problem? The query looks OK to me.

:key means that this field is used as the HBase row key. Remember that every field in a Hive table can be mapped to one of these:

  • the table key (using :key as the selector)
  • a whole column family (cf:), which becomes a MAP field in Hive
  • a single column (cf:name)
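To make the three cases concrete, here is a small Python sketch that labels each entry of a mapping string. It is only an illustration of the rules listed above, not Hive's actual parser; `classify_entry`, `classify_mapping` and the family name `cf2` are hypothetical names made up for this example.

```python
# Illustrative only: a toy classifier for entries of "hbase.columns.mapping".
# This is NOT Hive's parser; classify_entry and classify_mapping are
# hypothetical helpers written for this example.

def classify_entry(entry):
    """Return which kind of HBase target a single mapping entry selects."""
    if entry == ":key":
        return "row key"
    family, sep, qualifier = entry.partition(":")
    if not sep or not family:
        raise ValueError(f"not a valid mapping entry: {entry!r}")
    if qualifier == "":
        return "column family"  # whole family, a MAP field on the Hive side
    return "column"             # one family:qualifier cell


def classify_mapping(mapping):
    """Classify every comma-separated entry, in Hive column order."""
    return [(entry, classify_entry(entry)) for entry in mapping.split(",")]


for entry, kind in classify_mapping(":key,cf:name,cf2:"):
    print(f"{entry} -> {kind}")
```

The entries are positional: the first entry describes the first Hive column, the second entry the second column, and so on.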

In response to your comments:

hive> CREATE EXTERNAL TABLE hbase_table_2(key int, name string)
    > STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    > WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:name")
    > TBLPROPERTIES("hbase.table.name" = "nancy");
OK
Time taken: 5.106 seconds

hive> select * from hbase_table_2;
OK
Time taken: 0.077 seconds

hive> INSERT OVERWRITE TABLE hbase_table_2 SELECT * FROM demo WHERE id=1;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_201308011237_0003, Tracking URL = http://localhost:50030/jobdetails.jsp?jobid=job_201308011237_0003
Kill Command = /Users/miqbal1/hadoop-eco/hadoop-1.1.2/libexec/../bin/hadoop job  -kill job_201308011237_0003
Hadoop job information for Stage-0: number of mappers: 1; number of reducers: 0
2013-08-01 16:29:21,832 Stage-0 map = 0%,  reduce = 0%
2013-08-01 16:29:23,843 Stage-0 map = 100%,  reduce = 0%
2013-08-01 16:29:24,849 Stage-0 map = 100%,  reduce = 100%
Ended Job = job_201308011237_0003
1 Rows loaded to hbase_table_2
MapReduce Jobs Launched: 
Job 0: Map: 1   HDFS Read: 256 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
OK
Time taken: 8.392 seconds
hive> 

And this is what my HBase table looks like afterwards:

hbase(main):017:0> scan 'nancy'
ROW                                      COLUMN+CELL                                                                                                          
 1                                       column=cf:name, timestamp=1375354762803, value=tariq                                                                 
1 row(s) in 0.0300 seconds
1
votes

The problem was the whitespace character in the columns mapping ":key, cf:name". Hive looks for a column family named " cf" (with a leading space) instead of "cf". Since there is no space in Tariq's response (":key,cf:name"), it works correctly.
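As a rough illustration of the effect (this is not Hive's parsing code, just a Python sketch that assumes the mapping is split on commas without trimming; `split_mapping` and `family_of` are hypothetical helpers):

```python
# Illustrative only: mimics a parser that splits the mapping on commas
# and, unless asked to, does not trim whitespace around each entry.

def split_mapping(mapping, trim=False):
    """Split a columns-mapping string into its entries."""
    entries = mapping.split(",")
    if trim:
        entries = [entry.strip() for entry in entries]
    return entries


def family_of(entry):
    """Return the column family part (text before the first colon)."""
    return entry.partition(":")[0]


with_space = split_mapping(":key, cf:name")
without_space = split_mapping(":key,cf:name")

print(repr(family_of(with_space[1])))     # ' cf'  (leading space kept)
print(repr(family_of(without_space[1])))  # 'cf'
```

Removing the space from the mapping string, as in ":key,cf:name", avoids the problem without depending on how a given Hive version treats whitespace.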

0
votes

You have two choices for mapping a Hive table to an HBase table:

  1. Create a table that both Hive and HBase can manage (for example, dropping it from Hive also deletes it from HBase):

    CREATE TABLE hbase_table_1(key int, name string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:name")
    TBLPROPERTIES("hbase.table.name" = "nancy");

  2. Create an external table over a table that is managed by HBase:

    CREATE EXTERNAL TABLE hbase_table_2(key int, name string)
    STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
    WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,cf:name")
    TBLPROPERTIES("hbase.table.name" = "nancy");

Either way, you can insert rows through Hive SQL:

    insert into hbase_table_1 select 1, "name1";
    insert into hbase_table_2 select 2, "name2";

hbase(main):011:0> scan 'nancy'

ROW COLUMN+CELL
1 column=cf:name, timestamp=1491979916489, value=name1
2 column=cf:name, timestamp=1491979928355, value=name2
2 row(s) in 0.3250 seconds