
I have a Hive table created with the following properties:

create external table statdata(uid int, user string, loc string, locweather int)
row format delimited
fields terminated by ','
stored as textfile
location '/hive/data/weatherstats';

The Hive table contains five rows that I inserted manually:

hive> select * from statdata;
OK
1 john newyork 33
2 rob london 32
3 stan delhi 45
4 fred tokyo 38
5 phil beijing 47

I created a new HBase table, hbstat, with a single column family, weather, as below:

create 'hbstat', 'weather'

I want to pull the existing data from my Hive table statdata into the new HBase table hbstat. There is an option to map new Hive inserts to HBase using the HBase storage handler, as below:

CREATE TABLE foo(rowkey STRING, a STRING, b STRING)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,f:c1,f:c2')
TBLPROPERTIES ('hbase.table.name' = 'bar');

But this only covers rows at the time they are inserted into the Hive table, with the writes reflected in HBase at that same moment. Is there a way to get the older/existing data from a Hive table into a newly created HBase table?
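
Applied to my tables, that mapping would presumably look something like this (untested; hive_hbstat is just a placeholder name, and I believe EXTERNAL is needed since hbstat already exists in HBase):

CREATE EXTERNAL TABLE hive_hbstat(uid INT, user STRING, loc STRING, locweather INT)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ('hbase.columns.mapping' = ':key,weather:user,weather:loc,weather:locweather')
TBLPROPERTIES ('hbase.table.name' = 'hbstat');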

1 Answer


I will offer you a simple solution that you can do in two steps:

I - First: export the data from your Hive table to HDFS using an INSERT command:

hive> INSERT OVERWRITE DIRECTORY '/path_to_hdfs_dir/hdfs_out'
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      SELECT * FROM statdata;

NOTE: without the ROW FORMAT clause, Hive writes the exported file with its default ^A (\001) delimiter, which would not match the ',' separator passed to ImportTsv below. (Some older Hive versions only accept this clause with LOCAL DIRECTORY.)

You can find the name_of_your_file by executing the following command:

hive> dfs -ls /path_to_hdfs_dir/hdfs_out/; 

NOTE: the name_of_your_file will be something like 000000_0.
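
You can also double-check that the exported file is really comma-delimited before importing (the exact path and file name below depend on your export), for example:

hive> dfs -cat /path_to_hdfs_dir/hdfs_out/000000_0;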

II - Second: import your data from HDFS into the HBase table using the ImportTsv tool:

/opt/ibm/biginsights/hbase/bin/hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
    -Dimporttsv.separator=',' \
    -Dimporttsv.columns=HBASE_ROW_KEY,weather:user,weather:loc,weather:locweather \
    hbstat /path_to_hdfs_dir/hdfs_out/name_of_your_file

NOTE: '/opt/ibm/biginsights/hbase/bin/hbase' is the path to the hbase launcher script on my machine; adjust it to your installation.
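
Once the ImportTsv job finishes, you can check the result from the HBase shell, for example:

hbase shell
hbase(main):001:0> scan 'hbstat'

Each row of statdata should appear with uid as the row key and the user, loc and locweather values stored under the weather column family.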