I can create a Hive table like this, which reads its data from HBase:

CREATE EXTERNAL TABLE app_store_data
(key string,
type string,
name string,
country string,
price float)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES 
("hbase.columns.mapping" = ":key,cf:_type,cf:name,cf:country,cf:price")
TBLPROPERTIES ("hbase.table.name" = "DEBUG_items_app_store");

However, my HBase table contains two types of items, 'apps' and 'reviews', and a column called _type records which type each row is. I want to create two separate external tables in Hive from the same HBase table: one that takes only the rows with _type = 'review' and another that takes only the rows with _type = 'app'. How do I go about doing this?

1 Answer

If I understand you correctly, you can't do this out of the box. The Hive HBase handler doesn't provide any DDL feature for applying a filter to an external table.
You could possibly build your own solution:

  • Add a table property in the DDL that says which type of record the table points to (app or review), e.g. TBLPROPERTIES ("record.type" = "app").
  • Download the Hive HBase handler source code and, inside the setup method of HBaseScanRange.java, set a start and stop row key or a row-key filter on the scan. Read the filter criteria from TBLPROPERTIES. Reading up on custom SerDes will help with the details.
  • Build your custom Hive HBase handler and use it in STORED BY instead of the stock one.
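
To make the steps above concrete, the DDL for one of the two tables might look roughly like the sketch below. It is only an illustration: com.example.hive.FilteringHBaseStorageHandler is a hypothetical class name standing in for your patched handler, app_store_apps is a made-up table name, and "record.type" is a custom property that stock Hive stores but ignores. It only has an effect if your own handler code reads it.

```sql
-- Sketch only. com.example.hive.FilteringHBaseStorageHandler is a placeholder
-- for your own patched handler; no such class ships with Hive.
-- "record.type" is a custom table property that only that handler would read.
CREATE EXTERNAL TABLE app_store_apps
(key string,
type string,
name string,
country string,
price float)
STORED BY 'com.example.hive.FilteringHBaseStorageHandler'
WITH SERDEPROPERTIES
("hbase.columns.mapping" = ":key,cf:_type,cf:name,cf:country,cf:price")
TBLPROPERTIES ("hbase.table.name" = "DEBUG_items_app_store",
"record.type" = "app");
```

The reviews table would be the same statement with a different table name and "record.type" = "review".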

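Otherwise, create managed tables that copy the matching rows out of the external table (the filters below assume the row key ends with the item type):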

create table app_store_data_apps as 
select * from app_store_data where key like '%_apps';

create table app_store_data_reviews as 
select * from app_store_data where key like '%_reviews';
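
If you would rather not copy the data, plain Hive views over the existing external table may be enough. This is a sketch under assumptions: it filters on the type column (mapped from cf:_type) and assumes the values are 'app' and 'review' as described in the question, and the view names are made up for illustration.

```sql
-- Views keep reading live data from HBase through app_store_data;
-- nothing is copied. View names here are illustrative.
CREATE VIEW app_store_apps AS
SELECT * FROM app_store_data WHERE type = 'app';

CREATE VIEW app_store_reviews AS
SELECT * FROM app_store_data WHERE type = 'review';
```

Keep in mind that a view is not an external table, and each query against it still reads from the HBase table. As far as I know the handler only pushes down predicates on the row key, so a filter on a regular column like this is most likely applied by Hive after a full scan.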