Hi, I have a Hive table backed by HBase that holds about 200 GB of records. I am running a simple Hive query to fetch about 20 GB of those records, but it takes around 4 hours. I cannot partition the Hive table because it is integrated with HBase.

Please suggest any ideas to improve performance.

This is my Hive query:

INSERT OVERWRITE LOCAL DIRECTORY '/hadoop/user/m6034690/FSDI/FundamentalAnalytic/FundamentalAnalytic_2014.txt'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
SELECT *
FROM hbase_table_FundamentalAnalytic
WHERE FilePartition = 'ThirdPartyPrivate'
  AND FilePartitionDate = '2014';

1 Answer

If you are able to use it, I think Apache Phoenix will speed things up.

https://phoenix.apache.org/faq.html

It is simple and intuitive to use, and fast.
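
As a rough sketch of what that could look like, Phoenix can map an existing HBase table as a view and query it with plain SQL. The HBase table name "FundamentalAnalytic", the column family "cf", and the row key column name pk are assumptions for illustration only, since the question does not show the HBase schema; substitute your real names. It also assumes the values are stored as strings. Keep in mind that a filter on non-row-key columns still has to scan the table, so how much this helps will depend on your row key design.

```sql
-- Hypothetical names: "FundamentalAnalytic" (HBase table) and "cf" (column family).
-- Assumes the row key and both columns were written as string bytes.
-- Maps the existing HBase table as a Phoenix view; no data is copied.
CREATE VIEW "FundamentalAnalytic" (
    pk VARCHAR PRIMARY KEY,              -- the HBase row key
    "cf"."FilePartition" VARCHAR,
    "cf"."FilePartitionDate" VARCHAR
);

-- Same filter as the Hive query, run through Phoenix instead of MapReduce.
SELECT *
FROM "FundamentalAnalytic"
WHERE "FilePartition" = 'ThirdPartyPrivate'
  AND "FilePartitionDate" = '2014';
```

If FilePartition and FilePartitionDate were the leading part of the row key, the same query could become a range scan rather than a full scan, which is probably where most of the gain would come from.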