Where to create staging data table in BigData environment?

Question

I currently have Hadoop 2, Pig, Hive, and HBase. I have input data that I have loaded into HDFS, and I want to create staging data in this environment.

My question is:

In which big data component (Pig, Hive, or HBase) should I create the staging table? Data will be loaded into it based on a condition. Later, we might want to run MapReduce jobs with complex logic on it.

Please assist.

Anywhere you want. Pig is not an option, as it does not have a metastore. Use Hive if you want SQL-like queries, or HBase depending on your access patterns. — Venkat
Hello Venkat, thanks for the reply. If I create it in Hive and then want to run MapReduce programs on top of it, how do I achieve that? One more thing: what exactly do you mean by access patterns? Can you give an example? — user3343543

Venkat Venkat · Accepted Answer · 2015-07-15T15:31:58

Anywhere you want. Pig is not an option, as it does not have a metastore. Use Hive if you want SQL-like queries, or HBase depending on your access patterns.

When you run a Hive query on top of the data, Hive converts it into MapReduce (MR) jobs.
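As a rough illustration of what a Hive staging table could look like, here is a minimal sketch in Python that only builds the HiveQL statements as strings. Everything specific in it is an assumption made for the example and not taken from the question: the table names `raw_input` and `staging`, the three columns, the comma delimiter, the HDFS directory `/data/input`, and the filter `status = 'ACTIVE'`.

```python
# Sketch only: table names, columns, delimiter, path, and condition are hypothetical.

def staging_statements(hdfs_dir, condition):
    """Return HiveQL statements that expose raw HDFS files and fill a staging table."""
    return [
        # External table: Hive reads the files where they already are in HDFS.
        "CREATE EXTERNAL TABLE IF NOT EXISTS raw_input "
        "(id INT, status STRING, amount DOUBLE) "
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' "
        "STORED AS TEXTFILE "
        f"LOCATION '{hdfs_dir}'",
        # Managed staging table with the same (assumed) columns.
        "CREATE TABLE IF NOT EXISTS staging "
        "(id INT, status STRING, amount DOUBLE)",
        # Only rows matching the condition are copied into staging.
        # On the MapReduce execution engine, Hive runs this as MR work.
        "INSERT INTO TABLE staging "
        f"SELECT id, status, amount FROM raw_input WHERE {condition}",
    ]


if __name__ == "__main__":
    # Print a script that could be saved and passed to the Hive CLI.
    for stmt in staging_statements("/data/input", "status = 'ACTIVE'"):
        print(stmt + ";")
```

The printed statements could be saved to a file and run with `hive -f`. The filtering itself then happens inside Hive, with no hand-written MR code.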

When you create the table in Hive, use Hive queries and not MR. If you are using MR, then use Pig. You will not benefit from creating a Hive table on top of the data.
