1 vote

I have one MySQL table which takes 20-30 minutes to respond to a simple SELECT query due to its huge size. I was wondering if I could import it into Hadoop for processing. Would that be a good idea? If I use Hive to perform the same query, will there be any performance gain? If yes, how should I import my table into Hadoop? Since the table has a composite primary key, Sqoop is not an option. One more thing: should I use HBase instead? Which approach would be better?

Thanks

Even if your database is big, you will be selecting only 10, 20, or 100 records at a time; you may also be using indexing. - Arun Killu
@Arun yes, I am using indexing, and yes, I select 10, 20, or 100 records at a time, but it still takes 20-30 minutes. - cool.ernest.7

3 Answers

3 votes

Hive is batch-oriented; consider using Impala, which is designed for interactive ad-hoc querying and is faster than Hive. Hive is much older than Impala, so it has a lot of documentation, and help can also be found easily.
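
As an illustration, an interactive Impala query could look like this; the host and table name below are hypothetical, not details from the question:

    # A minimal sketch, assuming a (hypothetical) impalad at impalad-host and
    # a table big_table already registered in the Hive metastore.
    # 21000 is impala-shell's default port.
    impala-shell -i impalad-host:21000 \
        -q "SELECT * FROM big_table WHERE id = 42 LIMIT 100;"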

There is also Apache Drill, but it is still in the incubator phase, and Amazon Redshift, though that is in a limited preview as of now. Then there is Google BigQuery. There are multiple options for getting data into Hadoop and processing it once it is there.

There is no single solution that fits every problem; the appropriate tool has to be chosen based on your requirements.

1 vote

As for importing the table: even if it has a composite primary key, I think you can use Sqoop's sequential import. With only a single mapper (using -m 1), you won't need a --split-by column anymore.
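
A minimal sketch of such an import (the host, database, user, and table name are placeholders, not details from the question):

    # A single-mapper import: -m 1 copies the table sequentially,
    # so Sqoop never needs a --split-by column.
    # Host, database, user, and table name are placeholders.
    sqoop import \
        --connect jdbc:mysql://dbhost/mydb \
        --username dbuser \
        -P \
        --table big_table \
        -m 1 \
        --hive-import

The trade-off is that the import runs as one map task and is not parallelized, which can be slow for a very large table, but it sidesteps the composite-key problem entirely.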

0 votes

@cool.ernest.7 Maybe this is naive, but why can't you import a table with a composite index on it? By the way, how big is your database?