0 votes

How do I import data from MySQL to HDFS? I can't use Sqoop, as mine is a plain HDFS installation, not Cloudera. I used the link below to set up HDFS. My Hadoop version is 0.20.2. http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

I don't see what's stopping you from using Sqoop; it isn't tied to Cloudera-specific software in any way. - Lars Francke
Can you please guide me on how to configure Sqoop? - Ahmad Osama
I was able to do it by installing Hive and then importing the txt files into HDFS using Hive. Thanks, all. - Ahmad Osama

2 Answers

1 vote

Not directly related to your question, but if you want to use the database as input to a MapReduce job and don't want to copy it into HDFS first, you can use DBInputFormat to read directly from the database.
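
Here is a minimal sketch of that approach using the old org.apache.hadoop.mapred API that ships with Hadoop 0.20.x. The table name ("users"), columns ("id", "name"), JDBC URL, and credentials are all placeholders you would replace with your own:

```java
import java.io.*;
import java.sql.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.mapred.lib.db.*;

public class DBImport {
  // Record class: Hadoop fills one instance per row via readFields(ResultSet).
  public static class UserRecord implements Writable, DBWritable {
    long id;
    String name;
    public void readFields(ResultSet rs) throws SQLException {
      id = rs.getLong("id");
      name = rs.getString("name");
    }
    public void write(PreparedStatement ps) throws SQLException {
      ps.setLong(1, id);
      ps.setString(2, name);
    }
    public void readFields(DataInput in) throws IOException {
      id = in.readLong();
      name = Text.readString(in);
    }
    public void write(DataOutput out) throws IOException {
      out.writeLong(id);
      Text.writeString(out, name);
    }
  }

  // Mapper: emit each row as a tab-separated line destined for HDFS.
  public static class DumpMapper extends MapReduceBase
      implements Mapper<LongWritable, UserRecord, LongWritable, Text> {
    public void map(LongWritable key, UserRecord row,
        OutputCollector<LongWritable, Text> out, Reporter r) throws IOException {
      out.collect(key, new Text(row.id + "\t" + row.name));
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(DBImport.class);
    conf.setInputFormat(DBInputFormat.class);
    // JDBC connection details are placeholders; adjust to your setup.
    DBConfiguration.configureDB(conf, "com.mysql.jdbc.Driver",
        "jdbc:mysql://dbhost:3306/mydb", "user", "password");
    // Read table "users": which columns to fetch and a column to ORDER BY.
    DBInputFormat.setInput(conf, UserRecord.class,
        "users", null /* conditions */, "id" /* orderBy */,
        new String[] { "id", "name" });
    conf.setMapperClass(DumpMapper.class);
    conf.setNumReduceTasks(0); // map-only: write straight to HDFS
    conf.setOutputKeyClass(LongWritable.class);
    conf.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(conf, new Path("/user/hadoop/users_out"));
    JobClient.runJob(conf);
  }
}
```

Note that the MySQL JDBC driver jar needs to be on the job's classpath (e.g. via the lib directory of the job jar).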

0 votes

Apart from Sqoop, you could try hiho. I have heard good things about it, though I have never used it myself.

But mostly what I have seen is that people end up writing their own flows for this. If hiho doesn't work out, you can dump the data from MySQL (e.g. with mysqldump or SELECT ... INTO OUTFILE; note that mysqlimport goes the other way, loading files into MySQL), then load it into HDFS and process it with a MapReduce job or Pig/Hive.
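
For the copy-into-HDFS step, here is a minimal sketch in Java, equivalent to `hadoop fs -put`; both paths are hypothetical placeholders:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PutDump {
  public static void main(String[] args) throws Exception {
    // Picks up core-site.xml from the classpath, so it talks to your cluster.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Copy the dump produced locally (e.g. by mysqldump or
    // SELECT ... INTO OUTFILE) into HDFS for downstream Pig/Hive jobs.
    fs.copyFromLocalFile(new Path("/tmp/users.tsv"),
                         new Path("/user/hadoop/users/users.tsv"));
    fs.close();
  }
}
```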

I have heard Sqoop is pretty good and widely used (this is hearsay again; I have never used it myself). Now that it is an Apache Incubator project, I think it may have started supporting Apache releases of Hadoop, or at least become less painful on non-Cloudera versions. The docs do say it supports Apache Hadoop 0.21. Try to make it work with your Hadoop version; it might not be that difficult.