
I am not sure what Hadoop can and cannot do, and how easy things are.

I understand Hadoop is good at running MapReduce jobs and at providing HDFS, its distributed filesystem.

What else is Hadoop good at or easy to use for?

My problem: I would like to serve data that is the result of a MapReduce job. As I have a lot of traffic, I would need 3 front-end servers. Can Hadoop help me deploy a server on 3 of my n running nodes?

Basically, instead of running MapReduce on n machines, I would like to run a custom executable (my server) on 3 machines, and when one machine fails, have Hadoop take care of starting the job on another available machine.

Am I supposed to run that on the Hadoop cluster? Or should the Hadoop cluster be used only for MapReduce, with a separate cloud serving the data produced by the Hadoop cluster?

Thanks for sharing your experience.

P.S. I am just considering Hadoop as a solution right now; I'm not tied to it.

Please, try to be more specific, because it is not clear what exactly you need. - vefthym
@vefthym I added more explanations - Thomas

1 Answer


Your question isn't entirely clear, but here is my shot.

You want to display the result of your Hadoop job? Usually a Hadoop job writes its result to HDFS. What you can do is create your own OutputFormat class; you might define an XMLOutputFormat, for example.
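To illustrate the idea, here is a minimal sketch of the formatting logic such a hypothetical XMLOutputFormat could delegate to. The class name and element names are made up; in a real job this logic would live inside the RecordWriter returned by a subclass of Hadoop's FileOutputFormat, but only plain Java is used here so it can be shown standalone:

```java
// Sketch of the record-formatting a hypothetical XMLOutputFormat's
// RecordWriter might perform for each (key, value) pair.
// All names here are illustrative, not part of any Hadoop API.
public class XmlRecordFormatter {

    // Escape the five XML special characters; '&' must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Render one key/value pair as an XML <record> element, roughly what
    // a RecordWriter.write(key, value) implementation would emit.
    public static String record(String key, String value) {
        return "<record><key>" + escape(key) + "</key><value>"
                + escape(value) + "</value></record>";
    }
}
```

In an actual Hadoop job you would wrap this in a RecordWriter writing to the task's output stream, and set the format with `job.setOutputFormatClass(...)`.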

But the nice thing is that you can also create your own Writable. Take a look at Database Access with Apache Hadoop; that tutorial shows how to save the output of a Hadoop job to a database system.
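For reference, the Writable contract boils down to two methods: serialize yourself to a DataOutput and restore yourself from a DataInput. The sketch below uses an invented `PageStats` type and deliberately omits `implements org.apache.hadoop.io.Writable` so that it only needs java.io and can be tried without a Hadoop installation; in a real job the class would implement that interface:

```java
import java.io.*;

// Sketch of the Writable contract with a made-up value type.
// A real Hadoop type would declare: implements org.apache.hadoop.io.Writable
public class PageStats {
    String url;
    long hits;

    public PageStats() {}                       // Writables need a no-arg constructor
    public PageStats(String url, long hits) { this.url = url; this.hits = hits; }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(url);
        out.writeLong(hits);
    }

    // Restore the fields in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        url = in.readUTF();
        hits = in.readLong();
    }
}
```

A quick round trip through a byte array (write, then readFields on a fresh instance) is a handy way to sanity-check the symmetry of the two methods.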

Your frontend can then query the database and show the result.