I want to write a Java program that reads input from HDFS, processes it with MapReduce, and writes the output to MongoDB.
Here is the scenario:
- I have a Hadoop cluster with 3 datanodes.
- A Java program reads the input from HDFS and processes it with MapReduce.
- Finally, it writes the result to MongoDB.
Reading from HDFS and processing with MapReduce are actually simple, but I am stuck on writing the result to MongoDB. Is there a Java API for writing the result to MongoDB? A second question: since this is a Hadoop cluster, we don't know in advance which datanode will run the Reducer task and produce the result, so is it possible to write the result to a MongoDB instance installed on one specific server?
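One idea I had is to open a connection with the plain MongoDB Java driver from inside the Reducer: connect once in setup(), insert one document per key in reduce(), and close the client in cleanup(). Below is a rough sketch of what I mean; the host mongo-server, database mydb, and collection wordcount are just placeholder names, and I don't know whether this is the recommended approach:

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.bson.Document;

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;

public class MongoSumReducer extends Reducer<Text, LongWritable, Text, LongWritable>
{
    private MongoClient client;
    private MongoCollection<Document> collection;

    @Override
    protected void setup(Context context)
    {
        // Connect once per Reducer task, not once per key.
        // "mongo-server", "mydb" and "wordcount" are placeholder names.
        client = MongoClients.create("mongodb://mongo-server:27017");
        collection = client.getDatabase("mydb").getCollection("wordcount");
    }

    @Override
    public void reduce(Text key, Iterable<LongWritable> values, Context context)
            throws IOException, InterruptedException
    {
        long sum = 0;
        for (LongWritable value : values)
        {
            sum += value.get();
        }
        // Insert one document per key instead of writing to the Hadoop context.
        collection.insertOne(new Document("word", key.toString()).append("count", sum));
    }

    @Override
    protected void cleanup(Context context)
    {
        client.close();
    }
}

If something like this works, I suppose my second question reduces to making sure every datanode has network access to the MongoDB server?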
If I wanted to write the result to HDFS, the code would look like this:
@Override
public void reduce(Text key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException
{
    // Sum all counts emitted for this key.
    long sum = 0;
    for (LongWritable value : values)
    {
        sum += value.get();
    }
    // The key is already a Text, so it can be written back directly.
    context.write(key, new LongWritable(sum));
}
Now I want to write the result to MongoDB instead of HDFS. How can I do that?
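I have also come across the mongo-hadoop connector. If I read its documentation correctly, the job setup would look roughly like the sketch below, with the reduce() above left unchanged. I haven't verified the exact class names, and TokenizerMapper / SumReducer stand in for my existing mapper and reducer:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

import com.mongodb.hadoop.MongoOutputFormat;
import com.mongodb.hadoop.util.MongoConfigUtil;

public class WordCountToMongo
{
    public static void main(String[] args) throws Exception
    {
        Configuration conf = new Configuration();
        // Point the connector at the target MongoDB server; each datanode
        // only needs network access to this one host. "mongo-server",
        // "mydb" and "wordcount" are placeholder names.
        MongoConfigUtil.setOutputURI(conf, "mongodb://mongo-server:27017/mydb.wordcount");

        Job job = Job.getInstance(conf, "word count to mongodb");
        job.setJarByClass(WordCountToMongo.class);
        job.setMapperClass(TokenizerMapper.class); // placeholder for my existing mapper
        job.setReducerClass(SumReducer.class);     // the reduce() shown above
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        // Read from HDFS as before, but send each (key, value) pair to MongoDB.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        job.setOutputFormatClass(MongoOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Is this the right direction, or is connecting from the Reducer with the plain driver the better option?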