As part of a workaround, I want to use two MapReduce jobs (instead of one) that run in sequence to get the desired effect.
The map function in each job simply emits each key/value pair without processing it. The reduce functions differ between the two jobs, since each does a different kind of processing.
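To make the setup concrete, here is a minimal local sketch in Python of the two-stage pipeline I have in mind. It is not Hadoop or Oozie code, and the names `run_job`, `reduce_total` and `reduce_collect` are hypothetical placeholders I made up for illustration; my real reducers do different work.

```python
from itertools import groupby
from operator import itemgetter


def identity_map(key, value):
    # Both jobs use a pass-through mapper: emit the pair unchanged.
    yield key, value


def reduce_total(key, values):
    # Hypothetical reducer for job 1: sum the values and re-key by the total.
    yield sum(values), key


def reduce_collect(key, values):
    # Hypothetical reducer for job 2: collect everything sharing the same key.
    yield key, sorted(values)


def run_job(records, mapper, reducer):
    # Simulate one MapReduce job in memory: map, sort/group by key, reduce.
    mapped = [pair for k, v in records for pair in mapper(k, v)]
    mapped.sort(key=itemgetter(0))
    output = []
    for key, group in groupby(mapped, key=itemgetter(0)):
        output.extend(reducer(key, [v for _, v in group]))
    return output


records = [("a", 1), ("b", 2), ("a", 1), ("c", 3)]

# Job 2 consumes exactly what job 1 produced; this intermediate data is
# the part that is large in my real case.
intermediate = run_job(records, identity_map, reduce_total)
final = run_job(intermediate, identity_map, reduce_collect)
print(final)
```

In the sketch the intermediate list just stays in memory; my question is about how that hand-off between the two jobs happens on a real cluster.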
I stumbled upon Oozie, and it seems to write one job's output directly to the input stream of the subsequent job (or doesn't it?) - this would be great since the intermediate data is large (the I/O operations would otherwise become a bottleneck).
How can I achieve this with Oozie (two MapReduce jobs in one workflow)?
I did go through the resource below, but it simply runs a single job as a workflow: https://cwiki.apache.org/confluence/display/OOZIE/Map+Reduce+Cookbook
Help appreciated.
Cheers