5
votes

I want to understand the best practices for handling exceptions in a Mapper / Reducer.

Option 1: Don't use any try/catch and let the task fail. MR retries the task, and once the allowed attempts are exhausted the job is terminated. The properties mapreduce.map.maxattempts and mapreduce.reduce.maxattempts control how many attempts are made.

Option 2: Catch the exception and use counters in the catch block to record the number of failures. Then either kill the job once the error count passes some threshold, or just use the counters to report the number of failed records.
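To make Option 2 concrete, here is a minimal sketch in plain Java that runs without Hadoop on the classpath. The class RecordProcessor, the enum RecordCounter, and the maxBadRecords threshold are hypothetical names chosen for illustration, and the EnumMap stands in for Hadoop's counters. In a real Mapper the increment would be context.getCounter(RecordCounter.BAD_RECORDS).increment(1) instead.

```java
import java.util.EnumMap;
import java.util.Map;

// Hadoop-free sketch of Option 2: count bad records, fail once a threshold is passed.
class RecordProcessor {
    // Hypothetical counter names; in Hadoop these would be passed to context.getCounter(...).
    enum RecordCounter { GOOD_RECORDS, BAD_RECORDS }

    // Stand-in for Hadoop counters (assumption: one instance per task).
    private final Map<RecordCounter, Long> counters = new EnumMap<>(RecordCounter.class);
    // Hypothetical threshold; a real job might read it from the job configuration.
    private final long maxBadRecords;

    RecordProcessor(long maxBadRecords) {
        this.maxBadRecords = maxBadRecords;
    }

    // Stands in for Mapper.map(): handle one input line and count failures
    // instead of letting the first bad record fail the task.
    void map(String line) {
        try {
            Integer.parseInt(line.trim()); // per-record work that may throw
            increment(RecordCounter.GOOD_RECORDS);
        } catch (NumberFormatException e) {
            increment(RecordCounter.BAD_RECORDS);
            if (get(RecordCounter.BAD_RECORDS) > maxBadRecords) {
                // Past the threshold: rethrow so the task fails, as in Option 1.
                throw new IllegalStateException(
                        "Too many bad records: " + get(RecordCounter.BAD_RECORDS), e);
            }
        }
    }

    long get(RecordCounter counter) {
        return counters.getOrDefault(counter, 0L);
    }

    private void increment(RecordCounter counter) {
        counters.merge(counter, 1L, Long::sum);
    }

    public static void main(String[] args) {
        RecordProcessor processor = new RecordProcessor(1);
        for (String line : new String[] {"1", "oops", "3"}) {
            processor.map(line);
        }
        // prints "2 good, 1 bad"
        System.out.println(processor.get(RecordCounter.GOOD_RECORDS) + " good, "
                + processor.get(RecordCounter.BAD_RECORDS) + " bad");
    }
}
```

Rethrowing past the threshold combines the two options: isolated bad records are only counted, while a flood of them still fails the task. As far as I know, a counter read inside a task reflects only that task's records, so a job-wide threshold would have to be checked in the driver after the job completes.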

Are there any other common/standard practices for handling exceptions in MapReduce?


1 Answer

2
votes

Options 1 and 2 as listed are among the approaches we use in our project. Please have a look here. It lists a few more options.