I have a flink job that takes in Kafaka topics and goes through a bunch of operators. I'm wondering what's the best way to deal with exceptions that happen in the middle.
My goal is to have a centralized place to handle those exceptions that may be thrown from different operators and here is my current solution:
Use ProcessFunction
and output sideOutput
to context
in the catch block, assuming there is an exception, and have a separate sink function for the sideOutput
at the end where it calls an external service to update the status of another related job
However, my question is that by doing so it seems I still need to call collector.collect()
and pass in a null value in order to proceed to following operators and hit last stage where sideOutput
will flow into the separate sink function. Is this the right way to do it?
Also I'm not sure what actually happens if I don't call collector.collect()
inside a operator, would it hang there and cause memory leak?