0
votes

I'm learning how to process streaming data with Flink.

I've succeeded in coding an example, which is to receive and deserialize streaming data from a data source, to transform it and print the output.

Now I'm thinking how to process the exception of OOM in Flink.

For example, if there exists some backpressure issue, meaning that if the speed of sending data from the data source is faster than processing data in the operators of Flink, as my understanding, the RAM will be exhausted in some time. So what if this case happens? How to handle this kind of exception? Is it possible to ignore some input so that the process won't cause any error?

In other words, I'm expecting some mechanism as below:

if (RAM is almost exhausted)
    ignore the coming data
else
    process the coming data
1

1 Answers

0
votes

The mechanism you imagine does not exist. You could build it yourself, but it feels like the wrong way to address the problem.

Backpressure will not cause OOM exceptions in Flink. Its network stack uses a fixed-size pool of off-heap network buffers along with credit-based flow control. A task cannot send data downstream unless it has already been allocated a buffer in the receiver. This means that the data sources quickly adapt to the capacity of the slowest task(s) in the pipeline. So rather than ignoring the incoming data, the sources naturally throttle themselves and avoid reading data they can't send downstream.

The only likely cause of OOM errors is that over time your application uses more and more keyed state and timers. You can address this in a number of ways:

  • use the RocksDB state backend (which keeps state on the local disk, plus an off-heap cache)
  • use pre-aggregation wherever possible
  • be more aggressive about cleaning up state for stale keys