I have a Flink streaming application that need the ability to 'pause' and 'unpause' processing on a specific keyed stream. 'Processing' means just performing some simple anomaly detection on the stream.
The flow we are thinking about works like this:
Stream of Commands, either ProcessCommand, PauseCommand, or ResumeCommand, each with an id that is used to KeyBy.
ProcessCommands will check if the key is paused before being processed, and buffer if not.
PauseCommands will pause the processing of a key.
ResumeCommands will unpause the processing of a key and flush the buffer through.
Does this flow seem reasonable and, if so, would I be able to use something like the split operator to achieve?
Example stream, individual record timestamps ommitted:
[{command: process, id: 1}, {command: pause, id: 1}, {command: process, id: 1}, {command: resume, id: 1}, {command: process, id: 1}]
Flow:
=>
{command: process, id: 1} # Sent downstream for analysis
=>
{command: pause, id: 1} # Starts the buffer for id 1
=>
{command: process, id: 1} # Buffered into another output stream
=>
{command: resume, id: 1} # Turns off buffering, flushes [{command: process, id: 1}] downstream
=>
{command: process, id: 1} # Sent downstream for processing as the buffering has been toggled off
id) will receive the commands as a part of the record and the stream processing will be controlled using these commands right? - shriyogidand acommand. - austin_ceprocessandpausecommands? Can you please give an example showcasing the input command stream and expected behavior? - shriyog