
I'm currently working on a streaming ML pipeline and need exactly-once event processing. I was interested in Flink, but I'm wondering whether there is any way to alter/update the execution state from outside.

The ML algorithm state is kept by Flink, and that's fine, but I'd like to change some execution parameters at runtime and cannot find a viable solution. Basically, an external web app (in Go) is used to tune the parameters, and the changes should be reflected in Flink for the subsequent events.

I thought about:

  • a shared Redis with pub/sub (since polling on every event would kill throughput)
  • writing a custom solution in Go :D
  • ...

The state would be kept per key, where the key identifies the source of one of the multiple event streams coming in from Kafka.

Thanks


1 Answer


You could use a CoMapFunction/CoFlatMapFunction to achieve what you describe: one input is the normal data stream, and on the other input you receive state-changing commands. The easiest way to ingest those commands is via a dedicated Kafka topic.
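As a minimal sketch of that pattern, here is the per-key logic such a function could hold, written as plain Java without Flink dependencies so it stands alone. The class name and the threshold parameter are hypothetical; the method names mirror `CoFlatMapFunction`'s `flatMap1`/`flatMap2` (which in Flink also take a `Collector` and would read/write keyed state instead of a local map):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the two-input logic: flatMap1 handles normal data events,
// flatMap2 handles control commands from the dedicated Kafka topic.
// The per-key map stands in for Flink's keyed state.
public class ParameterTuner {
    private final Map<String, Double> thresholds = new HashMap<>();
    private final double defaultThreshold;

    public ParameterTuner(double defaultThreshold) {
        this.defaultThreshold = defaultThreshold;
    }

    // Data-stream input: evaluate an event against the current
    // parameter for its key (falling back to the default).
    public boolean flatMap1(String key, double value) {
        return value >= thresholds.getOrDefault(key, defaultThreshold);
    }

    // Control-stream input: update the parameter for this key;
    // all subsequent events for the key see the new value.
    public void flatMap2(String key, double newThreshold) {
        thresholds.put(key, newThreshold);
    }
}
```

Because both inputs are keyed the same way, a parameter change for one source only affects events from that source, which matches the per-key state layout described in the question.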