5
votes

I am looking to migrate from a homegrown streaming server to Apache Flink. One thing that we have is a Apache Storm like DRPC interface to run queries against the state held in the processing topology.

So for example: I have a bunch of sensors that I am running an moving average on. I want to run a query on the topology and return all the sensors where that average is above a fixed value.

Is there an equivalent in Flink, or if not, what is the best way to achieve equivalent functionality?

2

2 Answers

4
votes

Out-of-box Flink does not come with a solution for querying the internal state of operations right now. You're lucky however, because there are two solutions: We did an example of a stateful word count example that allows querying the state. This is available here: https://github.com/dataArtisans/query-window-example

For one of the upcoming versions of Flink we are also working on a generic solution to the queryable state use case. This will allow querying the state of any internal operation.

Also, could it also suffice, in your case, to just periodically output the values to something like Elasticsearch using a Window Operation. The results could then simply be queried from Elasticsearch.

2
votes

They are coming with Out-of-box solution called Queryable State in next release. Here is an example
https://github.com/apache/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/query/QueryableStateITCase.java

But I suggest you should read about it more first then see the example.