We plan to use Apache Flink to perform real-time aggregations on multiple types of objects. We need to support several types of aggregations such as sum, max, min, average etc. - nothing special so far
Our requirement is to output the data to kafka where one message contains multiple aggregated values for multiple object attributes.
for example, the message should include the sum, max, and average values for attribute A and also the sum and min values of attribute B for the last 10 minutes
My question is what is the best way to implement such a requirement with Flink?
We though about using a custom window function that will run on all objects at the end of the window and calculate by itself all required values and output a new object that holds all of these aggregated values. The thing we are concerned about with this solution is the affect on the memory consumption having to hold all the window data in memory waiting for the window to fire (we will have many such windows opened at the same time)
Any suggestions / comments are highly appreciated!
Thanks