First of all, I am new to stream processing frameworks. I would like to benchmark some of them so I've started with Flink.
For my use case, I need to compare events from a window t with events from the window t-1, both of size 15 minutes, and then do some aggregations.
Here is a simplified version of my use case:
We consider the analyzed events as tuples of the form (key, value). In window 1 we have: (A,1), (B,2), (C,3) and in window 2 we have: (D,6) and (B,7). I then need to compare the events from the current window with those from the previous window and keep those satisfying the following condition: Win2(K) - Win1(K) > 5. So with the previous example we get (B,5). (If a window contains 2 events with the same key, their values need to be summed first.)
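To make the comparison concrete, here is a minimal plain-Java sketch of the logic, independent of Flink. The names `WindowDiff`, `Event`, `sumByKey` and `compare` are made up for illustration. It rests on two assumptions: keys missing from the previous window (such as D) are skipped, and the test is `>=` rather than a strict `>`, because that is what yields (B,5) for 7 - 2 in the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WindowDiff {
    // Hypothetical event type: a (key, value) pair.
    static final class Event {
        final String key;
        final int value;
        Event(String key, int value) { this.key = key; this.value = value; }
    }

    // Sum the values of all events sharing a key within one window.
    static Map<String, Integer> sumByKey(List<Event> window) {
        Map<String, Integer> sums = new HashMap<>();
        for (Event e : window) {
            sums.merge(e.key, e.value, Integer::sum);
        }
        return sums;
    }

    // Keep keys present in both windows whose (current - previous) >= threshold.
    static Map<String, Integer> compare(List<Event> previous, List<Event> current, int threshold) {
        Map<String, Integer> prev = sumByKey(previous);
        Map<String, Integer> result = new HashMap<>();
        for (Map.Entry<String, Integer> e : sumByKey(current).entrySet()) {
            Integer before = prev.get(e.getKey());
            if (before == null) {
                continue; // assumption: a key absent from the previous window is ignored
            }
            int diff = e.getValue() - before;
            if (diff >= threshold) { // assumption: >= so that 7 - 2 = 5 is kept
                result.put(e.getKey(), diff);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Event> win1 = List.of(new Event("A", 1), new Event("B", 2), new Event("C", 3));
        List<Event> win2 = List.of(new Event("D", 6), new Event("B", 7));
        System.out.println(compare(win1, win2, 5)); // prints {B=5}
    }
}
```

What I am missing is how to get `previous` and `current` side by side in a streaming job.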
I don't really know how to keep both windows in memory. I was thinking of using a 15-minute tumbling window (window t) together with a 30-minute sliding window that slides by 15 minutes, and subtracting the former from the latter to recover window t-1.
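As a rough illustration of that subtraction, assuming both windows have already been reduced to per-key sums, the arithmetic would look something like the plain-Java sketch below. The class and method names are hypothetical and nothing here is Flink API.

```java
import java.util.Map;
import java.util.TreeMap;

final class SlidingMinusTumbling {
    // Recover the per-key sums of window t-1 from:
    //   sliding30:  per-key sums over the last 30 minutes (windows t-1 and t together)
    //   tumbling15: per-key sums over the last 15 minutes (window t only)
    static Map<String, Integer> previousWindow(Map<String, Integer> sliding30,
                                               Map<String, Integer> tumbling15) {
        Map<String, Integer> previous = new TreeMap<>(); // sorted for stable output
        for (Map.Entry<String, Integer> e : sliding30.entrySet()) {
            int inCurrent = tumbling15.getOrDefault(e.getKey(), 0);
            previous.put(e.getKey(), e.getValue() - inCurrent);
        }
        return previous;
    }

    public static void main(String[] args) {
        // Sums for the example: window 1 = (A,1),(B,2),(C,3); window 2 = (D,6),(B,7).
        Map<String, Integer> sliding30 = Map.of("A", 1, "B", 9, "C", 3, "D", 6);
        Map<String, Integer> tumbling15 = Map.of("D", 6, "B", 7);
        System.out.println(previousWindow(sliding30, tumbling15)); // prints {A=1, B=2, C=3, D=0}
    }
}
```

Note that D comes out as 0 here even though it never appeared in window 1: with sums alone, a key that was absent cannot be told apart from one whose values summed to zero.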
Is this a good solution or is there a better way to do it?