My job does the following:
- Consumes events from a Kafka topic, using event time.
- Computes sliding windows of 7 days with a slide of 1 day.
- Sinks the results to Redis.
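To make the windowing semantics concrete, here is a small framework-agnostic Python sketch (the function name and day-aligned window starts are my own illustration, not the actual engine's API) of which 7-day/1-day-slide windows a single event falls into:

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=7)   # window size
SLIDE = timedelta(days=1)    # slide interval

def sliding_windows(event_time):
    """Return the [start, end) bounds of every 7-day window
    (sliding by 1 day) that contains the given event time.
    Window starts are aligned to whole days for illustration."""
    day_start = event_time.replace(hour=0, minute=0, second=0, microsecond=0)
    windows = []
    # With a 7-day size and 1-day slide, each event belongs to 7
    # overlapping windows, starting from 6 days before its own day.
    for i in range(7):
        start = day_start - i * SLIDE
        windows.append((start, start + WINDOW))
    return windows

event = datetime(2024, 3, 10, 12, 0, tzinfo=timezone.utc)
for start, end in sliding_windows(event):
    print(start.date(), "->", end.date())
```

So every result written to Redis is an aggregate over 7 days of events, and a new window closes once per day.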
I have two issues:
- If it consumes Kafka events starting from the latest record, then after the job has been running for one day it closes the first window and emits a 7-day result. The problem is that the job has only one day of data at that point, so the result is wrong.
- If instead I let it consume Kafka events starting from a timestamp 7 days in the past, the job recomputes every window from the first day as soon as it starts, which takes a long time. Also, I only care about the most recent window's result, not the older ones.
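Two mitigations I have considered, sketched in plain Python since I am not sure how to express them in the engine itself (function names and the `(start, end, aggregate)` result shape are hypothetical): drop windows that started before the earliest event actually consumed, so partially filled windows are never written to Redis, and keep only the latest window when replaying history:

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=7)

def complete_windows(window_results, earliest_event_time):
    """Drop windows whose start precedes the earliest event we actually
    consumed: their aggregates cover less than the full 7 days (issue 1)."""
    return [w for w in window_results if w[0] >= earliest_event_time]

def latest_window(window_results):
    """Keep only the most recent window (issue 2): when replaying 7 days
    of history, the older recomputed windows are discarded."""
    return max(window_results, key=lambda w: w[1], default=None)

# Hypothetical window results: (start, end, aggregate)
t0 = datetime(2024, 3, 1, tzinfo=timezone.utc)
results = [(t0 + timedelta(days=i), t0 + timedelta(days=i) + WINDOW, 100 + i)
           for i in range(5)]

# Suppose consumption started at t0 + 2 days: the first two windows are partial.
full = complete_windows(results, t0 + timedelta(days=2))
print(len(full))               # only windows with complete data remain
print(latest_window(results))  # only the last window would be sinked
```

This filtering is easy after the fact, but it does not avoid the cost of recomputing all the historical windows on startup, which is the part I am unsure about.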
Have I missed something? Is there a better way to do this?