3
votes

If an element arrives that violates the watermark condition, how is the event handled? Is it thrown away? Or is the event still propagated downstream with past windowing functions recomputed with the late event?

The documentation acknowledges that late events happen, but doesn't explain how they are handled. https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/streaming/event_time.html

2

2 Answers

2
votes

By default, late elements are dropped when the watermark is past the end of the window. However, Flink allows to specify a maximum allowed lateness for window operators. Allowed lateness specifies by how much time elements can be late before they are dropped, and its default value is 0. Elements that arrive after the watermark has passed the end of the window but before it passes the end of the window plus the allowed lateness, are still added to the window. Depending on the trigger used, a late but not dropped element may cause the window to fire again. This is the case for the EventTimeTrigger.

In order to make this work, Flink keeps the state of windows until their allowed lateness expires. Once this happens, Flink removes the window and deletes its state.

you can check the lifecycle here. https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/windows.html#window-lifecycle

1
votes

As of Flink 1.0 late elements are handled by re-evaluating the window function with a "singleton" window that contains just the late event.

In future versions of Flink the user will have more control over this behavior. See this thread from the flink-dev mailing list:

http://mail-archives.apache.org/mod_mbox/flink-dev/201604.mbox/%3CCANMXwW3_Ew38KyL0q=q70pC03=UD=KaLQ0XmRyTNE77udAsh=w@mail.gmail.com%3E