What I understand so far is that there are 3 ways of dealing with late data in Flink:
Dropping Late Events (the default behavior for event-time window operators; a late-arriving element will not create a new window)
Redirecting Late Events (late events can also be redirected into another DataStream using the side-output feature)
Updating Results by Including Late Events (recompute an incomplete result and emit an update)
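For example, on a window operator I believe ways 2 and 3 look roughly like this (the window size, OutputTag name and reduce function are just placeholders of mine):

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

// Hypothetical sketch: `events` is some DataStream[String] with
// event-time timestamps and watermarks already assigned.
val lateTag = OutputTag[String]("late-events")

val windowed = events
  .keyBy(identity)
  .window(TumblingEventTimeWindows.of(Time.minutes(5)))
  .allowedLateness(Time.minutes(1))   // way 3: late events update the result
  .sideOutputLateData(lateTag)        // way 2: events later than that go here
  .reduce((a, b) => a + b)

// The redirected late events come out as a separate stream.
val lateStream: DataStream[String] = windowed.getSideOutput(lateTag)
```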
I don't quite understand what happens to late events for non-window operators, especially when the timestamp is assigned at the source. Here I have a FlinkKafkaConsumer:
new FlinkKafkaConsumer(
  liveTopic,
  deserializer,
  config.toProps
).setStartFromTimestamp(startOffsetTimestamp)
 .assignTimestampsAndWatermarks(
   WatermarkStrategy
     .forBoundedOutOfOrderness[String](Duration.ofSeconds(20))
 )
If some data is out-of-order inside my Kafka partition, let's say 1 minute late in terms of the timestamp attached to a record, will this data be discarded when consumed by Flink? Can I configure some kind of allowedLateness (like with a window operator)?
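To make the question concrete, this is the shape of the pipeline I am asking about (env, parseRecord and isValid are placeholders; only the consumer setup is from my real code):

```scala
import java.time.Duration
import org.apache.flink.api.common.eventtime.WatermarkStrategy
import org.apache.flink.streaming.api.scala._

val consumer = new FlinkKafkaConsumer(liveTopic, deserializer, config.toProps)
  .setStartFromTimestamp(startOffsetTimestamp)
  .assignTimestampsAndWatermarks(
    WatermarkStrategy
      .forBoundedOutOfOrderness[String](Duration.ofSeconds(20)))

val stream = env
  .addSource(consumer)
  .map(parseRecord)  // non-window operator: does a record whose timestamp
  .filter(isValid)   // is 1 minute behind the watermark still flow through?
```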