I'm running into some trouble understanding the semantics of event time windowing. The following program generates tuples whose first field is used as the event-time timestamp and runs a simple tumbling window aggregation. I would expect the output to appear in the same order as the input, but it comes out in a different order. Why is the output out of order with respect to event time?
import java.util.concurrent.TimeUnit

import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

object WindowExample extends App {
  val env = StreamExecutionEnvironment.getExecutionEnvironment
  env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
  env.getConfig.enableTimestamps()
  env.setParallelism(1)

  // Ten tuples, one second apart; the first field is the event-time timestamp.
  val start = 1449597577379L
  val tuples = (1 to 10).map(t => (start + t * 1000, t))

  env.fromCollection(tuples)
    .assignAscendingTimestamps(_._1)             // use the first field as event time
    .timeWindowAll(Time.of(1, TimeUnit.SECONDS)) // 1-second tumbling windows
    .sum(1)
    .print()

  env.execute()
}
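To make my expectation concrete: here is a small sketch of how I assume each tuple is assigned to a tumbling window, using the rule start = timestamp - (timestamp % size). This is my assumption about the assignment logic, not code taken from Flink itself.

object WindowAssignmentSketch extends App {
  // Assumed rule: a timestamp ts falls into the window
  // [ts - ts % size, ts - ts % size + size).
  val size = 1000L // 1-second tumbling windows, matching Time.of(1, TimeUnit.SECONDS)
  val start = 1449597577379L
  for (t <- 1 to 10) {
    val ts = start + t * 1000
    val windowStart = ts - (ts % size)
    println(s"($ts,$t) -> window [$windowStart, ${windowStart + size})")
  }
}

Under that rule each tuple lands in its own window, so I expected exactly one result per window, emitted in ascending window order.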
The input:
(1449597578379,1)
(1449597579379,2)
(1449597580379,3)
(1449597581379,4)
(1449597582379,5)
(1449597583379,6)
(1449597584379,7)
(1449597585379,8)
(1449597586379,9)
(1449597587379,10)
The result:
[info] (1449597579379,2)
[info] (1449597581379,4)
[info] (1449597583379,6)
[info] (1449597585379,8)
[info] (1449597587379,10)
[info] (1449597578379,1)
[info] (1449597580379,3)
[info] (1449597582379,5)
[info] (1449597584379,7)
[info] (1449597586379,9)