1 vote

I want to create an event time clock for my events in Apache Flink. I am doing it in the following way:

import java.util.Date;

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public class TimeStampAssigner implements AssignerWithPeriodicWatermarks<Tuple2<String, String>> {

    private final long maxOutOfOrderness = 0; // 3.5

    private long currentMaxTimestamp;

    @Override
    public long extractTimestamp(Tuple2<String, String> element, long previousElementTimestamp) {
        // Uses the wall clock rather than a timestamp carried by the element.
        currentMaxTimestamp = new Date().getTime();
        return currentMaxTimestamp;
    }

    @Override
    public Watermark getCurrentWatermark() {
        // Emitted periodically; lags behind the last assigned timestamp by maxOutOfOrderness.
        return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
    }
}

Please check the above code and tell me if I am doing it correctly. After the event time and watermark assignment I want to process the stream in a process function, in which I will collect the stream data for 10 minutes for different keys.
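
For context, this is roughly how I plan to wire the assigner into the job. The key field, the placeholder source, and the ProcessWindowFunction body are just sketches of my intent, not final code:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class EventTimeJob {

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        // Placeholder source; in the real job the data comes from elsewhere.
        DataStream<Tuple2<String, String>> source = env.fromElements(Tuple2.of("key1", "payload"));

        source.assignTimestampsAndWatermarks(new TimeStampAssigner())
              .keyBy(value -> value.f0)
              .window(TumblingEventTimeWindows.of(Time.minutes(10)))
              .process(new ProcessWindowFunction<Tuple2<String, String>, String, String, TimeWindow>() {
                  @Override
                  public void process(String key, Context context,
                                      Iterable<Tuple2<String, String>> elements,
                                      Collector<String> out) {
                      // Collect whatever arrived for this key during the 10-minute window.
                      int count = 0;
                      for (Tuple2<String, String> ignored : elements) {
                          count++;
                      }
                      out.collect(key + ": " + count + " events in " + context.window());
                  }
              })
              .print();

        env.execute("event time clock example");
    }
}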


2 Answers

1 vote

No, this is not an appropriate implementation. An event time timestamp should be deterministic (i.e., reproducible), and it should be based on data in the event stream. If instead you use new Date().getTime(), then you are more or less using processing time.

Typically when doing event time processing your events will have a timestamp field, and the timestamp extractor will return the value of this field.
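
As a sketch of what that looks like, the extractor below assumes the second field of your Tuple2 carries an epoch-millisecond timestamp and tolerates 3.5 seconds of out-of-orderness; both the field layout and the bound are assumptions, not something given in your question:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

public class EventFieldTimestampAssigner
        extends BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, String>> {

    public EventFieldTimestampAssigner() {
        super(Time.milliseconds(3500)); // tolerate events up to 3.5 seconds out of order
    }

    @Override
    public long extractTimestamp(Tuple2<String, String> element) {
        // Deterministic: the timestamp comes from the event itself, so replaying
        // the same input produces the same timestamps, watermarks, and results.
        return Long.parseLong(element.f1);
    }
}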

The implementation you've shown will lose most of the benefits that come from working with event time, such as the ability to reprocess historic data in order to reproduce historic results.

0 votes

Your implementation implements ingestion time into the Flink system, not event time. If you consume from Kafka, for example, previousElementTimestamp normally points to the time at which the event was produced to Kafka (unless the Kafka producer set it to something else), which would make your stream processing reproducible.

If you want to implement event time processing in Flink you should rather use timestamps associated with your elements, either contained in the element itself (which makes sense for time series) or stored in Kafka and available via previousElementTimestamp.
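
A minimal sketch of the second option, assuming you read from a FlinkKafkaConsumer so that previousElementTimestamp carries the Kafka record timestamp:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public class KafkaTimestampAssigner implements AssignerWithPeriodicWatermarks<Tuple2<String, String>> {

    private long currentMaxTimestamp = Long.MIN_VALUE;

    @Override
    public long extractTimestamp(Tuple2<String, String> element, long previousElementTimestamp) {
        // previousElementTimestamp is the timestamp stored with the Kafka record,
        // i.e. the time the event was produced (unless the producer set it explicitly).
        currentMaxTimestamp = Math.max(currentMaxTimestamp, previousElementTimestamp);
        return previousElementTimestamp;
    }

    @Override
    public Watermark getCurrentWatermark() {
        // No out-of-orderness allowance here; subtract a bound if your data can arrive late.
        return new Watermark(currentMaxTimestamp);
    }
}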

Regarding maxOutOfOrderness, you probably also want to consider Flink's side-output feature, which makes it possible to get the elements that arrive after the window has fired and to update your Flink job's output.
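
A sketch of what that could look like; the OutputTag name, the 10-minute window, the one-minute allowed lateness, the placeholder reduce, and keying on the first tuple field are all assumptions:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.OutputTag;

public class LateDataExample {

    // Tag under which elements arriving after the allowed lateness are emitted.
    static final OutputTag<Tuple2<String, String>> LATE =
            new OutputTag<Tuple2<String, String>>("late-events") {};

    static DataStream<Tuple2<String, String>> lateEventsOf(DataStream<Tuple2<String, String>> withTimestamps) {
        SingleOutputStreamOperator<Tuple2<String, String>> result = withTimestamps
                .keyBy(value -> value.f0)
                .window(TumblingEventTimeWindows.of(Time.minutes(10)))
                .allowedLateness(Time.minutes(1))
                .sideOutputLateData(LATE)
                .reduce((a, b) -> b); // placeholder aggregation; replace with your window logic

        // Late elements can then be used to correct or update the job's earlier output.
        return result.getSideOutput(LATE);
    }
}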

If you consume from Kafka and want a simple event time processing implementation that accepts some data loss, go with AscendingTimestampExtractor. There are some potential problems with AscendingTimestampExtractor that can appear if your data are not ordered within the partition, or if you apply the extractor after an operator rather than directly after the KafkaSource. For a robust industrial use case you should rather implement watermark ingestion into the persistent log storage, as described in the Google Dataflow model.
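
As a sketch, again assuming the second tuple field holds an epoch-millisecond timestamp:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;

public class AscendingAssigner extends AscendingTimestampExtractor<Tuple2<String, String>> {

    @Override
    public long extractAscendingTimestamp(Tuple2<String, String> element) {
        // Assumes timestamps never decrease within a partition; violations are
        // handled by the extractor's default violation handler (logged).
        return Long.parseLong(element.f1);
    }
}

Ideally you assign such an extractor on the FlinkKafkaConsumer itself via its assignTimestampsAndWatermarks method, so that watermarks are generated per Kafka partition before the partitions are merged, rather than after a downstream operator.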