0
votes

Assuming there is a finite DataStream (from a database source, for example) with events

  • a1, a2, ..., an.

How to append one more event b to this stream to get

  • a1, a2, ..., an, b

(i.e. output the added event after all original events, preserving the original ordering)?

I know that all finite streams emit the MAX_WATERMARK after all events. So, is there a way to "catch" this watermark and output the additional event after it?

(Unfortunately, .union()ing the original DataStream with another DataStream consisting of a single event (with timestamp set to Long.MaxValue) and then sorting the united stream using this answer did not work.)

2
Do you know the count ahead of time? Also, if it is a finite set, why can't you use the DataSet API instead of the DataStream?austin_ce

2 Answers

2
votes

Maybe I'm missing something, but it seems like you could simply have a ProcessFunction with an event time timer set for somewhere in the distant future, so that it only fires when the MAX_WATERMARK arrives. And then in the onTimer method, emit that special event if the currentWatermark is MAX_WATERMARK.

0
votes

Another approach might be to 'wrap' the original data source in another data source, which emits a final element when the delegate object's run() method returns. You'd need to be careful to call through to all of the delegate methods, of course.