I'm using Apache Beam 2.31 on Dataflow with Java.
Window:

Window.<KV<String, HeartBeat>>into(Sessions.withGapDuration(
                        Duration.standardMinutes(30)))
                .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()
                        .withEarlyFirings(
                                AfterProcessingTime
                                        .pastFirstElementInPane().plusDelayOf(Duration.standardMinutes(20))
                        )
                ))
                .withAllowedLateness(Duration.standardMinutes(0))
                .accumulatingFiredPanes();
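
For context, here is a rough sketch of when the on-time pane of a session window becomes eligible to fire. It is plain Java, not Beam's implementation, and the class and method names are hypothetical. The point is that `AfterWatermark.pastEndOfWindow()` only fires once the watermark passes the end of the merged session, which is the last element's event timestamp plus the gap:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; Beam merges session windows internally.
public class SessionEndSketch {

    /** Returns the end of each merged session, in event-time order. */
    static List<Instant> sessionEnds(List<Instant> eventTimes, Duration gap) {
        List<Instant> sorted = new ArrayList<>(eventTimes);
        Collections.sort(sorted);

        List<Instant> ends = new ArrayList<>();
        Instant currentEnd = null;
        for (Instant t : sorted) {
            if (currentEnd != null && t.isBefore(currentEnd)) {
                // Element lands inside the open session, so the session is extended.
                currentEnd = t.plus(gap);
            } else {
                // Gap exceeded (or first element): close the previous session, start a new one.
                if (currentEnd != null) {
                    ends.add(currentEnd);
                }
                currentEnd = t.plus(gap);
            }
        }
        if (currentEnd != null) {
            ends.add(currentEnd);
        }
        return ends;
    }
}
```

If the element timestamps are wrong, the computed session end is wrong too, so the watermark may pass it at an unexpected time or not at all.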

After applying the window there's a group-by, and then the function that processes the panes:

  boolean isFinished = context.pane().getTiming() != PaneInfo.Timing.EARLY;

Expected behavior: an on-time pane is fired in addition to the early ones. According to "Apache Beam number of times a pane is fired with early triggers", the on-time pane has to fire.

Actual behavior: on Dataflow, isFinished always stays false and the on-time pane never fires.

Solutions that didn't work:

  1. Adding allowed lateness, and then marking LATE panes as finished - worked partially
  2. Using Window.ClosingBehavior.FIRE_ALWAYS and withOnTimeBehavior(Window.OnTimeBehavior.FIRE_ALWAYS) - still had no on-time fire.

I would like to know how to solve this, and why it works locally but not when deployed. Is it a Dataflow problem, or is the trigger wrong?

1 Answer

Found my bug: the protobuf timestamps weren't being processed correctly.
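
The exact mistake isn't spelled out above, so the following is only a sketch of one way to convert a protobuf-style timestamp correctly, assuming the usual `seconds` and `nanos` fields (as returned by `getSeconds()` and `getNanos()` on `com.google.protobuf.Timestamp`). The class and helper names are hypothetical. A common pitfall is passing the seconds value where epoch milliseconds are expected, which places every element near 1970:

```java
import java.time.Instant;

// Hypothetical helper; plain Java so it runs without Beam or protobuf on the classpath.
public class ProtoTimestampSketch {

    /** Converts protobuf-style (seconds, nanos) to epoch milliseconds. */
    static long toEpochMillis(long seconds, int nanos) {
        // Seconds are scaled to millis; nanos are truncated to millisecond precision.
        return Math.addExact(Math.multiplyExact(seconds, 1000L), nanos / 1_000_000L);
    }

    public static void main(String[] args) {
        long millis = toEpochMillis(1625097600L, 500_000_000);
        // Print the result as an ISO-8601 instant for a quick sanity check.
        System.out.println(Instant.ofEpochMilli(millis));
    }
}
```

In a pipeline, the resulting milliseconds could be wrapped in an `org.joda.time.Instant` and assigned as the element timestamp, for example with `WithTimestamps.of(...)`. The `protobuf-java-util` library's `Timestamps.toMillis` does the same conversion.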