I'm writing to BigQuery from a Beam job that reads an unbounded source, with STREAMING_INSERTS as the write method. I was looking at how to throttle the rows sent to BigQuery based on the recommendations in
The BigQueryIO.Write API doesn't seem to provide a way to set the micro-batch size.
I was looking at using triggers, but I'm not sure whether BigQueryIO groups everything in a pane into a single request. I've set up the trigger as below:
Window.<Long>into(new GlobalWindows())
    .triggering(
        Repeatedly.forever(
            AfterFirst.of(
                AfterPane.elementCountAtLeast(5),
                AfterProcessingTime.pastFirstElementInPane()
                    .plusDelayOf(Duration.standardMinutes(2)))))
    .discardingFiredPanes());
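To make the intent concrete, here is a plain-Java sketch (no Beam dependency) of how I expect this trigger to cut panes. The class `TriggerPaneSketch` and the method `simulatePanes` are hypothetical names made up for illustration, not Beam APIs. The model is idealized: it assumes a pane fires exactly when it reaches 5 elements or 2 minutes after its first element, whereas a real runner may fire later or with more elements. It says nothing about how many requests BigQuery actually receives per pane, which is the open question.

```java
import java.util.ArrayList;
import java.util.List;

public class TriggerPaneSketch {

    // Hypothetical helper, not part of Beam. Splits ascending processing-time
    // arrival stamps (millis) into the panes that an idealized
    // "count >= minCount OR maxDelay past first element" trigger would emit.
    public static List<List<Long>> simulatePanes(
            long[] arrivalsMillis, int minCount, long maxDelayMillis) {
        List<List<Long>> panes = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long t : arrivalsMillis) {
            // If the open pane's delay timer expired before this element
            // arrived, that pane has already fired without it.
            if (!current.isEmpty() && t - current.get(0) >= maxDelayMillis) {
                panes.add(current);
                current = new ArrayList<>();
            }
            current.add(t);
            // Count condition: fire as soon as the pane holds minCount elements.
            if (current.size() >= minCount) {
                panes.add(current);
                current = new ArrayList<>();
            }
        }
        // Any leftover elements fire once their delay timer expires.
        if (!current.isEmpty()) {
            panes.add(current);
        }
        return panes;
    }

    public static void main(String[] args) {
        long minute = 60_000L;
        long[] arrivals = {
            0, 1_000, 2_000, 3_000, 4_000, 10_000, 3 * minute, 3 * minute + 1_000
        };
        // prints [[0, 1000, 2000, 3000, 4000], [10000], [180000, 181000]]
        System.out.println(simulatePanes(arrivals, 5, 2 * minute));
    }
}
```

In this model the first five elements form one pane by count, the lone element at 10 s fires by timeout, and the last two fire together when their 2-minute timer expires. What I can't tell is whether each of those panes turns into one insert request or several.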
Q1. Does Beam support micro-batches, or does it create one request for each element in the PCollection?
Q2. Does the above trigger make sense? Even if I set the window/trigger, it could still be sending one request for every element.