1 vote

I have a Dataflow job running in streaming mode with an hourly fixed window.

When the pipeline runs for a given window, we calculate some data and write it to a data source. What I want to do next is publish some message to PubSub once the write is complete - how might I go about making sure that the write step is complete before writing to PubSub?

If the pipeline were executed in batch mode I know I could run it in a blocking fashion as suggested here, but the tricky part is that this pipeline runs constantly in streaming mode.

1
Where do you write? If in Storage, you could use Pub/Sub notifications for Storage. – Lefteris S
Good point, thanks for clarifying - in this case I am writing to BigQuery, but the same issue would apply to any database. – gilmatic

1 Answer

3 votes

The Wait.on() transform is designed for this use case. See the documentation for a usage example.
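Wait.on(signal) holds the elements of its main input until the signal PCollection is complete for the corresponding window, which is what the hourly-window scenario above needs. Below is a minimal sketch in Beam's Java SDK, assuming a windowed PCollection<TableRow> named rows already exists in the pipeline and using writeResult.getFailedInserts() as the completion signal; the table spec, topic name, and message payload are placeholders, and the right signal collection can depend on the BigQuery write method and SDK version, so check the WriteResult accessors you have available.

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.WriteResult;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.transforms.Combine;
import org.apache.beam.sdk.transforms.Count;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.transforms.Wait;
import org.apache.beam.sdk.values.TypeDescriptors;

// rows: the hourly-windowed PCollection<TableRow> computed earlier in the pipeline (assumed).
WriteResult writeResult = rows.apply("WriteToBigQuery",
    BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.my_table"));   // placeholder table spec

rows
    // Hold these elements back until the signal PCollection is complete for the
    // same window, i.e. until the BigQuery write for that hourly window has finished.
    .apply("WaitForWrite", Wait.on(writeResult.getFailedInserts()))
    // Collapse to a single element per window so only one notification is published.
    .apply("OnePerWindow",
        Combine.globally(Count.<TableRow>combineFn()).withoutDefaults())
    .apply("ToMessage", MapElements
        .into(TypeDescriptors.strings())
        .via((Long count) -> "hourly window written: " + count + " rows"))  // placeholder payload
    .apply("PublishDone", PubsubIO.writeStrings()
        .to("projects/my-project/topics/write-complete"));  // placeholder topic
```

The Count step is just there to collapse each window to a single element, so you publish one Pub/Sub message per hourly window instead of one per row.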