I'd like to trigger a Dataflow job when new files are added to a Storage bucket in order to process and add new data into a BigQuery table. I see that Cloud Functions can be triggered by changes in the bucket, but I haven't found a way to start a Dataflow job using the gcloud node.js library.

Is there a way to do this using Cloud Functions or is there an alternative way of achieving the desired result (inserting new data to BigQuery when files are added to a Storage bucket)?

There's an example of starting a Dataflow job in this answer; does this help? stackoverflow.com/questions/35415868/… - Sam McVeety
Thanks, this is useful indeed. I'm using the Dataflow Python SDK but hopefully that won't be an issue. - numentar
Please see my edited answer. - jkff

2 Answers


Apache Beam supports this directly starting with version 2.2: a streaming pipeline can watch a filepattern and process new files as they appear. See "Watching for new files matching a filepattern in Apache Beam".
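Since the question mentions the Python SDK, here is a rough sketch of what that could look like there. It assumes the Python SDK's `fileio.MatchContinuously` transform, which as far as I know is the Python counterpart of the Java feature and was added in a later release than 2.2, so check your SDK version. The CSV layout, the `name`/`value` columns, the schema string and the 60-second polling interval are all made up for illustration, and `file_pattern` and `table` are placeholders you would supply.

```python
import csv


def parse_csv_line(line, columns=("name", "value")):
    """Turn one CSV line into a dict keyed by the (hypothetical) column names."""
    fields = next(csv.reader([line]))
    return dict(zip(columns, fields))


def run(file_pattern, table, pipeline_args=None):
    # Imported lazily so the parsing helper above can be used without Beam installed.
    import apache_beam as beam
    from apache_beam.io import fileio
    from apache_beam.options.pipeline_options import PipelineOptions

    # Watching for new files only makes sense in a streaming pipeline.
    options = PipelineOptions(pipeline_args, streaming=True)
    with beam.Pipeline(options=options) as p:
        (
            p
            # Re-match the pattern every 60 seconds (assumed interval) and emit new files.
            | "WatchFiles" >> fileio.MatchContinuously(file_pattern, interval=60)
            | "OpenFiles" >> fileio.ReadMatches()
            # Reads each file fully into memory; fine for small files only.
            | "ReadLines" >> beam.FlatMap(lambda f: f.read_utf8().splitlines())
            | "ParseRows" >> beam.Map(parse_csv_line)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                table,
                schema="name:STRING,value:STRING",  # assumed schema
                create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )
```

You would call something like `run("gs://<your-bucket>/incoming/*.csv", "<project>:<dataset>.<table>", ["--runner=DataflowRunner", ...])` once. The job then keeps running and picks up new files itself, so no Cloud Function is needed to start a job per file.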
