Objective - I have a Dataflow template (written in Python) that depends on pandas and nltk, and I want to trigger the Dataflow job from a Cloud Function. For this purpose, I have uploaded the code to a bucket and am ready to specify the template location in the Cloud Function.
Problem - How do I pass the requirements_file parameter (the one you would normally pass to install third-party libraries) when triggering a Dataflow job from a Cloud Function using the googleapiclient discovery module?
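To make the setup concrete, here is a minimal sketch of this kind of trigger, assuming the googleapiclient discovery client and the Dataflow v1b3 templates.launch endpoint; the project ID, bucket, template path, and job name are placeholders, not values from the question:

```python
# Minimal sketch (assumed setup): launching a staged Dataflow template
# from a Cloud Function via the googleapiclient discovery client.
from googleapiclient.discovery import build

def trigger_dataflow(event, context):
    service = build("dataflow", "v1b3")
    request = service.projects().templates().launch(
        projectId="my-project",                          # placeholder
        gcsPath="gs://my-bucket/templates/my-template",  # staged template
        body={
            "jobName": "pandas-nltk-job",  # placeholder
            "parameters": {},              # template runtime parameters
            # Note: the launch body has no obvious field corresponding to
            # --requirements_file, which is the crux of the question.
        },
    )
    return request.execute()
```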
Prerequisites - I know this can be done when launching a job from a local machine by specifying a local file path, but when I try to specify a GCS path such as --requirements_file gs://bucket/requirements.txt, it fails with:
The file gs://bucket/requirements.txt cannot be found. It was specified in the --requirements_file command line option.
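For comparison, a sketch of the two variants of the pipeline options, assuming a standard apache_beam PipelineOptions setup; the flag values are placeholders:

```python
# Sketch (assumed pipeline setup): a local requirements_file path is
# accepted, while the GCS path fails at submission with the error above.
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions([
    "--runner=DataflowRunner",
    "--project=my-project",                  # placeholder
    "--temp_location=gs://my-bucket/temp",   # placeholder
    "--requirements_file=requirements.txt",  # works: local file path
    # "--requirements_file=gs://bucket/requirements.txt",  # fails as quoted above
])
```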