I have been trying to execute a Dataflow pipeline (Python) in a project where my GCP account is assigned the "Owner" role.
The pipeline performs the following tasks:
- Read data from BigQuery (the same project where the Dataflow pipeline is running).
- Apply some transformations.
- Finally, load the resulting data to GCS.
As per my understanding, Dataflow workers use the default Compute Engine service account ([email protected]) to access other GCP services, including BigQuery, and [email protected] has the "Editor" role.
But when I try to run the pipeline using DataflowRunner, I get the error below.
Error:
BigQuery execution failed., Error: Message: Access Denied: Project : User does not have bigquery.jobs.create permission in project . HTTP Code: 403
The pipeline runs fine with DirectRunner.
I also tried running the pipeline after assigning the Dataflow Worker and Dataflow Admin roles to
[email protected], in addition to its existing "Editor" role, but it still fails with the same error.
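The role assignments I attempted looked roughly like this (the project ID and service account address are placeholders for my redacted values; assuming gcloud):

```shell
# Placeholders for my actual project ID and the default
# Compute Engine service account address.
PROJECT_ID="my-project"
SA="123456789-compute@developer.gserviceaccount.com"

# Grant the Dataflow Worker role to the service account.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA}" \
  --role="roles/dataflow.worker"

# Grant the Dataflow Admin role to the service account.
gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${SA}" \
  --role="roles/dataflow.admin"
```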
Could you please share your inputs on how to resolve this issue?
Execution command:
python -m bigquery_to_gcs --input gs://<GCS_path>/input --output gs://<GCS_path>/results/output.txt --project --region us-central1 --staging_location gs://<GCS_path>/staging --temp_location gs://<GCS_path>/tmp --runner DataflowRunner