
I have been trying to execute a Dataflow pipeline (Python) in a project where my GCP account is assigned the "Owner" role.

The pipeline performs the tasks below.

  1. Read data from BigQuery (the same project where the Dataflow pipeline is running).
  2. Apply some transformations.
  3. Load the resultant data to GCS.
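For reference, the three steps above can be sketched as a minimal Beam pipeline. The query, table name, and paths here are placeholders, not my actual job's values:

```python
import argparse


def run(argv=None):
    """Minimal sketch of the BigQuery -> transform -> GCS pipeline.

    The Beam import is kept inside the function so the sketch can be
    read without the SDK installed; the query and paths below are
    placeholders, not the real job's values.
    """
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    parser = argparse.ArgumentParser()
    parser.add_argument("--input", help="BigQuery source (placeholder)")
    parser.add_argument("--output", required=True, help="GCS output path")
    known_args, pipeline_args = parser.parse_known_args(argv)

    with beam.Pipeline(options=PipelineOptions(pipeline_args)) as p:
        (
            p
            # 1. Read rows from BigQuery in the same project.
            | "ReadFromBQ" >> beam.io.ReadFromBigQuery(
                query="SELECT * FROM `my_dataset.my_table`",  # placeholder
                use_standard_sql=True)
            # 2. Apply a transformation (here: format each row as text).
            | "Transform" >> beam.Map(lambda row: str(row))
            # 3. Write the results to GCS.
            | "WriteToGCS" >> beam.io.WriteToText(known_args.output)
        )
```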

As per my understanding, Dataflow workers use the default Compute Engine service account ([email protected]) to access other GCP services, including BigQuery, and [email protected] has the "Editor" role.

But when I try to run the pipeline using DataflowRunner, I get the error below.

Error:

BigQuery execution failed., Error: Message: Access Denied: Project : User does not have bigquery.jobs.create permission in project . HTTP Code: 403

The pipeline runs fine with DirectRunner.

I also tried running this pipeline after assigning the Dataflow Worker and Dataflow Admin roles to [email protected], in addition to its "Editor" role, but the pipeline fails with the same error.
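In case it helps, the roles can be inspected and granted with gcloud. PROJECT_ID and the full service-account email are placeholders here, since they are redacted above; roles/bigquery.jobUser is the predefined role that carries bigquery.jobs.create:

```
# List the roles currently granted to the default Compute Engine
# service account (PROJECT_ID / PROJECT_NUMBER are placeholders).
gcloud projects get-iam-policy PROJECT_ID \
  --flatten="bindings[].members" \
  --format="table(bindings.role)" \
  --filter="bindings.members:PROJECT_NUMBER-compute@developer.gserviceaccount.com"

# Grant the BigQuery Job User role, which includes
# bigquery.jobs.create, in case "Editor" is not being picked up.
gcloud projects add-iam-policy-binding PROJECT_ID \
  --member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
  --role="roles/bigquery.jobUser"
```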

Could you please share your inputs to help resolve this issue?

Execution command:

python -m bigquery_to_gcs \
  --input gs://<GCS_path>/input \
  --output gs://<GCS_path>/results/output.txt \
  --project \
  --region us-central1 \
  --staging_location gs://<GCS_path>/staging \
  --temp_location gs://<GCS_path>/tmp \
  --runner DataflowRunner
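For reference, Dataflow also accepts the worker identity explicitly via the --service_account_email pipeline option. The same command with that option added would look like this (PROJECT_ID and the service-account email are placeholders, not my actual values):

```
python -m bigquery_to_gcs \
  --input gs://<GCS_path>/input \
  --output gs://<GCS_path>/results/output.txt \
  --project PROJECT_ID \
  --region us-central1 \
  --staging_location gs://<GCS_path>/staging \
  --temp_location gs://<GCS_path>/tmp \
  --runner DataflowRunner \
  --service_account_email PROJECT_NUMBER-compute@developer.gserviceaccount.com
```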