I'm trying to write some data to a BigQuery table from my Dataflow pipeline, but the writes are failing with the following error message in Stackdriver:

{
 "error": {
  "errors": [
   {
    "domain": "global",
    "reason": "required",
    "message": "Login Required",
    "locationType": "header",
    "location": "Authorization"
   }
  ],
  "code": 401,
  "message": "Login Required"
 }
}
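
For context, the write step itself is nothing exotic. A minimal sketch of the equivalent write in the Beam Python SDK (project, dataset, table, and schema names here are placeholders):

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Create a couple of placeholder rows and append them to a BigQuery table.
with beam.Pipeline(options=PipelineOptions()) as p:
    (p
     | "Create" >> beam.Create([{"id": 1, "name": "foo"},
                                {"id": 2, "name": "bar"}])
     | "WriteToBQ" >> beam.io.WriteToBigQuery(
           "my-project:my_dataset.my_table",
           schema="id:INTEGER,name:STRING",
           create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
           write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND))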

I've already tried authenticating the gcloud CLI with both gcloud auth application-default login and gcloud auth login before running the Dataflow pipeline from my local machine.

The BigQuery API is also enabled in my Google Cloud console, and this entire setup worked just fine a few days ago.

What I think is happening here is that my Dataflow pipeline doesn't have sufficient privileges to write to my BQ table, but I can't find a way to fix this in the docs.

Would appreciate any leads on this.

Are you running the pipeline on Dataflow, or on your local machine using the DirectRunner? - MrtN
Are you remembering to set GOOGLE_APPLICATION_CREDENTIALS before kicking off your pipeline/code? cloud.google.com/docs/authentication/… - Graham Polley
@MrtN I'm running it on Dataflow - harshithdwivedi
@GrahamPolley Yep, done that. My Dataflow job is already running fine, which wouldn't be possible if the credentials weren't in place - harshithdwivedi
Can you share your code, please? - Graham Polley

1 Answer

The Dataflow runners use a special service account to access resources like BigQuery. You have to grant the following service account access to BigQuery:

<project-number>-compute@developer.gserviceaccount.com

It is called the Controller Service Account. In the Dataflow documentation you can also find specific information about Accessing BigQuery Datasets.
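
For example, assuming the default controller service account above and that the BigQuery Data Editor role covers your writes (pick a narrower role if it doesn't), you can grant it from the CLI:

# Grant the Dataflow controller service account write access to BigQuery.
# <project-id> and <project-number> are placeholders for your own project.
gcloud projects add-iam-policy-binding <project-id> \
    --member="serviceAccount:<project-number>-compute@developer.gserviceaccount.com" \
    --role="roles/bigquery.dataEditor"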

This does not explain why it worked before, or why you get a 401 instead of a 403, but I hope this helps you run your Dataflow job.