I am unable to authenticate my Dataflow Beam application when I run it in IntelliJ IDEA. This worked for me at one point recently, and now it doesn't.

Auth is failing with 403 Forbidden: "Access Denied: Project [myProject]: User does not have bigquery.jobs.create permission in project [myProject]."

  1. I have verified that I DO have this permission for both my GCP user and my service account.
  2. I have set GOOGLE_APPLICATION_CREDENTIALS to the path of a service account JSON file in my macOS zsh profile.
  3. This same profile configuration works when I run a different app, built on the Node.js client library, in VS Code using the same service account token.
  4. This same Java Dataflow pipeline authenticates when I run/debug it in the Eclipse IDE.
  5. Running mvn package from the terminal on the same pipeline also authenticates and writes the template to my GCS bucket.
  6. I have added additional service accounts to my gcloud configuration with 'gcloud auth activate-service-account' and can see them listed with 'gcloud auth list'.
  7. I have tried setting the active account to both service accounts.
  8. I have tried setting the --serviceAccount Beam option to a service account I know has the correct permissions.
  9. I would like to try passing the service account token path to the BigQueryIO Java Dataflow connector, as I am able to do with the Node.js client library, but it doesn't seem possible?
  10. The debugger does work and I can hit a breakpoint.
  11. I have tried installing both the latest version and Version: 2020.1.1 Build: 201.7223.91 (29 April 2020).
  12. I have tried uninstalling and reinstalling IntelliJ and creating a new project.

It appears as though this security context is not getting passed to the Dataflow Java Beam library, but the exception output does say "Inferred default GCP project 'fubotv-prod' from gcloud.", so apparently some args are getting through.

Perhaps there is some cached response or build state?

I spent all day stuck on this. I am at my wits' end. I would really like to debug my Dataflow pipeline again with IntelliJ. Any solutions, ideas, or random words of encouragement are much appreciated!

Are you using the same version of the libraries in every try? I mean, are you using the exact same configuration and library versions when running the pipeline through Maven, Eclipse and IntelliJ? - rmesteves
It's the same project built from the same main with the same pom. But the way that IntelliJ builds/runs the application seems different. I'm really not even sure how it works. Perhaps the key is in the run configuration? I don't fully understand what IntelliJ does to build and run a Java project... - user3205931
Related to this, I should point out that running mvn package does not actually run the application. It simply builds the template and related jars that are later run on the Dataflow platform as a job. But I know this still requires that Google credentials are made available. They are different permissions (GCS write vs. BQ job creation), but they come from the same service account. My Node.js client lib app still needs the same BQ job create permission though. - user3205931
Can you take a look at the logs to see if the user who is trying to access BigQuery is the correct one? (cloud.google.com/logging/docs/audit#viewing_audit_logs) - rmesteves
Another question: have you tried to set the service account from your code directly? If not, have you tried reading the GOOGLE_APPLICATION_CREDENTIALS variable from your code and printing it to see if it's correctly populated? - rmesteves

1 Answer

As discussed in the comments, one way to debug what is happening is to print the environment variables from your code when it runs in IntelliJ, to see whether they are set correctly.
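A minimal sketch of that check, using only the Java standard library. The class name CredentialsEnvCheck and the helper describe are hypothetical names chosen for illustration, not part of Beam or any Google library; the only thing taken from the question is the GOOGLE_APPLICATION_CREDENTIALS variable name. It reports whether the variable is visible to the running JVM and whether it points to a readable file:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class for illustration; not part of Beam or the Google client libraries.
final class CredentialsEnvCheck {
    static final String VAR = "GOOGLE_APPLICATION_CREDENTIALS";

    // Describes what this JVM process sees for the given variable value.
    static String describe(String value) {
        if (value == null || value.isEmpty()) {
            return VAR + " is not set in this process";
        }
        Path path = Paths.get(value);
        if (!Files.isReadable(path)) {
            return VAR + " points to a missing or unreadable file: " + value;
        }
        return VAR + " points to a readable file: " + value;
    }

    public static void main(String[] args) {
        // System.getenv returns null when the variable is absent from the process environment.
        System.out.println(describe(System.getenv(VAR)));
    }
}
```

Running this from the same IntelliJ run configuration as the pipeline shows what the IDE-launched process actually sees. If it reports the variable as not set while the terminal has it, one option is to add the variable in the Environment variables field of the Run/Debug configuration.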

As concluded in the discussion, IntelliJ runs with its own shell context, so variables set in your OS shell profile are not visible to it until you explicitly restart the application.