1 vote

As far as I know, to run an Apache Beam pipeline as a Google Dataflow job, I first have to set an environment variable pointing to my JSON credential file:

set GOOGLE_APPLICATION_CREDENTIALS=/path/to/jsonfile.json

I want to automate this, and I think I would have to run a shell script from my Java Beam application first. Is there a better approach to do this within my Beam Java class?


2 Answers

2 votes

Yes, there is a way to load the JSON credential file directly from your Java application.

Please refer to the code snippet below, which creates the Pipeline object with the Google credentials loaded programmatically:

    // imports needed for this snippet
    import java.io.FileInputStream;
    import java.util.HashSet;
    import java.util.Set;

    import com.google.api.services.dataflow.DataflowScopes;
    import com.google.auth.oauth2.GoogleCredentials;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.extensions.gcp.options.GcpOptions;
    import org.apache.beam.sdk.options.PipelineOptions;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;

    // create a scope list with Dataflow's scopes
    Set<String> scopeList = new HashSet<>();
    scopeList.addAll(DataflowScopes.all());

    // create a GoogleCredentials object from the JSON credential file and the
    // scope collection prepared above (fromStream throws IOException, so
    // declare or handle it)
    GoogleCredentials credential = GoogleCredentials
            .fromStream(new FileInputStream("path-to-credential-json-file"))
            .createScoped(scopeList);

    // create default pipeline options
    PipelineOptions options = PipelineOptionsFactory.create();

    // assign the credential
    options.as(GcpOptions.class).setGcpCredential(credential);

    Pipeline pipeline = Pipeline.create(options);

This approach frees you from depending on the GOOGLE_APPLICATION_CREDENTIALS environment variable.

It has worked in my environment; please let me know if you hit any issues with it.
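
For completeness, here is a minimal sketch of submitting such a pipeline to Dataflow with the programmatically loaded credentials. It assumes the beam-runners-google-cloud-dataflow-java dependency; the project id, region, and bucket below are hypothetical placeholders:

    import org.apache.beam.runners.dataflow.DataflowRunner;
    import org.apache.beam.runners.dataflow.options.DataflowPipelineOptions;

    DataflowPipelineOptions dataflowOptions = PipelineOptionsFactory
            .create()
            .as(DataflowPipelineOptions.class);
    dataflowOptions.setGcpCredential(credential);           // credential built above
    dataflowOptions.setProject("my-gcp-project");           // hypothetical project id
    dataflowOptions.setRegion("us-central1");
    dataflowOptions.setTempLocation("gs://my-bucket/tmp");  // hypothetical bucket
    dataflowOptions.setRunner(DataflowRunner.class);

    Pipeline dataflowPipeline = Pipeline.create(dataflowOptions);
    // ... apply transforms ...
    dataflowPipeline.run();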

-1 votes

As far as I know, you cannot easily modify the environment variables of the running program; that is, you cannot set GOOGLE_APPLICATION_CREDENTIALS from the main program that starts the pipeline. Setting it in a wrapper script is the best option here.
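
To see why, a quick sketch: the map returned by System.getenv() is unmodifiable, so any attempt to change it at runtime fails:

    import java.util.Map;

    Map<String, String> env = System.getenv();
    // throws UnsupportedOperationException: the returned map is read-only
    env.put("GOOGLE_APPLICATION_CREDENTIALS", "/path/to/jsonfile.json");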

Alternatives are hacks similar to https://blog.sebastian-daschner.com/entries/changing_env_java; I do not recommend using them.
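
That said, if you want to avoid an external script entirely, you can set the variable for a child process (though not for your own process) by launching the pipeline JVM through ProcessBuilder. A minimal sketch, with a hypothetical jar name:

    import java.io.IOException;

    ProcessBuilder pb = new ProcessBuilder("java", "-jar", "beam-pipeline.jar"); // hypothetical jar
    // unlike System.getenv(), the child's environment map is modifiable
    pb.environment().put("GOOGLE_APPLICATION_CREDENTIALS", "/path/to/jsonfile.json");
    pb.inheritIO();               // forward the child's stdout/stderr
    Process process = pb.start(); // throws IOException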