
I have a pipeline stored as a template. I'm using the Node.js client to run this pipeline from a Cloud Function. Everything works fine, but when I try to run the template in a different region I get errors.

According to the documentation, I can set the region through the location parameter in the payload:

{
  projectId: 123,
  resource: {
    location: "europe-west1",
    jobName: `xxx`,
    gcsPath: 'gs://xxx'
  }
}

That gives me the following error:

The workflow could not be created, since it was sent to an invalid regional endpoint (europe-west1). 
Please resubmit to a valid Cloud Dataflow regional endpoint.

I get the same error if I move the location parameter out of the resource node, such as:

{
  projectId: 123,
  location: "europe-west1",
  resource: {
    jobName: `xxx`,
    gcsPath: 'gs://xxx'
  }
}

If I set the zone in the environment and remove the location, such as:

{
  projectId: 123,
  resource: {
    jobName: `xxx`,
    gcsPath: 'gs://xxx',
    environment: {
        zone: "europe-west1-b"
    }
   }
}

I do not get any errors anymore, but the Dataflow UI tells me the job is running in us-east1.

How can I run this template while providing the region / zone?

Does it work if you specify us-central1 using location too? Are you using a deprecated SDK or Shuffle service that do not support all regional endpoints? - Guillem Xercavins
Yes it does if I specify us-central1. I'm using the latest googleapis version - benjamin.d

2 Answers

8 votes

As explained here, there are actually two launch endpoints in the Dataflow API:

dataflow.projects.locations.templates.launch, which targets the regional endpoint given by the location parameter, and
dataflow.projects.templates.launch, which always targets the default endpoint (us-central1).

For Dataflow regional endpoints to work, the first one must be used (dataflow.projects.locations.templates.launch). This way, the location parameter in the request will be accepted. Code snippet:

// Recent versions of the googleapis package export the client this way:
const { google } = require("googleapis");

var dataflow = google.dataflow({
    version: "v1b3",
    auth: authClient // an authorized client, e.g. from google.auth.getClient()
});

var opts = {
    projectId: project,
    location: "europe-west1",         // regional endpoint for the job
    gcsPath: "gs://path/to/template", // GCS path to the staged template
    resource: {
        parameters: launchParams,
        environment: env
    }
};

// projects.locations.templates.launch honors the location parameter
dataflow.projects.locations.templates.launch(opts, (err, result) => {
    if (err) {
        throw err;
    }
    res.send(result.data);
});
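If you build these requests in several places, it can help to assemble the options object with a small helper first. This is just a sketch; the helper name buildLaunchRequest and all project/bucket/job names below are made up for illustration:

```javascript
// Hypothetical helper: build the options object for
// dataflow.projects.locations.templates.launch so that the
// regional endpoint is always part of the request.
function buildLaunchRequest(projectId, region, templatePath, jobName, parameters) {
  return {
    projectId: projectId,
    location: region,        // e.g. "europe-west1"
    gcsPath: templatePath,   // gs:// path to the staged template
    resource: {
      jobName: jobName,
      parameters: parameters
    }
  };
}

// Example request targeting europe-west1 (all names are placeholders):
const launchOpts = buildLaunchRequest(
  "my-project",
  "europe-west1",
  "gs://my-bucket/templates/wordcount",
  "wordcount-eu",
  { inputFile: "gs://my-bucket/input.txt" }
);
console.log(launchOpts.location); // "europe-west1"
```

The resulting object can then be passed straight to dataflow.projects.locations.templates.launch as in the snippet above.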

1 vote

I have been testing this through both the API Explorer and the Console using Google-provided templates. Using the wordcount example, I get the same generic error as you do with the API Explorer, which is the same error you get if the location name is incorrect. However, the Console provides more information:

Templated Dataflow jobs using Java or Python SDK version prior to 2.0 are not supported outside of the us-central1 Dataflow Regional Endpoint. The provided template uses Google Cloud Dataflow SDK for Java 1.9.1.

This is documented here, as I mentioned in my earlier comment. Running the template confirms it's using a deprecated SDK version. I would recommend going through the same process to see whether this is actually your case, too.

Choosing a different template, in my case the GCS Text to BigQuery option from the Console's drop-down menu (which uses Apache Beam SDK for Java 2.2.0) with location set to europe-west1 works fine for me (and the job actually runs in that region).

TL;DR: your request is correct in your first example, but you'll need to update the template to a newer SDK if you want to use regional endpoints.
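The same check can also be reproduced from the command line. Assuming the Cloud SDK is installed and my-bucket is replaced with a real bucket you own, a Google-provided template built with a 2.x SDK should launch on a regional endpoint like this (job name and parameter values are only illustrative):

```shell
# Run the Google-provided Word_Count template on the
# europe-west1 regional endpoint via gcloud.
gcloud dataflow jobs run wordcount-eu-test \
  --gcs-location gs://dataflow-templates/latest/Word_Count \
  --region europe-west1 \
  --parameters inputFile=gs://dataflow-samples/shakespeare/kinglear.txt,output=gs://my-bucket/wordcount/out
```

If the job appears in the Dataflow UI under europe-west1, the regional endpoint is working and the problem is isolated to the template's SDK version.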