My test pipeline is really simple. It attempts to read from a topic created via the console.

public static void main(String[] args) throws IOException {

    Options options = PipelineOptionsFactory.fromArgs(args)
            .withValidation()
            .as(Options.class);

    options.setStreaming(true);

    Pipeline pipeline = Pipeline.create(options);

    PCollection<String> input = pipeline
            .apply(PubsubIO.Read.topic(options.getPubsubTopic()))
            .apply(ParDo.of(new ExtractEvents()));

    pipeline.run();

}

When I attempt to execute this pipeline I get the following error:

Workflow failed. Causes: (de5f777e2e08c1d9): Step setup_resource_additionaltopic.subscription-375367840492394866711: Set up of resource additionaltopic.subscription-3753678404923948667 failed

The Dataflow console also reports an internal error:

[Screenshot: Dataflow console error]

I can't find anything about this error in the documentation, and my trial-and-error attempts at resolving it haven't been successful.

Solution

To run a Dataflow job, a project must enable the following Google Cloud Platform APIs:

  • Google Cloud Dataflow API
  • Compute Engine API (Google Compute Engine)
  • Google Cloud Logging API
  • Google Cloud Storage
  • Google Cloud Storage JSON API
  • BigQuery API
  • Google Cloud Pub/Sub
  • Google Cloud Datastore API

You can use the Google Cloud Platform Console to enable all the required APIs at once.
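As a rough illustration of the checklist above, the following sketch compares a set of enabled APIs against the required list and reports what is missing. The class name RequiredApiCheck and the way the "enabled" set is filled in are assumptions made for this example; in practice you would copy the enabled names from the console by hand, since this code does not query Google Cloud.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of the Dataflow SDK.
final class RequiredApiCheck {

    // Display names copied from the list of required APIs above.
    static final List<String> REQUIRED = Arrays.asList(
            "Google Cloud Dataflow API",
            "Compute Engine API",
            "Google Cloud Logging API",
            "Google Cloud Storage",
            "Google Cloud Storage JSON API",
            "BigQuery API",
            "Google Cloud Pub/Sub",
            "Google Cloud Datastore API");

    // Returns the required APIs that are absent from the enabled set,
    // in the same order as the REQUIRED list.
    static List<String> missing(Set<String> enabled) {
        List<String> result = new ArrayList<>();
        for (String api : REQUIRED) {
            if (!enabled.contains(api)) {
                result.add(api);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed state for illustration: everything is enabled except Pub/Sub.
        Set<String> enabled = new HashSet<>(REQUIRED);
        enabled.remove("Google Cloud Pub/Sub");

        System.out.println("Missing: " + missing(enabled));
    }
}
```

With the assumed state in main, the only API reported as missing is Google Cloud Pub/Sub, which matches the situation described in the question.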

1 Answer

It looks like your project doesn't have the Pub/Sub API enabled. Have you gone through the instructions in the getting started guide, especially the part on APIs? There is a link there that should enable everything you need to get going.

You may also need to verify that the APIs and permissions are set up properly so that the project the Dataflow job runs in can access the Pub/Sub topic you are trying to subscribe to.
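One quick way to see whether this cross-project case applies is to compare the project in the topic name with the project the job runs in. The sketch below assumes the topic is given in the fully qualified form projects/<project>/topics/<topic>. The class name TopicProjectCheck and the project IDs in main are hypothetical and only serve the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: not part of the Dataflow SDK.
final class TopicProjectCheck {

    // Fully qualified Pub/Sub topic names look like projects/<project>/topics/<topic>.
    private static final Pattern TOPIC =
            Pattern.compile("^projects/([^/]+)/topics/([^/]+)$");

    // Extracts the owning project from a fully qualified topic path.
    static String projectOf(String topicPath) {
        Matcher m = TOPIC.matcher(topicPath);
        if (!m.matches()) {
            throw new IllegalArgumentException(
                    "Not a fully qualified topic path: " + topicPath);
        }
        return m.group(1);
    }

    // True when the topic belongs to a different project than the job.
    static boolean isCrossProject(String jobProject, String topicPath) {
        return !jobProject.equals(projectOf(topicPath));
    }

    public static void main(String[] args) {
        // Placeholder values; substitute your own project ID and topic path.
        String jobProject = "my-dataflow-project";
        String topic = "projects/my-pubsub-project/topics/additionaltopic";

        if (isCrossProject(jobProject, topic)) {
            System.out.println("Topic lives in project '" + projectOf(topic)
                    + "'; check that the job's project has access to it.");
        } else {
            System.out.println("Topic and job are in the same project.");
        }
    }
}
```

If the check reports a cross-project topic, the permissions on the topic's project are the first thing to review. If both projects match, the missing API described above is the more likely cause.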