I am getting the error below while running an Apache Beam Java SDK pipeline with the Dataflow runner.

java.lang.IllegalArgumentException: Class interface org.apache.beam.sdk.options.PipelineOptions missing a property named 'output'.
    at org.apache.beam.sdk.options.PipelineOptionsFactory.parseObjects(PipelineOptionsFactory.java:1488)
    at org.apache.beam.sdk.options.PipelineOptionsFactory.access$400(PipelineOptionsFactory.java:110)
    at org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.as(PipelineOptionsFactory.java:297)
    at org.apache.beam.sdk.options.PipelineOptionsFactory$Builder.create(PipelineOptionsFactory.java:276)
    at com.aail.beam.StarterPipeline.main(StarterPipeline.java:51)

I installed the Google Cloud Platform plugin in Eclipse and created a Google Cloud Dataflow Java project directly in Eclipse. I am trying to execute the sample StarterPipeline job using the Dataflow pipeline runner, passing the following argument:

--output=gs://wordcountt-storage-bucket/output/ 

I tried to resolve the error by changing the argument from --output=gs://wordcountt-storage-bucket/output/ to output=gs://wordcountt-storage-bucket/output/, but that did not work. I also tried running from the command line instead of directly in Eclipse, as follows:

mvn compile exec:java \
  -Dexec.mainClass=com.aail.beam.StarterPipeline \
  -Dexec.args="--project=sample-wordcount-beam \
  --stagingLocation=gs://wordcountt-storage-bucket/staging/ \
  --output=gs://wordcountt-storage-bucket/output \
  --runner=DataflowRunner"

which in turn throws the same error.

Reference: Dataflow job failing with output property missing error

I tried following the steps in the referenced question, but that did not work either. Can anyone help me resolve the missing 'output' property error?

Were you following some documentation that seems to be incorrect and should be updated? - Lukasz Cwik
Yes, this is the guide I am following: cloud.google.com/dataflow/docs/quickstarts/… At step 10, when I specify the arguments, I get the error mentioned above. - bunny sunny
Which version of the SDK are you using? The tool says it's only compatible with SDK versions 2.0.0 to 2.5.0. If you're using a version > 2.5.0, I would suggest that you follow the instructions at beam.apache.org/get-started/quickstart-java - Lukasz Cwik
@LukaszCwik, my SDK version is 2.5.0. - bunny sunny

2 Answers

0
votes

It looks like you're supplying the arguments for the WordCount example but running the StarterPipeline. The error message is pointing out that the output argument can't be specified, because the options interface the pipeline uses (PipelineOptions) has no property with that name. Try running without it:

mvn compile exec:java \
  -Dexec.mainClass=com.aail.beam.StarterPipeline \
  -Dexec.args="--project=sample-wordcount-beam \
  --stagingLocation=gs://wordcountt-storage-bucket/staging/ \
  --runner=DataflowRunner"

Please take a look at the getting started guide if you want to run WordCount.
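
To illustrate why the argument is rejected, here is a simplified, standard-library-only sketch of the kind of check involved: each --name=value argument is matched against the properties implied by the getters on the options interface. This is not Beam's actual implementation. OptionsSketch, BaseOptions, OutputOptions, propertyNames and check are hypothetical names made up for this example.

```java
import java.lang.reflect.Method;
import java.util.Set;
import java.util.TreeSet;

public class OptionsSketch {

    // Hypothetical stand-in for an options interface that has no 'output' property.
    public interface BaseOptions {
        String getProject();
        void setProject(String value);
    }

    // Hypothetical stand-in for a WordCount-style interface that adds 'output'.
    public interface OutputOptions extends BaseOptions {
        String getOutput();
        void setOutput(String value);
    }

    // Derive property names from the getters on the interface and its parents.
    public static Set<String> propertyNames(Class<?> iface) {
        Set<String> names = new TreeSet<>();
        for (Method m : iface.getMethods()) {
            String n = m.getName();
            if (n.startsWith("get") && n.length() > 3 && m.getParameterCount() == 0) {
                names.add(Character.toLowerCase(n.charAt(3)) + n.substring(4));
            }
        }
        return names;
    }

    // Reject any --name=value argument whose name is not a known property.
    public static void check(Class<?> iface, String... args) {
        Set<String> known = propertyNames(iface);
        for (String arg : args) {
            if (!arg.startsWith("--")) {
                throw new IllegalArgumentException(
                    "Argument '" + arg + "' does not start with '--'");
            }
            int eq = arg.indexOf('=');
            String name = eq < 0 ? arg.substring(2) : arg.substring(2, eq);
            if (!known.contains(name)) {
                throw new IllegalArgumentException(
                    "Class " + iface + " missing a property named '" + name + "'.");
            }
        }
    }

    public static void main(String[] args) {
        String out = "--output=gs://wordcountt-storage-bucket/output";
        check(OutputOptions.class, out); // accepted: getOutput() is declared
        try {
            check(BaseOptions.class, out); // rejected: no 'output' property
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In Beam itself, the corresponding fix (if you do want an --output argument) is to declare your own interface that extends PipelineOptions with getOutput()/setOutput(String) and pass that interface to PipelineOptionsFactory.fromArgs(args).as(...), which is what the WordCount example does with its own options interface.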

0
votes

You might be missing the dependency for the Dataflow runner. Check that your pom.xml contains the following (with the version matching your Beam SDK):

<dependency>
  <groupId>org.apache.beam</groupId>
  <artifactId>beam-runners-google-cloud-dataflow-java</artifactId>
  <version>2.25.0</version>
  <scope>runtime</scope>
</dependency>

See also: https://beam.apache.org/documentation/runners/dataflow/