0
votes

I would execute my Wordcount project on DataFlow Runner, i use command from beam documentation, but I got this error :

C:\BIGDATA_FORMATION\TP_WC_GCP\word-count-beam>mvn compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args="--runner=DataflowRunner --gcpTempLocation=gs://bucket_wc/tmp/ --project=WordCount --inputFile=gs://bucket_wc/test_file.txt --output=gs://bucket_wc/counts" -Pdataflow-runner
[INFO] Scanning for projects...
[INFO]
[INFO] --------------------< org.example:word-count-beam >---------------------
[INFO] Building word-count-beam 0.1
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ word-count-beam ---
[WARNING] Using platform encoding (Cp1252 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] skip non existing resourceDirectory C:\BIGDATA_FORMATION\TP_WC_GCP\word-count-beam\src\main\resources
[INFO]
[INFO] --- maven-compiler-plugin:3.7.0:compile (default-compile) @ word-count-beam ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- exec-maven-plugin:1.6.0:java (default-cli) @ word-count-beam ---
sept. 01, 2020 5:16:16 PM org.apache.beam.runners.dataflow.options.DataflowPipelineOptions$StagingLocationFactory create
INFOS: No stagingLocation provided, falling back to gcpTempLocation
[WARNING]
java.lang.RuntimeException: Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod (InstanceBuilder.java:224)
    at org.apache.beam.sdk.util.InstanceBuilder.build (InstanceBuilder.java:155)
    at org.apache.beam.sdk.PipelineRunner.fromOptions (PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create (Pipeline.java:147)
    at org.apache.beam.examples.WordCount.runWordCount (WordCount.java:176)
    at org.apache.beam.examples.WordCount.main (WordCount.java:192)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod (InstanceBuilder.java:214)
    at org.apache.beam.sdk.util.InstanceBuilder.build (InstanceBuilder.java:155)
    at org.apache.beam.sdk.PipelineRunner.fromOptions (PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create (Pipeline.java:147)
    at org.apache.beam.examples.WordCount.runWordCount (WordCount.java:176)
    at org.apache.beam.examples.WordCount.main (WordCount.java:192)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:745)
Caused by: java.lang.IllegalArgumentException: No files to stage has been found.
    at org.apache.beam.runners.dataflow.DataflowRunner.fromOptions (DataflowRunner.java:281)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.apache.beam.sdk.util.InstanceBuilder.buildFromMethod (InstanceBuilder.java:214)
    at org.apache.beam.sdk.util.InstanceBuilder.build (InstanceBuilder.java:155)
    at org.apache.beam.sdk.PipelineRunner.fromOptions (PipelineRunner.java:55)
    at org.apache.beam.sdk.Pipeline.create (Pipeline.java:147)
    at org.apache.beam.examples.WordCount.runWordCount (WordCount.java:176)
    at org.apache.beam.examples.WordCount.main (WordCount.java:192)
    at sun.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:498)
    at org.codehaus.mojo.exec.ExecJavaMojo$1.run (ExecJavaMojo.java:282)
    at java.lang.Thread.run (Thread.java:745)
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  7.202 s
[INFO] Finished at: 2020-09-01T17:16:17+01:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.6.0:java (default-cli) on project word-count-beam: An exception occured while executing the Java class. Failed to construct instance from factory method DataflowRunner#fromOptions(interface org.apache.beam.sdk.options.PipelineOptions): InvocationTargetException: No files to stage has been found. -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

I can't solve this, i followed all steps before running the command but i couldn't run it at the end. ############################################################################################### ############################################################################################

1
Have you tried manually specifying stagingLocation? - robertwb
Also, can you run it with the direct runner? - robertwb

1 Answers

0
votes

Try to reproduce the Wordcount project in a new or an empty folder by following w the documentation from Dataflow.

First you have to create a Maven project containing the Apache Beam SDK's WordCount by running the mvn archetype:generate command in your GCP shell.

mvn archetype:generate \
      -DarchetypeGroupId=org.apache.beam \
      -DarchetypeArtifactId=beam-sdks-java-maven-archetypes-examples \
      -DarchetypeVersion=2.23.0 \
      -DgroupId=org.example \
      -DartifactId=word-count-beam \
      -Dversion="0.1" \
      -Dpackage=org.apache.beam.examples \
      -DinteractiveMode=false

After the command run, go to the new directory called word-count-beam that already contains the pom.xml file. Finally build and run WordCount on the Dataflow service with the following command:

mvn -Pdataflow-runner compile exec:java -Dexec.mainClass=org.apache.beam.examples.WordCount -Dexec.args="--project=<PROJECT_ID> \
      --gcpTempLocation=gs://<BUCKET_NAME>/tmp/ \
      --stagingLocation=gs://<BUCKET_NAME>/staging/ \
      --output=gs://<BUCKET_NAME>/output \
      --runner=DataflowRunner \
      --inputFile=gs://<BUCKET_NAME>/test.txt \
      --region=us-west1"

You can verify the results in dataflow and bucket storage