0
votes

I'm trying to run the Apache Beam Python word count example on Apache Flink using PortableRunner, with an SDK harness and job server running via Docker.

  1. Built the SDK harness container using ./gradlew -p sdks/python/container docker. But running docker pull on the resulting image gives the error below:

    Using default tag: latest
    Error response from daemon: Get https://$userId-docker-apache.bintray.io/v2/: x509: certificate is valid for *.bintray.io, bintray.io, not $userId-docker-apache.bintray.io

  2. Successfully started the Flink portable job service endpoint using ./gradlew beam-runners-flink_2.11-job-server:runShadow.
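
Before submitting a pipeline, it can help to confirm that something is actually listening on the job endpoint. The sketch below is a minimal check using only the Python standard library. The helper name `is_endpoint_reachable` is hypothetical, and `localhost:8099` is taken from the `--job_endpoint` value used in this post. It only tests that a TCP connection can be opened, not that the job service behind the port is healthy.

```python
import socket


def is_endpoint_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; it does not speak gRPC and
    cannot tell whether the process behind the port is a Beam job service.
    """
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # localhost:8099 is the job endpoint used in the wordcount command.
    print(is_endpoint_reachable("localhost", 8099))
```

If this prints False, the job server is probably not running or is bound to a different port, and the submission would fail before any file is written.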

But when I try to run the wordcount example with PortableRunner using the command below,

python -m apache_beam.examples.wordcount --input=local_input_file --output=local_output_file --job_endpoint=localhost:8099 --experiments beam_fn_api --runner=PortableRunner

it gives the following error:

RuntimeError: IOError: [Errno 2] No such file or directory: '/beam-temp-output-b6d55cb671ef11e9be2f025000000001/3ce015aa-78ee-4bfa-be17-120de259e690.output' [while running 'write/Write/WriteImpl/FinalizeWrite']
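
One detail worth noting in this message is where the missing file lives. The sketch below pulls the path out of the error text and shows that the temporary directory sits directly under the filesystem root. The helper name `missing_path` is hypothetical. Reading this as a sign that the write happened in a different filesystem from the one the job was submitted from (for example, inside the SDK harness container) is an assumption, not something the error itself confirms.

```python
import os
import posixpath
import re

# Error text copied from the failed PortableRunner run in this post.
ERROR = (
    "RuntimeError: IOError: [Errno 2] No such file or directory: "
    "'/beam-temp-output-b6d55cb671ef11e9be2f025000000001/"
    "3ce015aa-78ee-4bfa-be17-120de259e690.output' "
    "[while running 'write/Write/WriteImpl/FinalizeWrite']"
)


def missing_path(message):
    """Extract the quoted path that follows 'No such file or directory:'."""
    match = re.search(r"No such file or directory: '([^']+)'", message)
    return match.group(1) if match else None


path = missing_path(ERROR)
temp_dir = posixpath.dirname(path)    # the beam-temp-output-... directory
parent = posixpath.dirname(temp_dir)  # where that directory was created

print("temp dir:", temp_dir)
print("parent:", parent)
# Assumption: if this is False on the submitting machine, the directory
# was likely created somewhere else, such as inside a container.
print("exists locally:", os.path.exists(temp_dir))
```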

Running with DirectRunner instead of PortableRunner works fine, though. Any hint about how I can get the wordcount example to work with PortableRunner via Docker is appreciated.

2 Answers

0
votes

Did you try specifying a repository name (using -Pdocker-repository-root) and pulling from there?

Something like this:

./gradlew -Pdocker-repository-root=gcr.io/SOME_NAME_HERE -p sdks/go/container docker
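
The command above targets sdks/go/container, while the question builds sdks/python/container. As a rough sketch, the two variants differ only in the SDK directory. The helper `container_build_command` below is hypothetical and just assembles the same Gradle arguments shown in this thread; `gcr.io/SOME_NAME_HERE` remains a placeholder.

```python
def container_build_command(repository_root, sdk="python"):
    """Assemble the Gradle invocation from this thread for a given SDK.

    Hypothetical helper: it only builds the argument list and does not
    run Gradle or verify that the SDK directory exists.
    """
    return [
        "./gradlew",
        "-Pdocker-repository-root=" + repository_root,
        "-p",
        "sdks/{}/container".format(sdk),
        "docker",
    ]


# Placeholder repository root, as in the answer above.
print(" ".join(container_build_command("gcr.io/SOME_NAME_HERE")))
print(" ".join(container_build_command("gcr.io/SOME_NAME_HERE", sdk="go")))
```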

0
votes

I just encountered the same issue. Try this:

./gradlew docker

Relevant documentation here