3
votes

Is there support for running Python programs using Apache Beam and the SparkRunner?

The documentation doesn't seem to have it: https://beam.apache.org/get-started/wordcount-example/#apache-spark-runner

The API reference at https://beam.apache.org/documentation/sdks/pydoc/0.6.0/apache_beam.runners.html doesn't mention SparkRunner either.

I believe Java is mentioned and supported, but is Python supported as well?

2 Answers

4
votes

There's no support for running pipelines built with Apache Beam's Python SDK on Apache Spark at the moment. However, this work is in progress as part of the Apache Beam portability framework.

Stay tuned -- this is something that should be available relatively soon!

2
votes

Support for running Apache Beam Python pipelines on Spark has since been added, and there are some instructions on how to get started here.
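For illustration, here is a minimal sketch of what submitting a Python pipeline to Spark through the portability framework can look like. It assumes a recent `apache-beam` install and a Beam Spark job server already listening on `localhost:8099`; that endpoint, the sample input, and the helper names `spark_pipeline_args` and `run` are my own placeholders, not something from the question.

```python
def spark_pipeline_args(job_endpoint="localhost:8099"):
    """Build pipeline flags for submitting to a Beam Spark job server.

    The default endpoint is an assumption: it presumes a job server
    was started separately and is reachable at localhost:8099.
    """
    return [
        "--runner=PortableRunner",         # submit via the portability framework
        f"--job_endpoint={job_endpoint}",  # address of the Spark job server
        "--environment_type=LOOPBACK",     # run Python workers in the submitting process
    ]


def run(argv=None):
    """Run a tiny word count; requires apache-beam and a running job server."""
    # Imported here so the helper above stays usable without Beam installed.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(argv or spark_pipeline_args())
    with beam.Pipeline(options=options) as pipeline:
        (
            pipeline
            | "Create" >> beam.Create(["to be or not to be"])  # placeholder input
            | "Split" >> beam.FlatMap(lambda line: line.split())
            | "Count" >> beam.combiners.Count.PerElement()
            | "Print" >> beam.Map(print)
        )
```

Calling `run()` submits the job. `LOOPBACK` is convenient for local testing only; on a real cluster you would likely need a different environment type.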