0
votes

I am using Flink 1.2.1 running on Docker, with Task Managers distributed across different VMs as part of a Docker Swarm.

Uploading an Apache Beam application using the Flink Web UI and trying to set the parallelism at job submission point doesn't work. Neither does submit a job using the Flink CLI.

It seems like the parallelism doesn't get picked up at client level, it ends up defaulting to 1.

When I set the parallelism programmatically within the Apache Beam code, it works: flinkPipelineOptions.setParallelism(4);

I suspect the root of the problem may be in the org.apache.beam.runners.flink.DefaultParallelismFactory class, as it checks for Flink's GlobalConfiguration, which may not pick up runtime values passed to Flink.

Any ideas on how this could be fixed or worked around? I need to be able to change the parallelism dynamically, so the programmatic approach won't work, nor will setting the Flink configuration at system level.

I am using the following documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/parallel.html https://beam.apache.org/documentation/sdks/javadoc/2.0.0/org/apache/beam/runners/flink/DefaultParallelismFactory.html

1

1 Answers

0
votes

This should probably be fixed in the Beam Flink Runner but as a workaround you can try setting the parallelism to -1 programatically. This should make the translation pick up the parallelism that is specified when submitting the job.