
I am new to Apache Beam. So far, my understanding is that Apache Beam is essentially a tool for ETL processing, and a runner can be thought of as a collection of CPU, memory, and storage.

My question is: can I use two or more types of runners in a single Beam Python pipeline?

For example, one runner for Dataflow, another for Spark, and a third for the DirectRunner?


1 Answer


You can take your Beam pipeline and submit it to run on different runners.

You cannot make different runners work together (e.g. a pipeline that runs partially on Dataflow and partially on Spark).

Instead, you can write a single pipeline that runs fully on Dataflow in one execution and fully on Spark in another, simply by changing the runner you pass when you launch it.
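As a minimal sketch (assuming the Beam Python SDK; the runner names and options shown are illustrative, and Dataflow or Spark would each need additional options such as project, region, or a job server endpoint), the runner is chosen through pipeline options at launch time while the pipeline code itself stays identical:

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Pick one runner per execution, e.g. "DirectRunner", "DataflowRunner",
# or "SparkRunner". You cannot mix them within a single pipeline run.
options = PipelineOptions(runner="DirectRunner")

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "Create" >> beam.Create(["hello", "beam"])
        | "Uppercase" >> beam.Map(str.upper)
        | "Print" >> beam.Map(print)
    )
```

To target a different runner, you would rerun the same script with different options rather than changing the pipeline code.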

LMK if I should clarify further.