
I am wondering how Kafka partitions are shared among the SimpleConsumer instances running inside the executor processes. I know how the high-level Kafka consumer shares partitions across the different consumers in a consumer group, but how does that happen when Spark uses the SimpleConsumer? There will be multiple executors for the streaming job across machines.


1 Answer


All Spark executors should also be part of the same consumer group. Spark uses roughly the same Java API for Kafka consumers; it is just the scheduling that distributes the work across multiple machines.
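
To make this concrete, here is a minimal sketch using the spark-streaming-kafka-0-10 direct API (the newer integration; the original question concerns the older SimpleConsumer-based one, but the partition-distribution idea is the same). The broker address localhost:9092, the group id spark-streaming-group, and the topic my-topic are placeholder assumptions:

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent

object KafkaGroupDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("KafkaGroupDemo")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Every consumer created on the executors shares this configuration,
    // including group.id, so they all belong to one consumer group.
    val kafkaParams = Map[String, Object](
      "bootstrap.servers" -> "localhost:9092",           // placeholder broker
      "key.deserializer" -> classOf[StringDeserializer],
      "value.deserializer" -> classOf[StringDeserializer],
      "group.id" -> "spark-streaming-group",             // placeholder group id
      "auto.offset.reset" -> "latest",
      "enable.auto.commit" -> (false: java.lang.Boolean)
    )

    // The direct stream creates one Spark partition per Kafka partition;
    // the Spark scheduler then assigns those partitions to the available
    // executors, which is how the consumption ends up spread across machines.
    val stream = KafkaUtils.createDirectStream[String, String](
      ssc,
      PreferConsistent,
      Subscribe[String, String](Array("my-topic"), kafkaParams) // placeholder topic
    )

    stream.map(record => (record.key, record.value)).print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```

Because each Kafka partition maps to exactly one Spark partition, no extra coordination between executors is needed: whichever executor is scheduled to process a given Spark partition reads the corresponding Kafka partition directly.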