5
votes

I might need to work with Kafka and I am absolutely new to it. I understand that there are Kafka producers which publish the logs (called events, messages, or records in Kafka) to Kafka topics.

I will need to work on reading from Kafka topics via a consumer. Do I need to set up a consumer with the Kafka consumer API first and then stream using a Spark Streaming context (PySpark), or can I use the KafkaUtils module directly to read from Kafka topics?

If I do need to set up a Kafka consumer application, how do I do that? Could you please share links to the right docs?

Thanks in advance!


2 Answers

5
votes

Spark provides built-in Kafka streaming, so you don't need to create a custom consumer. There are two approaches to connect with Kafka: 1. the receiver-based approach, and 2. the direct approach. For more detail, go through this link: http://spark.apache.org/docs/latest/streaming-kafka-integration.html
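
For illustration, here is a minimal PySpark sketch of the direct approach using the DStream-based KafkaUtils API from that link. The broker address localhost:9092 and the topic name my-topic are placeholder assumptions; substitute your own.

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    # Streaming context with 10-second batches
    sc = SparkContext(appName="KafkaDirectExample")
    ssc = StreamingContext(sc, 10)

    # Direct approach: Spark queries the Kafka brokers itself and tracks
    # offsets internally; no receiver or separate consumer app is needed.
    # "localhost:9092" and "my-topic" are placeholders.
    directStream = KafkaUtils.createDirectStream(
        ssc,
        topics=["my-topic"],
        kafkaParams={"metadata.broker.list": "localhost:9092"})

    # Each record arrives as a (key, value) pair; print just the values
    directStream.map(lambda record: record[1]).pprint()

    ssc.start()
    ssc.awaitTermination()

(You would typically need to submit this with the spark-streaming-kafka integration package matching your Spark version, e.g. via spark-submit --packages, as described in the linked docs.)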

1
votes

There's no need to set up a Kafka consumer application; Spark itself creates the consumer, with two approaches. One is the receiver-based approach, which uses the KafkaUtils.createStream method, and the other is the direct approach, which uses the KafkaUtils.createDirectStream method. With the direct approach, a failure in Spark Streaming causes no data loss, since Spark tracks the offsets itself and restarts from where it left off; the receiver-based approach needs the write-ahead log enabled to get the same guarantee.
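
As an illustration of the receiver-based approach, here is a minimal PySpark sketch using KafkaUtils.createStream. The ZooKeeper address localhost:2181, the group id my-consumer-group, and the topic my-topic are placeholder assumptions.

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="KafkaReceiverExample")
    ssc = StreamingContext(sc, 10)

    # Receiver-based approach: a receiver running in an executor consumes
    # from Kafka via ZooKeeper using the high-level consumer API.
    # "localhost:2181", "my-consumer-group", and "my-topic" are placeholders;
    # the dict value is the number of consumer threads for the topic.
    receiverStream = KafkaUtils.createStream(
        ssc,
        zkQuorum="localhost:2181",
        groupId="my-consumer-group",
        topics={"my-topic": 1})

    # Each record is a (key, value) pair; print just the values
    receiverStream.map(lambda record: record[1]).pprint()

    ssc.start()
    ssc.awaitTermination()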

For more details, see this link: http://spark.apache.org/docs/latest/streaming-kafka-integration.html