
I am trying to submit a Spark Streaming + Kafka job that just reads lines of strings from a Kafka topic. However, I am getting the following exception:

    15/07/24 22:39:45 ERROR TaskSetManager: Task 0 in stage 2.0 failed 4 times; aborting job
    Exception in thread "Thread-49" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 4 times, most recent failure: Lost task 0.3 in stage 2.0 (TID 73, 10.11.112.93): java.lang.NoSuchMethodException: kafka.serializer.StringDecoder.<init>(kafka.utils.VerifiableProperties)
        java.lang.Class.getConstructor0(Class.java:2892)
        java.lang.Class.getConstructor(Class.java:1723)
        org.apache.spark.streaming.kafka.KafkaReceiver.onStart(KafkaInputDStream.scala:106)
        org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:121)
        org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:106)
        org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:264)
        org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverLauncher$$anonfun$9.apply(ReceiverTracker.scala:257)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1121)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)
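From the trace, KafkaReceiver.onStart is reflectively looking up a single-argument constructor on the decoder class. Roughly, it is doing the equivalent of the following sketch (my paraphrase based on the stack trace, not the actual Spark source):

    import kafka.serializer.StringDecoder
    import kafka.utils.VerifiableProperties

    // Throws java.lang.NoSuchMethodException if the StringDecoder class that is
    // loaded at runtime has no one-argument (VerifiableProperties) constructor.
    val ctor = classOf[StringDecoder].getConstructor(classOf[VerifiableProperties])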

When I checked the Spark jar files used by DSE, I saw that it uses kafka_2.10-0.8.0.jar, which does have that constructor, so I am not sure what is causing the error. Here is my consumer code:

    import org.apache.spark.SparkContext
    import org.apache.spark.streaming.StreamingContext
    import org.apache.spark.streaming.kafka.KafkaUtils

    val sc = new SparkContext(sparkConf)
    val streamingContext = new StreamingContext(sc, SLIDE_INTERVAL)

    // numThreads consumer threads per topic
    val topicMap = kafkaTopics.split(",").map((_, numThreads.toInt)).toMap
    val accessLogsStream = KafkaUtils.createStream(streamingContext, zooKeeper, "AccessLogsKafkaAnalyzer", topicMap)

    val accessLogs = accessLogsStream.map(_._2).map(log => ApacheAccessLog.parseLogLine(log)).cache()
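For reference, the 1.x spark-streaming-kafka API also has an explicitly typed createStream overload that takes the key/value types and decoder classes as type parameters (the simpler overload above defaults to StringDecoder for both). A minimal sketch of the same stream using it; the kafkaParams keys are standard Kafka consumer properties and the storage level is just an example:

    import kafka.serializer.StringDecoder
    import org.apache.spark.storage.StorageLevel

    val kafkaParams = Map(
      "zookeeper.connect" -> zooKeeper,
      "group.id" -> "AccessLogsKafkaAnalyzer")

    val typedStream = KafkaUtils.createStream[String, String, StringDecoder, StringDecoder](
      streamingContext, kafkaParams, topicMap, StorageLevel.MEMORY_AND_DISK_SER_2)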

UPDATE: This exception seems to happen only when I submit the job. If I run the same code by pasting it into the Spark shell, it works fine, which makes me suspect a classpath difference between spark-submit and the shell.


1 Answer


I was facing the same issue with my custom decoder. I added the following constructor, which resolved the issue:

    public YourDecoder(VerifiableProperties verifiableProperties)
    {
        // Intentionally empty: Spark's KafkaReceiver only needs this
        // single-argument constructor to exist so it can instantiate the
        // decoder reflectively.
    }
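For context, a complete minimal decoder looks something like the sketch below, written in Scala to match the question's code and mirroring how kafka 0.8's own StringDecoder is declared (the class name AccessLogDecoder is just a placeholder):

    import kafka.serializer.Decoder
    import kafka.utils.VerifiableProperties

    // Placeholder name; modeled on kafka.serializer.StringDecoder.
    class AccessLogDecoder(props: VerifiableProperties = null) extends Decoder[String] {
      // The (VerifiableProperties) constructor above is the one Spark's
      // KafkaReceiver looks up and invokes reflectively.
      override def fromBytes(bytes: Array[Byte]): String = new String(bytes, "UTF-8")
    }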