
I'm trying to get an example of Kafka and Spark Streaming working, and I run into problems when running the process.

This is the exception:

[error] Caused by: com.fasterxml.jackson.databind.JsonMappingException: Incompatible Jackson version: 2.9.8

This is the build.sbt:

name := "SparkJobs"

version := "1.0"

scalaVersion := "2.11.6"

val sparkVersion = "2.4.1"

val flinkVersion = "1.7.2"

resolvers ++= Seq(
"Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/",
"apache snapshots" at "http://repository.apache.org/snapshots/",
"confluent.io" at "http://packages.confluent.io/maven/",
"Maven central" at "http://repo1.maven.org/maven2/"
)

libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion,
"org.apache.spark" %% "spark-sql" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
"org.apache.spark" %% "spark-hive" % sparkVersion

// ,"org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion
, "org.apache.kafka" %% "kafka-streams-scala" % "2.2.0"
// , "io.confluent" % "kafka-streams-avro-serde" % "5.2.1"
)

//excludeDependencies ++= Seq(
// commons-logging is replaced by jcl-over-slf4j
//  ExclusionRule("jackson-module-scala", "jackson-module-scala")
//
//)

This is the code

Running sbt dependencyTree, I can see that spark-core_2.11-2.4.1.jar depends on jackson-databind 2.6.7.1, and that it is evicted by version 2.9.8, which suggests a version conflict between libraries. But spark-core_2.11-2.4.1.jar is not the only library involved: kafka-streams-scala_2.11:2.2.0 depends on jackson-databind 2.9.8. So I don't know which library jackson-databind 2.9.8 has to be excluded from: spark-core, kafka-streams-scala, or both?

How can I avoid jackson library version 2.9.8 in order to get this task up and running?

I am assuming that I need jackson-databind version 2.6.7 ...

UPDATE after following the advice: still not working.

I have removed the kafka-streams-scala dependency, which pulls in Jackson 2.9.8, using this build.sbt:

name := "SparkJobs"

version := "1.0"

scalaVersion := "2.11.6"

val sparkVersion = "2.4.1"

val flinkVersion = "1.7.2"

val kafkaStreamScala = "2.2.0"

resolvers ++= Seq(
"Typesafe Releases" at "http://repo.typesafe.com/typesafe/releases/",
"apache snapshots" at "http://repository.apache.org/snapshots/",
"confluent.io" at "http://packages.confluent.io/maven/",
"Maven central" at "http://repo1.maven.org/maven2/"
)


libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-core" % sparkVersion ,
"org.apache.spark" %% "spark-sql" % sparkVersion,
"org.apache.spark" %% "spark-streaming" % sparkVersion,
"org.apache.spark" %% "spark-streaming-kafka-0-10" % sparkVersion,
"org.apache.spark" %% "spark-hive" % sparkVersion

)

But I got a new exception.

UPDATE 2

Got it. Now I understand the second exception: I forgot to call awaitTermination() on the StreamingContext.
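For reference, a minimal sketch of where that call goes. It assumes a local master and uses an empty queue-backed stream as a stand-in for the Kafka stream; the object name, the `local[2]` master, and the 5-second batch interval are illustrative assumptions, not taken from the original job.

```scala
import scala.collection.mutable

import org.apache.spark.SparkConf
import org.apache.spark.rdd.RDD
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Hypothetical skeleton; names and settings are assumptions for illustration.
object AwaitTerminationSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkJobs").setMaster("local[2]")
    val ssc  = new StreamingContext(conf, Seconds(5))

    // Placeholder input: an empty queue-backed stream stands in for the Kafka stream.
    val lines = ssc.queueStream(mutable.Queue.empty[RDD[String]])

    // At least one output operation must be registered before start().
    lines.print()

    ssc.start()
    // Blocks the main thread until the context is stopped or fails;
    // without it, main() returns and the JVM shuts down mid-stream.
    ssc.awaitTermination()
  }
}
```

Without the final call, the driver's main method returns right after `start()`, which would be consistent with errors logged from listener threads during shutdown.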

You should not include kafka-streams-scala, as Kafka Streams is not compatible with the Spark Streaming API. (OneCricketeer)
Thank you @cricket_007, I removed kafka-streams-scala, but this exception occurs: ERROR Utils: throw uncaught fatal error in thread spark-listener-group-shared. I updated the question with the complete exception. (aironman)

1 Answer


Kafka Streams pulls in Jackson 2.9.8, but you don't need Kafka Streams when using Spark Streaming's Kafka integration, so you should really just remove it.

Similarly, kafka-streams-avro-serde isn't what you want to be using with Spark; you might find AbsaOSS/ABRiS useful instead.
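If some other dependency still drags in a newer Jackson after the removal, one commonly used option is to pin the Jackson artifacts in build.sbt. The fragment below is a sketch under assumptions: it uses sbt 1.x syntax, takes the jackson-databind version (2.6.7.1) from the dependencyTree output in the question, and assumes matching versions for jackson-core and jackson-module-scala, which should be checked against your own dependency tree.

```scala
// build.sbt fragment (sbt 1.x syntax assumed).
// Forces the Jackson versions that spark-core 2.4.1 expects,
// regardless of what other libraries request transitively.
dependencyOverrides ++= Seq(
  "com.fasterxml.jackson.core"   %  "jackson-core"         % "2.6.7",   // assumed version
  "com.fasterxml.jackson.core"   %  "jackson-databind"     % "2.6.7.1", // from dependencyTree
  "com.fasterxml.jackson.module" %% "jackson-module-scala" % "2.6.7.1"  // assumed version
)
```

Note that a library built against Jackson 2.9.x may not work once it is forced down to 2.6.x, so this only helps when Spark is the component that actually needs Jackson at runtime.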