I have CDH 5.7.0 with Spark 1.6.0 and Kafka 0.9.0, and I need to run a Spark Streaming job that consumes messages from a Kafka broker in another cluster running Kafka 0.8.2.2. I create the stream like this:
val stream = KafkaUtils.createStream(ssc, Utils.settings.zookeeperQuorum, Utils.settings.kafkaGroup, Utils.settings.topicMapWifi)
In the build.sbt I'm adding:
libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka" % "1.2.0"
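For context, a hedged sketch of what the dependency block could look like if the Kafka integration artifact were aligned with the Spark 1.6.0 runtime (assumption on my part: the `spark-streaming-kafka` version normally matches the Spark version, and the 1.6.0 artifact still ships a 0.8.2.x Kafka client, which can talk to a 0.8.2.2 broker):

```scala
// build.sbt — hypothetical sketch, not the exact file from the question.
// spark-streaming itself is marked "provided" because the CDH cluster
// supplies it at runtime; the Kafka integration jar is shipped with the app.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming"       % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-streaming-kafka" % "1.6.0"
)
```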
With that library I would be using a client compatible with 0.8.2.x brokers. The problem is that Spark loads a ton of stuff from the CDH classpath in:
/opt/cloudera/parcels/CDH-5.7.0-1.cdh5.7.0.p0.45/lib/spark/bin/spark-class
and that adds a newer version of the Kafka client than the one I need. Is there a way to override specific libraries from code?
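For reference, Spark (since 1.3, marked experimental) exposes configuration flags that make user-supplied jars take precedence over the cluster classpath. A minimal sketch of setting them from code follows; the app name and batch interval are placeholders, and whether these flags actually resolve CDH parcel conflicts is not guaranteed:

```scala
// Sketch: ask Spark to prefer the jars shipped with the application over
// the cluster classpath. These flags are experimental and must be set
// before the SparkContext/StreamingContext is created.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("kafka-0.8-consumer")                    // placeholder name
  .set("spark.driver.userClassPathFirst", "true")
  .set("spark.executor.userClassPathFirst", "true")

val ssc = new StreamingContext(conf, Seconds(10))      // placeholder interval
```

The same flags can also be passed at submit time with `--conf`, which avoids hard-coding them in the application.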