I am trying to run the Spark Streaming example DirectKafkaWordCount.scala.
To create the jar I am using this "build.sbt" with the sbt-assembly plugin:
name := "Kafka Direct"
version := "1.0"
scalaVersion := "2.11.6"
libraryDependencies ++= Seq ("org.apache.spark" % "spark-streaming-kafka-0-8_2.11" % "2.1.0",
"org.apache.spark" % "spark-streaming_2.11" % "2.1.0",
"org.apache.spark" % "spark-core_2.11" % "2.1.0" exclude("com.esotericsoftware.minlog", "minlog")
)
resolvers ++= Seq(
"Maven Central" at "https://repo1.maven.org/maven2/"
)
assemblyMergeStrategy in assembly := {
  case m if m.toLowerCase.endsWith("manifest.mf") => MergeStrategy.discard
  case m if m.toLowerCase.matches("meta-inf.*\\.sf$") => MergeStrategy.discard
  case "log4j.properties" => MergeStrategy.discard
  case m if m.toLowerCase.startsWith("meta-inf/services/") => MergeStrategy.filterDistinctLines
  case "reference.conf" => MergeStrategy.concat
  case PathList(ps @ _*) if ps.last.endsWith("pom.properties") => MergeStrategy.discard
  case _ => MergeStrategy.first
}
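For completeness, the MergeStrategy settings above come from the sbt-assembly plugin, which I enabled in project/plugins.sbt roughly like this (the plugin version is from my setup and may differ, so treat it as an assumption):

```scala
// project/plugins.sbt -- provides the "assembly" task and MergeStrategy
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
```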
"sbt package" is successful, but when I submit from the target/scala-2.11/classes directory with:

spark-submit --class org.apache.spark.examples.streaming.DirectKafkaWordCount --master local[2] /home/hadoop/DirectKafkaProject/target/scala-2.11/kafka-direct_2.11-1.0.jar localhost:9092 xyz123

it gives me this error:
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtils$ at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
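For reference, the part of the program that needs the missing class is the KafkaUtils.createDirectStream call. My code is abridged from the DirectKafkaWordCount example in the Spark distribution, roughly like this (a sketch, not my exact file):

```scala
import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object DirectKafkaWordCount {
  def main(args: Array[String]): Unit = {
    // args: <broker list> <comma-separated topics>, e.g. localhost:9092 xyz123
    val Array(brokers, topics) = args
    val conf = new SparkConf().setAppName("DirectKafkaWordCount")
    val ssc = new StreamingContext(conf, Seconds(2))

    val kafkaParams = Map[String, String]("metadata.broker.list" -> brokers)
    val topicsSet = topics.split(",").toSet

    // this is the call that fails with ClassNotFoundException at runtime
    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topicsSet)

    messages.map(_._2)
      .flatMap(_.split(" "))
      .map((_, 1L))
      .reduceByKey(_ + _)
      .print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```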
I have already set SPARK_CLASSPATH and SPARK_LOCAL_IP. I also tried the --jars option, but it asks for another .jar file and then keeps asking for more .jar files. I have tried everything this site suggested, but I am not able to solve the problem. Scala version: 2.11.6, Spark version: 2.1.0, Kafka version: 2.11-0.10.2.0.
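Here is roughly the --jars attempt I described (the jar path is a placeholder for wherever the spark-streaming-kafka jar lives on my machine):

```shell
spark-submit \
  --class org.apache.spark.examples.streaming.DirectKafkaWordCount \
  --master local[2] \
  --jars /path/to/spark-streaming-kafka-0-8_2.11-2.1.0.jar \
  /home/hadoop/DirectKafkaProject/target/scala-2.11/kafka-direct_2.11-1.0.jar \
  localhost:9092 xyz123
```

This is where it starts asking for the next missing jar, and so on.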
Please help me. Thanks.