4 votes

I am using Maven.

I have added the following dependencies:

    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>

I have also added the jar in the code:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.streaming.Duration;
import org.apache.spark.streaming.api.java.JavaStreamingContext;

SparkConf sparkConf = new SparkConf().setAppName("KafkaSparkTest");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
sc.addJar("/home/test/.m2/repository/org/apache/spark/spark-streaming-kafka_2.10/1.0.2/spark-streaming-kafka_2.10-1.0.2.jar");
JavaStreamingContext jssc = new JavaStreamingContext(sc, new Duration(5000));

It compiles fine without any error, but I get the following error when I run it through spark-submit. Any help is much appreciated. Thanks for your time.

bin/spark-submit --class "KafkaSparkStreaming" --master local[4] try/simple-project/target/simple-project-1.0.jar

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/streaming/kafka/KafkaUtils
    at KafkaSparkStreaming.sparkStreamingTest(KafkaSparkStreaming.java:40)
    at KafkaSparkStreaming.main(KafkaSparkStreaming.java:23)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:303)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtils
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)

gasparms (2 votes): Your addJar call is not necessary. However, you should add the maven-assembly-plugin and package a .jar with the dependencies included (spark-submit does not find the spark-kafka dependency).

mithra: I have added spark-streaming-kafka_2.10 to the dependency list in the pom.

mithra: Thanks, adding the maven-assembly-plugin helped.

2 Answers

9 votes

I met the same problem and solved it by building the jar with its dependencies included.

  1. Remove the sc.addJar() call from your code.

  2. Add the code below to pom.xml:

    <build>
        <sourceDirectory>src/main/java</sourceDirectory>
        <testSourceDirectory>src/test/java</testSourceDirectory>
        <plugins>
          <!--
            Bind the maven-assembly-plugin to the package phase.
            This creates a jar file with the dependencies included,
            suitable for deployment to a cluster.
          -->
          <plugin>
            <artifactId>maven-assembly-plugin</artifactId>
            <configuration>
              <descriptorRefs>
                <descriptorRef>jar-with-dependencies</descriptorRef>
              </descriptorRefs>
              <archive>
                <manifest>
                  <mainClass></mainClass>
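                  <!-- fill in your main class here, e.g. KafkaSparkStreaming from the question -->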
                </manifest>
              </archive>
            </configuration>
            <executions>
              <execution>
                <id>make-assembly</id>
                <phase>package</phase>
                <goals>
                  <goal>single</goal>
                </goals>
              </execution>
            </executions>
          </plugin>
        </plugins>
    </build>    
    
  3. Run mvn package.

  4. Submit the "<project>-jar-with-dependencies.jar" produced in the target directory, as shown below.
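
For instance, a minimal sketch of the submit command for the question's project, assuming the coordinates visible in its jar path (artifactId simple-project, version 1.0); the assembly plugin appends the jar-with-dependencies classifier to the artifact name:

    bin/spark-submit --class "KafkaSparkStreaming" --master local[4] try/simple-project/target/simple-project-1.0-jar-with-dependencies.jar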

0 votes

For future reference: if you get a ClassNotFoundException, search for the missing class name (here it starts with "org.apache.spark") and you will be taken to the Maven page for the artifact. That page tells you which dependency is missing from your pom file and gives you the snippet to paste into it.
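
For example, for the KafkaUtils class in the question, the search leads to the spark-streaming-kafka artifact. A sketch of the snippet you would paste, assuming the Scala 2.10 / Spark 1.1.0 versions already used in the question's pom:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka_2.10</artifactId>
      <version>1.1.0</version>
    </dependency>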