
With Spark 1.6.1 everything works. After switching to Spark 2.1.0, I get the error below:

Task 33 in stage 3.0 failed 4 times; aborting job

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 33 in stage 3.0 failed 4 times, most recent failure: Lost task 33.3 in stage 3.0 (TID 310, 192.168.1.5, executor 3): java.io.InvalidClassException: scala.Tuple2; local class incompatible: stream classdesc serialVersionUID = -4864544146559264103, local class serialVersionUID = 3356420310891166197

I know that -4864544146559264103 corresponds to Scala 2.10, while 3356420310891166197 corresponds to Scala 2.11. I changed my configuration to use the Scala 2.11 artifacts:

EDIT: the entire pom file is shown below.

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>test.spark</groupId>
  <artifactId>spark</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>jar</packaging>

  <name>spark</name>
  <url>http://maven.apache.org</url>

  <properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  </properties>

  <build>
    <plugins>
      <plugin>
        <!-- solve the problem of : java.lang.ClassNotFoundException: kafka.producer.ProducerConfig -->
        <artifactId>maven-assembly-plugin</artifactId>
        <version>2.4</version>
        <configuration>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals>
              <goal>single</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.6.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.11</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-mllib_2.11</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-8_2.11</artifactId>
      <version>2.1.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-server</artifactId>
      <version>1.1.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
      <version>1.1.2</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-common</artifactId>
      <version>1.1.2</version>
    </dependency>
    <dependency>
      <groupId>io.fastjson</groupId>
      <artifactId>boon</artifactId>
      <version>0.33</version>
    </dependency>
    <dependency>
      <groupId>com.google.code.gson</groupId>
      <artifactId>gson</artifactId>
      <version>2.7</version>
    </dependency>
    <dependency>
      <groupId>com.googlecode.json-simple</groupId>
      <artifactId>json-simple</artifactId>
      <version>1.1.1</version>
    </dependency>
  </dependencies>
</project>

The problem still exists. How can I fix it? I can add any further details if needed. Thanks for any help!

Is this your entire pom? Do you have other dependencies on Scala libraries (directly or indirectly)? - Tzach Zohar
Why do you need 3 JSON libraries? - OneCricketeer
Thanks for your reply. I added the entire pom. All the dependencies refer to Scala 2.11, so why is the Scala version in the stream 2.10? - stupig
Hello cricket_007, the JSON libraries are used for other purposes; I was testing several of them. - stupig

1 Answer


I finally fixed this problem. It was my mistake: the pom file is fine, and the project works.

The cause was a detail I did not mention in my question (sorry about that): the code reads scala.Tuple2 objects from HDFS. Those objects were written by another project built with Scala 2.10, so deserializing them against the Scala 2.11 classes fails with the serialVersionUID mismatch shown above.
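For anyone hitting the same error, a quick way to see which side of the mismatch your own classpath is on is to print the serialVersionUID that the local JVM computes for the class. This is only a minimal sketch using the standard java.io.ObjectStreamClass API; the class name SerialUidCheck and the method localSerialVersionUid are names I made up for illustration, and you would need to run it with the same scala-library jar on the classpath that your executors use.

```java
import java.io.ObjectStreamClass;

public class SerialUidCheck {

    /**
     * Returns the serialVersionUID of the named class as it is seen on the
     * local classpath. Throws IllegalArgumentException if the class cannot
     * be found or is not Serializable.
     */
    public static long localSerialVersionUid(String className) {
        final Class<?> cls;
        try {
            cls = Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("Class not on classpath: " + className, e);
        }
        // lookup() returns null for classes that are not Serializable.
        ObjectStreamClass desc = ObjectStreamClass.lookup(cls);
        if (desc == null) {
            throw new IllegalArgumentException(className + " is not Serializable");
        }
        return desc.getSerialVersionUID();
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.err.println("usage: SerialUidCheck <fully.qualified.ClassName>");
            return;
        }
        // Example argument: scala.Tuple2 (requires scala-library on the classpath).
        System.out.println(args[0] + " -> " + localSerialVersionUid(args[0]));
    }
}
```

Run with scala.Tuple2 as the argument, the printed value should match the "local class serialVersionUID" in the exception. If it does and the pom is consistent, the "stream classdesc" value is coming from the serialized data itself, which was the situation here.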

Anyway, thanks for your help.