3
votes

I am getting the error below. The same code works in Databricks but not in HDInsight. I have also added the Delta and hadoop-azure libraries to the classpath:

io.delta:delta-core_2.11:0.5.0,org.apache.hadoop:hadoop-azure:3.1.3

ERROR ApplicationMaster [Driver]: User class threw exception: com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/experimental/ScalaObjectMapper$class
com.google.common.util.concurrent.ExecutionError: java.lang.NoClassDefFoundError: com/fasterxml/jackson/module/scala/experimental/ScalaObjectMapper$class
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2049)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4873)
    at org.apache.spark.sql.delta.DeltaLog$.apply(DeltaLog.scala:740)
    at org.apache.spark.sql.delta.DeltaLog$.forTable(DeltaLog.scala:712)
    at org.apache.spark.sql.delta.sources.DeltaDataSource.createRelation(DeltaDataSource.scala:169)
    at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
    at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
    at io.delta.tables.DeltaTable$.forPath(DeltaTable.scala:635)
    
2
Could you check what's the version of jackson-module-scala in your classpath? Looks like you are using an incompatible version. - zsxwing
I am using 2.11.1 - NITIN GUPTA
<dependency> <groupId>com.fasterxml.jackson.module</groupId> <artifactId>jackson-module-scala_2.11</artifactId> <version>2.11.1</version> <scope>test</scope> </dependency> - NITIN GUPTA
Spark 2.4.6 uses 2.6.7.1 ( github.com/apache/spark/blob/v2.4.6/pom.xml#L162 ). It's better to use the same version. com/fasterxml/jackson/module/scala/experimental/ScalaObjectMapper is no longer in jackson-module-scala 2.11.1. - zsxwing
Thanks! I tried that but still hit the same issue. I get the same error in spark-shell as well. - NITIN GUPTA

2 Answers

1
votes

There is a conflict between the version of the Jackson JSON libraries packaged with HDInsight and the version used by Spark and Delta Lake.

There are 2 options to get around this

  1. Package the Jackson 2.6.7 dependencies into your application (Maven Shade plugin or sbt-assembly)

Or

  2. If you are using a Jupyter notebook, set the Spark configuration below. ${PATH} stands for the directory that holds the jars; HDInsight nodes run Linux, so classpath entries are separated by ":" and each value must stay on a single line to be valid JSON.
{"conf":
 {"spark.jars.packages": "io.delta:delta-core_2.11:0.5.0",
  "spark.driver.extraClassPath": "${PATH}/jackson-module-scala_2.11-2.6.7.1.jar:${PATH}/jackson-annotations-2.6.7.jar:${PATH}/jackson-core-2.6.7.jar:${PATH}/jackson-databind-2.6.7.1.jar:${PATH}/jackson-module-paranamer-2.6.7.jar",
  "spark.executor.extraClassPath": "${PATH}/jackson-module-scala_2.11-2.6.7.1.jar:${PATH}/jackson-annotations-2.6.7.jar:${PATH}/jackson-core-2.6.7.jar:${PATH}/jackson-databind-2.6.7.1.jar:${PATH}/jackson-module-paranamer-2.6.7.jar",
  "spark.driver.userClassPathFirst": true}}
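
To see which Jackson jars actually win on the classpath, a small probe can help. The following is only a sketch: the class name `ClasspathProbe` and the method `locate` are made up for illustration, and it has to be run with the same classpath as the Spark driver to be meaningful. The first class name probed is the one from the stack trace in the question.

```java
import java.security.CodeSource;

public class ClasspathProbe {

    /** Returns the location a class was loaded from, or a marker string if it cannot be resolved. */
    public static String locate(String className) {
        try {
            // initialize=false: resolve the class without running its static initializers
            Class<?> cls = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            // Classes from the JDK itself usually have no code source
            return src == null ? "bootstrap or unknown source" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "not found";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            // The class reported missing in the NoClassDefFoundError
            "com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper$class",
            "com.fasterxml.jackson.databind.ObjectMapper",
            "com.fasterxml.jackson.core.JsonFactory"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the printed jar paths show a Jackson version other than 2.6.7, or the first class comes back as "not found", the cluster's own jars are probably still taking precedence over the ones listed in the configuration.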
0
votes

As mentioned by @blob, the error is the result of a version conflict.

If you're using a Maven-based project, you can configure the Maven Shade plugin to relocate (rename) the Jackson packages that Delta depends on, so that they no longer clash with the jars provided by the cluster.

<plugins>
... 
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-shade-plugin</artifactId>
    <version>3.2.1</version>
    <executions>
      <execution>
        <phase>package</phase>
        <goals>
          <goal>shade</goal>
        </goals>
        <configuration>
          <finalName>NAME-OF-YOUR-SHADED-JAR-FILE</finalName>
          <filters> <!-- exclude these files from artifacts to avoid SecurityException on signed jars -->
            <filter>
              <artifact>*:*</artifact>
              <excludes>
                <exclude>META-INF/LICENSE</exclude>
                <exclude>META-INF/*.SF</exclude>
                <exclude>META-INF/*.DSA</exclude>
                <exclude>META-INF/*.RSA</exclude>
              </excludes>
            </filter>
          </filters>
          <relocations> <!-- renames the packages so that delta uses these instead of provided jars -->
            <relocation>
              <pattern>com.fasterxml.jackson</pattern>
              <shadedPattern>noc.com.fasterxml.jackson</shadedPattern>
            </relocation>
            <relocation><!-- optional; Guava classes live under com.google.common -->
              <pattern>com.google.common</pattern>
              <shadedPattern>noc.com.google.common</shadedPattern>
            </relocation>
          </relocations>
        </configuration>
      </execution>
    </executions>
  </plugin>
...
</plugins>

Also make sure that your pom.xml lists these dependencies in the given order:

 <!-- jackson related dependencies of delta -->
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-core</artifactId>
  <version>2.6.7</version>
</dependency>
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-databind</artifactId>
  <version>2.6.7.1</version>
</dependency>
<dependency>
  <groupId>com.fasterxml.jackson.core</groupId>
  <artifactId>jackson-annotations</artifactId>
  <version>2.6.7</version>
</dependency>
<dependency>
  <groupId>com.fasterxml.jackson.module</groupId>
  <artifactId>jackson-module-scala_2.11</artifactId>
  <version>2.6.7.1</version>
</dependency>
<!-- /jackson related dependency of delta -->

<!-- delta -->
<!-- https://mvnrepository.com/artifact/io.delta/delta-core -->
<dependency>
  <groupId>io.delta</groupId>
  <artifactId>delta-core_2.11</artifactId>
  <version>0.6.1</version>
</dependency>
<!-- /delta -->
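
Once the shaded jar is built, one way to confirm that the relocation took effect is to list its entries and count how many sit under the relocated prefix. This is a rough sketch, not part of the build: `ShadedJarCheck` and its methods are hypothetical names, and the jar path is passed as the first argument rather than assumed.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class ShadedJarCheck {

    /** Counts entry names that start with the given path prefix. */
    public static long countWithPrefix(Iterable<String> entryNames, String prefix) {
        long count = 0;
        for (String name : entryNames) {
            if (name.startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    /** Reads all entry names from a jar file. */
    public static List<String> entryNames(String jarPath) throws IOException {
        List<String> names = new ArrayList<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                names.add(entries.nextElement().getName());
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the shaded jar produced by the build
        List<String> names = entryNames(args[0]);
        // Prefixes follow the shadedPattern used in the plugin configuration
        System.out.println("relocated Jackson entries:   "
                + countWithPrefix(names, "noc/com/fasterxml/jackson/"));
        System.out.println("unrelocated Jackson entries: "
                + countWithPrefix(names, "com/fasterxml/jackson/"));
    }
}
```

With the relocation applied, the first count would be expected to be greater than zero and the second to be zero; anything else suggests the Jackson dependencies were not bundled or not renamed.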