
I have an application: a REST server built on Netty that embeds Spark SQL and a HiveContext to run analytical queries. Everything works fine in IntelliJ when running the service. But when I build an uber jar containing the whole thing, I can't get it to run, because Hive can't instantiate its MetaStoreClient. After some digging it seems that Hive can't resolve the DataNucleus dependencies. I run my application as

java -jar app.jar

I have tried adding the DataNucleus jars with java -cp ..., with no luck. The Spark docs recommend passing them with the --jars flag, but that doesn't help either, I guess because I'm not using spark-submit here.

Any help is very much appreciated. Thanks.

Edit: To answer the question below, yes, I am initiating Spark in local mode for now, with master = local[*]. There is a hive-site.xml in $SPARK_HOME/conf/. When run from IntelliJ it works fine: Hive creates a local metastore in the project directory and writes its log to derby.log. The issue seems to happen when starting the web server from the shaded jar, where the SparkContext and HiveContext are instantiated.
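For context, this is roughly how the contexts are set up (a sketch only; the object name, app name, and comments are illustrative, not the actual code):

    // Sketch of the setup described above, using Spark 1.x style APIs.
    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.hive.HiveContext

    object SparkServer {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("spark-rest-server")
          .setMaster("local[*]") // local mode for now

        val sc = new SparkContext(conf)

        // Building the HiveContext is where the metastore client
        // (and therefore the DataNucleus jars) come into play.
        val hiveContext = new HiveContext(sc)

        // ... start the Netty-based REST server and serve analytical queries
      }
    }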

Are you initiating the Spark context as a standalone app (no cluster)? Is there a hive-site.xml file in your app? – ra2085

1 Answer


So I managed to solve the issue. Since I was using the maven-shade-plugin, I needed to add the DataNucleus jars to the classpath through the manifest:

  <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
    <manifestEntries>
      <Main-Class>com.sparkserver.SparkServer</Main-Class>
      <Class-Path>../libs/mysql-connector-java-5.1.36.jar ../libs/datanucleus-core-3.2.10.jar ../libs/datanucleus-api-jdo-3.2.6.jar ../libs/datanucleus-rdbms-3.2.9.jar ../libs/bonecp-0.8.0.RELEASE.jar</Class-Path>
    </manifestEntries>
  </transformer>

Since running with -jar ignores the usual classpath, I added these entries with the versions matching the jars in $SPARK_HOME/libs, and it worked fine.
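In case it helps, this is roughly where that transformer sits in the shade plugin configuration of the pom.xml (a sketch only; the plugin version and the rest of the configuration depend on your build):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers>
              <!-- the ManifestResourceTransformer shown above goes here -->
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>

Note that Class-Path entries in the manifest are resolved relative to the location of app.jar itself, so the listed jars actually have to exist at ../libs/ next to the directory containing the shaded jar.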