
I am trying to expose Spark ML code written in Java as a REST service on Weblogic. But while trying to initialize the SparkContext, the code throws an InvocationTargetException from within the org.apache.hadoop.security package.

Following are the steps I have used:

  1. Trained a Spark CrossValidator Naive Bayes model and persisted it in parquet file format. (one-time activity)
  2. In a Java web project, my Java source file has a method that loads the model and uses it to predict the class label for a raw text input. (This works standalone when called from the main() method.)
  3. A second Java class is exposed as a REST service and deployed to Weblogic along with the JAX-RS Jersey 2.x library (that ships with JDeveloper 12.2.1.1) plus all the jars from the Spark 2.0 distribution.
  4. Invoking the REST service fails with the error below.

Upon analyzing, I found that the code is failing while trying to initialize the SparkContext, which is done in the following way in my Java code:

SparkConf conf = new SparkConf().setAppName("IssuePredictor").setMaster("local").set("spark.sql.warehouse.dir", "spark-warehouse");
SparkContext sc = new SparkContext(conf);

I have already tried various ways of referring to the spark-warehouse folder, including its absolute path and a relative path like the one suggested here. But nothing worked.
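(For reference, the absolute-path attempts were along these lines: spark.sql.warehouse.dir accepts a file: URI, which can be built from the relative folder with plain JDK code — a sketch only, and it did not change the error. The Weblogic working directory in the comment is hypothetical.)

```java
import java.nio.file.Paths;

public class WarehouseDir {
    // Resolve a relative folder against the server's working directory
    // (under Weblogic this is usually the domain directory, not the
    // deployment folder) and turn it into an explicit file: URI.
    static String warehouseUri(String relative) {
        return Paths.get(relative).toAbsolutePath().toUri().toString();
    }

    public static void main(String[] args) {
        // e.g. file:/u01/domains/mydomain/spark-warehouse (path varies)
        System.out.println(warehouseUri("spark-warehouse"));
    }
}
```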

Error Trace:

Root cause of ServletException.
java.lang.RuntimeException: java.lang.reflect.InvocationTargetException
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:134)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:79)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:74)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:303)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:790)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:760)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:633)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2245)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2245)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2245)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:297)
    at myapps.ml.spark.IssuePredictor.predict(IssuePredictor.java:78)

...

Caused By: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:132)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:79)
    at org.apache.hadoop.security.Groups.<init>(Groups.java:74)
    at org.apache.hadoop.security.Groups.getUserToGroupsMappingService(Groups.java:303)
    at org.apache.hadoop.security.UserGroupInformation.initialize(UserGroupInformation.java:283)
    at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:260)
    at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:790)
    at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:760)
    at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:633)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2245)
    at org.apache.spark.util.Utils$$anonfun$getCurrentUserName$1.apply(Utils.scala:2245)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.util.Utils$.getCurrentUserName(Utils.scala:2245)
    at org.apache.spark.SparkContext.<init>(SparkContext.scala:297)
    at myapps.ml.spark.IssuePredictor.predict(IssuePredictor.java:78)

...

Caused By: java.lang.StackOverflowError
    at org.slf4j.impl.JDK14LoggerAdapter.log(JDK14LoggerAdapter.java:659)
    at org.slf4j.bridge.SLF4JBridgeHandler.callLocationAwareLogger(SLF4JBridgeHandler.java:221)
    at org.slf4j.bridge.SLF4JBridgeHandler.publish(SLF4JBridgeHandler.java:303)
    at java.util.logging.Logger.log(Logger.java:738)

...

Appreciate any help.

Thanks, Bhaskar


1 Answer


I think your problem is that you have both jul-to-slf4j.jar and slf4j-jdk14.jar present on your class path, a combination the SLF4J documentation warns against.
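One quick way to confirm the clash at runtime is to probe for both classes. The two class names below are the real ones from your stack trace; the checker class itself is just a hypothetical sketch:

```java
public class Slf4jCycleCheck {
    // Probe the class path without triggering static initializers
    // (initialize = false), so the check itself cannot start the loop.
    static boolean onClasspath(String className) {
        try {
            Class.forName(className, false, Slf4jCycleCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean bridge  = onClasspath("org.slf4j.bridge.SLF4JBridgeHandler"); // jul-to-slf4j.jar
        boolean backend = onClasspath("org.slf4j.impl.JDK14LoggerAdapter");   // slf4j-jdk14.jar
        if (bridge && backend) {
            System.out.println("WARNING: jul-to-slf4j and slf4j-jdk14 are both on the class path");
        }
    }
}
```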

Looking at your stack traces: Spark tries to determine the current user and its group mapping when creating your Spark context. For this it needs to instantiate (the ReflectionUtils.newInstance call in the trace) the ShellBasedUnixGroupsMapping class, which contains this static initializer:

private static final Log LOG =
    LogFactory.getLog(ShellBasedUnixGroupsMapping.class);

That call initializes logging. With both jars present, every record the JUL-to-SLF4J bridge forwards to SLF4J is routed by slf4j-jdk14 straight back to java.util.logging, which hands it to the bridge again — the infinite recursion that ends in the StackOverflowError in your last trace. Removing one of the two jars from the class path breaks the cycle.
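A minimal pure-JDK sketch of the cycle — the two methods below are hypothetical stand-ins for the real SLF4J classes, each blindly forwarding to the other the way the bridge and the JDK14 binding do:

```java
public class LoggingLoopDemo {
    // Stand-in for org.slf4j.bridge.SLF4JBridgeHandler: JUL -> SLF4J
    static void julHandler(String message) {
        slf4jJdk14Adapter(message);
    }

    // Stand-in for org.slf4j.impl.JDK14LoggerAdapter: SLF4J -> JUL
    static void slf4jJdk14Adapter(String message) {
        julHandler(message);
    }

    public static void main(String[] args) {
        try {
            julHandler("creating SparkContext");
        } catch (StackOverflowError e) {
            // Same failure mode as the last "Caused By" in the question
            System.out.println("StackOverflowError: the logging call cycled back on itself");
        }
    }
}
```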

Hope this helps.