
My requirement: from a web application, trigger a Spark job on YARN and display the result on the web page. The Spark job accepts a few arguments and computes a Dataset whose values need to be returned to the web application.

After some searching on the web, I figured Livy could be used for this.

Livy was already installed with HDP 2.5, so I created a new Livy session using POST /sessions, including my jar file:

{"kind":"spark","name":"livy","jars":["/xyz.jar"],"proxyUser":"livy"}

(I had to include the 'x-requested-by' header because csrfPrevention was enabled.) Note: the jar had to be placed in HDFS for this to work.
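For reference, here is a minimal sketch of how that session request could be built from Java with the JDK 11+ java.net.http client. The host livy-host, the port 8998, the class name LivySessionRequest, and the value sent in the X-Requested-By header are all assumptions for illustration; only the JSON body comes from the request above.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper: builds (but does not send) the POST /sessions request.
class LivySessionRequest {

    // Assumed Livy endpoint; replace with the real host and port.
    static final String LIVY_URL = "http://livy-host:8998";

    // JSON body copied from the session request shown above.
    static String sessionBody() {
        return "{\"kind\":\"spark\",\"name\":\"livy\","
             + "\"jars\":[\"/xyz.jar\"],\"proxyUser\":\"livy\"}";
    }

    static HttpRequest createSessionRequest() {
        return HttpRequest.newBuilder(URI.create(LIVY_URL + "/sessions"))
                .header("Content-Type", "application/json")
                // Sent because CSRF protection is enabled; the value is an assumption.
                .header("X-Requested-By", "livy")
                .POST(HttpRequest.BodyPublishers.ofString(sessionBody()))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = createSessionRequest();
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Building the request separately from sending it makes it easy to check the headers and body before pointing it at a real cluster.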

As per the Livy examples (https://livy.apache.org/examples/), I can pass code snippets as "data = {'code': '1 + 1'}", but I don't understand how to invoke a method in my own class. The Livy REST API documentation (https://livy.apache.org/docs/latest/rest-api.html) does not list a 'className' option for sessions.

If I use POST /batches instead, I can specify a jar and my main class, but doing it this way I will not get the result back in my web application.

Java code in my jar file:

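import java.util.Arrays;
import java.util.List;

import org.apache.spark.SparkContext;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class LivySample {

    // Returns the first name in a fixed list that contains the given input.
    public String executeSampleLivy(SparkContext sc, String input) {
        JavaSparkContext jsc = new JavaSparkContext(sc);
        List<String> listNames = Arrays.asList("abc", "def", "ghi");
        JavaRDD<String> rdd = jsc.parallelize(listNames);
        return rdd.filter(l -> l.contains(input)).collect().get(0);
    }

}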

I tried to POST the following to the Livy URL sessions/20/statements:

{
  "code": "import LivySample;LivySample lv = new LivySample();lv.executeSampleLivy(sc, \"abc\")"
}

The error I got when invoking GET sessions/21/statements/0:

{
  "id": 2,
  "state": "available",
  "output": {
    "status": "error",
    "execution_count": 2,
    "ename": "Error",
    "evalue": "<console>:1: error: '.' expected but ';' found. import LivySample;LivySample lv = new LivySample();lv.executeSampleLivy(sc, \"chris\"); ^",
    "traceback": []
  }
}

I am not able to debug this error. Can you please let me know what I am doing wrong here?

Can I use Java in the Livy REST API the way I have specified here?

Thanks!


1 Answer


I'm more familiar with the batches API, but I believe in the session API your application JAR should be supplied in the files field of the request, not jars (paradoxically).

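Anyway, a Livy session is basically an interactive spark-shell session, and a session of kind "spark" interprets Scala, so Java statements such as "LivySample lv = new LivySample();" will not parse. If you want to use sessions, you would step through your program statement by statement, submitting a request to the RunStatement endpoint (POST /sessions/{sessionId}/statements) for each one. At the end you would ask the GetSessionStatement(s) endpoint (GET /sessions/{sessionId}/statements/{statementId}) for the result.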
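To make the session route concrete, here is a rough sketch of how a web app might build the statement body and read the state back. Since a "spark" session runs a Scala interpreter, the code string is written in Scala rather than Java. The class LivyStatement and its methods are hypothetical helpers, not part of Livy, and whether a class in the default package (like LivySample) resolves from the interpreter is untested here; moving it into a named package may be needed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for the statements endpoints; not part of Livy itself.
class LivyStatement {

    // Escape backslashes, double quotes and newlines so Scala code fits in a JSON string.
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    // Body for POST /sessions/{sessionId}/statements.
    static String statementBody(String scalaCode) {
        return "{\"code\":\"" + jsonEscape(scalaCode) + "\"}";
    }

    // Pull the first "state" value out of a statement response.
    // A regex keeps the sketch dependency-free; a JSON library is the better choice.
    static String stateOf(String responseJson) {
        Matcher m = Pattern.compile("\"state\"\\s*:\\s*\"([a-z_]+)\"").matcher(responseJson);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Scala syntax: no "import LivySample;" and no Java-style declaration.
        String code = "val lv = new LivySample()\nlv.executeSampleLivy(sc, \"abc\")";
        System.out.println(statementBody(code));
    }
}
```

The web app would POST that body, then poll the statement with GET until stateOf(...) returns "available", and finally check "output.status" before reading the result out of "output".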

Alternatively (and perhaps more easily), you could use the batch API (POST /batches): have the job write its output to some persistent location, and have your web app read and expose it once the batch reaches the "success" state.
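A sketch of the batch route, under some assumptions: the main class name LivySampleMain is made up (the jar in the question has no main class yet), the helper class LivyBatch is hypothetical, and the state lookup is passed in as a Supplier so the HTTP call to GET /batches/{batchId}/state can be plugged in separately.

```java
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper for the batch route; names and paths are placeholders.
class LivyBatch {

    // Body for POST /batches: the jar (on HDFS), its main class, and job arguments.
    // Assumes the values contain nothing that needs JSON escaping.
    static String batchBody(String jarPath, String mainClass, String... args) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"file\":\"").append(jarPath).append("\",")
          .append("\"className\":\"").append(mainClass).append("\",")
          .append("\"args\":[");
        for (int i = 0; i < args.length; i++) {
            if (i > 0) sb.append(',');
            sb.append('"').append(args[i]).append('"');
        }
        return sb.append("]}").toString();
    }

    // States after which polling can stop.
    static final Set<String> TERMINAL = Set.of("success", "dead", "killed", "error");

    static boolean isTerminal(String state) {
        return state != null && TERMINAL.contains(state);
    }

    // Poll the supplied state lookup until the batch reaches a terminal state.
    static String waitForEnd(Supplier<String> fetchState, long sleepMillis) {
        String state = fetchState.get();
        while (!isTerminal(state)) {
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling", e);
            }
            state = fetchState.get();
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(batchBody("/xyz.jar", "LivySampleMain", "abc"));
    }
}
```

Once waitForEnd returns "success", the web app can read whatever the job wrote to the persistent location; any other terminal state means there is no result to show.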