4 votes

We are trying to find a way to load a trained Spark (2.x) ML model so that, on request (through a REST interface), we can query it and get predictions, e.g. http://predictor.com:8080/give/me/predictions?a=1,b=2,c=3

There are out-of-the-box libraries for loading a model into Spark (given it was stored somewhere after training using MLWritable) and then using it for predictions, but wrapping this in a job and running it per request/call seems like overkill because of the cost of initializing a SparkContext.

However, using Spark has the advantage that we can save our Pipeline model and perform the same feature transformations without having to reimplement them outside of the SparkContext.
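For illustration, here is a minimal sketch of what loading a saved PipelineModel and scoring a single request looks like in Spark 2.x; the model path and the column names a, b, c are placeholders taken from the example URL above:

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.SparkSession

// Placeholder: assumes the fitted pipeline was saved with model.save(...)
val spark = SparkSession.builder()
  .appName("prediction-service")
  .master("local[*]") // placeholder; a real service would keep a long-lived session
  .getOrCreate()
import spark.implicits._

val model = PipelineModel.load("/models/my-pipeline")

// Turn the request parameters (a=1, b=2, c=3) into a single-row DataFrame;
// model.transform applies the same feature transformations used in training.
val request = Seq((1.0, 2.0, 3.0)).toDF("a", "b", "c")
val prediction = model.transform(request)
  .select("prediction")
  .first()
  .getDouble(0)
```

Keeping the SparkSession and the loaded model alive between requests is exactly what a plain per-request spark-submit job cannot give us, which is what leads to the job-server idea below.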

After some digging, we found that spark-job-server can potentially help with this by keeping a "hot" SparkContext initialized for the job server, so we can serve requests by calling the prediction job (and getting the results back) within the existing context through spark-job-server's REST API.
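As a rough, hedged sketch (against spark-jobserver's classic SparkJob API; the model path and the parameter names a, b, c are placeholders), the per-request prediction job could look like this:

```scala
import com.typesafe.config.Config
import org.apache.spark.SparkContext
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.SparkSession
import spark.jobserver.{SparkJob, SparkJobValid, SparkJobValidation}

object PredictionJob extends SparkJob {

  // Loaded once per JVM/context and reused across requests,
  // so only the first call pays the model-loading cost.
  private lazy val model = PipelineModel.load("/models/my-pipeline")

  override def validate(sc: SparkContext, config: Config): SparkJobValidation =
    SparkJobValid

  override def runJob(sc: SparkContext, config: Config): Any = {
    val spark = SparkSession.builder().config(sc.getConf).getOrCreate()
    import spark.implicits._
    val df = Seq((config.getDouble("a"), config.getDouble("b"), config.getDouble("c")))
      .toDF("a", "b", "c")
    model.transform(df).select("prediction").first().getDouble(0)
  }
}
```

Each request would then become a synchronous POST to the job server's /jobs endpoint against the pre-created ("hot") context, with a, b and c passed in the job config.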

Is this the best approach to API-ify predictions? Because of the size of the feature space, we cannot pre-compute predictions for all combinations.

Alternatively, we were thinking about using Spark Streaming and persisting the predictions to a message queue. This would let us avoid spark-job-server, but it doesn't simplify the overall flow. Has anyone tried a similar approach?
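For reference, a minimal sketch of that streaming alternative using Structured Streaming with Kafka as the message queue (the broker address, topic names, model path and CSV event format are all assumptions, and the pipeline's stages must support streaming DataFrames):

```scala
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.SparkSession

// Requires the spark-sql-kafka-0-10 package on the classpath.
val spark = SparkSession.builder().appName("streaming-predictions").getOrCreate()
val model = PipelineModel.load("/models/my-pipeline")

// Assume each Kafka message is a CSV line "a,b,c".
val events = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("subscribe", "events")
  .load()
  .selectExpr("CAST(value AS STRING) AS csv")
  .selectExpr(
    "CAST(split(csv, ',')[0] AS DOUBLE) AS a",
    "CAST(split(csv, ',')[1] AS DOUBLE) AS b",
    "CAST(split(csv, ',')[2] AS DOUBLE) AS c")

// Score every event and persist the predictions to another topic.
model.transform(events)
  .selectExpr("CAST(prediction AS STRING) AS value")
  .writeStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker:9092")
  .option("topic", "predictions")
  .option("checkpointLocation", "/tmp/checkpoints/predictions")
  .start()
  .awaitTermination()
```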

2
We've recently tried to use jobserver to solve a similar problem of executing Spark jobs on demand. Although it is nice, it is far from being a production-grade, ready-to-ship product. You have to do a lot of tweaking manually, support for Spark 2.x is in preview, and deploying it takes work. If you're ready to put in a substantial amount of work, go ahead. We ended up going with a solution based on Spark's undocumented REST API. – Yuval Itzchakov
Would it even respond in decent time (sub-0.1 sec)? In my experience, ML pipelines are really slow due to the various steps in their computation, like converting schemas, type checks, and most importantly some kind of model/matrix broadcast, at least for NaiveBayes, W2V and some others I have used. (The cost is amortized when you have tons of predictions to make, but the setup is prohibitive in the single-prediction case.) Either way, I don't see Spark ML pipelines performing anywhere near sub-second. Have you achieved otherwise? – GPI

2 Answers

1 vote

Another option could be Cloudera's Livy (http://livy.io/ | https://github.com/cloudera/livy#rest-api), which allows for session caching, interactive queries, batch jobs and more. I've used it and found it very promising.
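For the record, a rough sketch of how a service could talk to Livy's session API (the host, session id and statement code are placeholders, and `model` is assumed to have been loaded by an earlier statement in the same session):

```scala
import java.net.{HttpURLConnection, URL}
import java.nio.charset.StandardCharsets

// Minimal JSON POST helper using only the JDK.
def post(url: String, json: String): String = {
  val conn = new URL(url).openConnection().asInstanceOf[HttpURLConnection]
  conn.setRequestMethod("POST")
  conn.setRequestProperty("Content-Type", "application/json")
  conn.setDoOutput(true)
  conn.getOutputStream.write(json.getBytes(StandardCharsets.UTF_8))
  scala.io.Source.fromInputStream(conn.getInputStream).mkString
}

// 1. Create a cached Spark session once, at service startup.
post("http://livy-host:8998/sessions", """{"kind": "spark"}""")

// 2. Per request: run a prediction statement inside the existing session.
//    The session id (0) would come from the create-session response.
post("http://livy-host:8998/sessions/0/statements",
  """{"code": "model.transform(Seq((1.0, 2.0, 3.0)).toDF(\"a\", \"b\", \"c\")).select(\"prediction\").collect()"}""")
```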

1 vote

Scenarios

  • Prediction using Spark Streaming - supports real-time scoring/prediction, but requires a stream-based flow (push model).

    • Pipeline: score/predict on the stream -> store results in a real-time store -> serve the data via REST, or plug an analytical toolkit on top for dashboards
    • Pros: real-time scoring; real-time dashboarding capability when used with stores that support stream writes (e.g. Druid)
    • Cons: scores all events; storage bloats, so a data-archival strategy is needed to keep dashboarding lightweight
  • REST-based predictions (PredictionIO, Spark JobServer) - supports interaction-level scoring (request-response model).

    • Pipeline: deploy the trained model in a SparkContext -> predict and return scores in response to REST requests
    • Pros: supports interactive scoring and selective event scoring; makes a web app's interactions intelligent
    • Cons: less performant than stream-based scoring; higher memory footprint; an additional framework is required alongside Apache Spark

The Answer

It depends on the use case. If you have a stream-based dataflow and need to score all events, use a Spark Streaming based pipeline; a real-life example would be scoring incoming financial transactions for fraud detection. If your requirement is interaction-based, go with REST-based scoring; for example, recommending similar items/products to a user based on that user's interactions on the website/app.