Let's assume I have a website with a form where a user can paste some values. I want to take these values, process them with Spark Streaming, and return the result to the user. Roughly: form -> website backend -> Spark Streaming -> result back to the user.
The detailed setup does not really matter: Spark Streaming could be doing some recommendation or prediction and could run on Databricks, and the backend could be a Flask application.
My questions are:
- How do I tell the website backend that Spark Streaming has processed the input data and written the results somewhere?
- Which pieces is this pipeline missing? An intermediate database such as Redis/Mongo/SQL? A message broker such as Kafka?
I can't get my head around the part where Spark Streaming sends information back to the website backend. If I write the result of the Spark Streaming processing to a database (Mongo, Redis, MySQL), a filesystem (S3, Blob, HDFS), or a message broker (Kafka, Kinesis), how does the website backend find out about it?
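
To make the question concrete, here is a rough sketch of the only hand-off I can think of: the backend tags each submission with a request id, publishes it, and then polls a result store for that id. Everything here is a stand-in I made up for illustration: `input_topic` is a plain `queue.Queue` playing the role of Kafka, `result_store` is a dict playing the role of Redis, and `fake_streaming_job` is a thread pretending to be the Spark Streaming job. None of it is real Spark, Kafka, or Redis code.

```python
import queue
import threading
import time
import uuid

# Hypothetical stand-ins: a queue instead of Kafka, a dict instead of Redis.
input_topic = queue.Queue()
result_store = {}
store_lock = threading.Lock()


def fake_streaming_job():
    """Stand-in for the Spark Streaming job: read a message, write a result."""
    while True:
        message = input_topic.get()
        if message is None:  # sentinel used to stop the worker
            break
        request_id, values = message
        result = sum(values)  # placeholder for the real recommendation/prediction
        with store_lock:
            result_store[request_id] = result


def submit(values):
    """Backend side: tag the form values with a request id and publish them."""
    request_id = str(uuid.uuid4())
    input_topic.put((request_id, values))
    return request_id


def wait_for_result(request_id, timeout=5.0, interval=0.05):
    """Backend side: poll the result store until the id appears or we time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with store_lock:
            if request_id in result_store:
                return result_store.pop(request_id)
        time.sleep(interval)
    return None  # no result arrived in time


def demo():
    """Run one submission through the fake pipeline and return its result."""
    worker = threading.Thread(target=fake_streaming_job, daemon=True)
    worker.start()
    request_id = submit([1, 2, 3])
    result = wait_for_result(request_id)
    input_topic.put(None)  # stop the worker
    worker.join(timeout=1.0)
    return result
```

In a Flask app I imagine `submit` would be called from the form handler and `wait_for_result` either from the same request or from a second endpoint the page polls. Is polling by request id like this the usual way to do it, or is there a mechanism for the streaming side to notify the backend directly?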
