We are using a custom spark receiver that reads streamed data from a provided http link. If the provided http link is incorrect, the receiver fails. The problem is that spark will continuously restart the receiver, and the application will never terminate. The question is how to tell Spark to terminate the application if the receiver fails.
This is an extract of our custom receiver:
def onStart() {
// Start the thread that receives data over a connection
new Thread("Receiver") {
override def run() { receive() }
}.start()
}
private def receive(): Unit = {
....
val response: CloseableHttpResponse = httpclient.execute(req)
try {
val sl = response.getStatusLine()
if (sl.getStatusCode != 200){
val errorMsg = "Error: " + sl.getStatusCode
val thrw = new RuntimeException(errorMsg)
stop(errorMsg, thrw)
} else {
...
store(doc)
}
We have a spark streaming application that uses this receiver:
val ssc = new StreamingContext(sparkConf, duration)
val changes = ssc.receiverStream(new CustomReceiver(...
...
ssc.start()
ssc.awaitTermination()
Everything works as expected if the receiver doesn't have errors. If the receiver fails (e.g. with a wrong http link), spark will continuously restart it and the application will never terminate.
16/05/31 17:03:38 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
16/05/31 17:03:38 ERROR ReceiverTracker: Receiver has been stopped. Try to restart it.
We just want to terminate the whole application if a receiver fails.