
I'm running a Spark Streaming application on YARN in cluster mode, and I'm trying to implement a graceful shutdown so that when the application is killed it finishes executing the current micro-batch before stopping.

Following some tutorials, I have set spark.streaming.stopGracefullyOnShutdown to true and added the following code to my application:

sys.ShutdownHookThread {
  log.info("Gracefully stopping Spark Streaming Application")
  // Stop the underlying SparkContext as well (first argument) and wait for
  // the current micro-batch to finish (second argument, stopGracefully)
  ssc.stop(true, true)
  log.info("Application stopped")
}
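
For reference, the flag is set when building the streaming context, roughly like this (the app name and batch interval below are placeholders, not my real values):

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Equivalent to passing --conf spark.streaming.stopGracefullyOnShutdown=true
// to spark-submit; app name and batch interval are illustrative only
val conf = new SparkConf()
  .setAppName("MySparkJob")
  .set("spark.streaming.stopGracefullyOnShutdown", "true")
val ssc = new StreamingContext(conf, Seconds(10))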

However, when I kill the application with

yarn application -kill application_1454432703118_3558

the micro-batch being executed at that moment is not completed.

In the driver log I see the first message printed ("Gracefully stopping Spark Streaming Application") but not the last one ("Application stopped"):

ERROR yarn.ApplicationMaster: RECEIVED SIGNAL 15: SIGTERM
INFO streaming.MySparkJob: Gracefully stopping Spark Streaming Application
INFO scheduler.JobGenerator: Stopping JobGenerator gracefully
INFO scheduler.JobGenerator: Waiting for all received blocks to be consumed for job generation
INFO scheduler.JobGenerator: Waited for all received blocks to be consumed for job generation
INFO streaming.StreamingContext: Invoking stop(stopGracefully=true) from shutdown hook

In the executor logs I see the following error:

ERROR executor.CoarseGrainedExecutorBackend: Driver 192.168.6.21:49767 disassociated! Shutting down.
INFO storage.DiskBlockManager: Shutdown hook called
WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://[email protected]:49767] has failed, address is now gated for [5000] ms. Reason: [Disassociated]
INFO util.ShutdownHookManager: Shutdown hook called

I think the problem is related to how YARN sends the kill signal to the application. Any idea how I can make the application stop gracefully?

Comments:
Were you able to solve it? – Gaurav Shah
Nope, unfortunately not. – nicola

2 Answers

Answer 1 (2 votes)

You should go to the Executors page in the Spark UI to see which node your driver is running on. SSH to that node and do the following:

ps -ef | grep 'app_name'

(replace app_name with your class name/app name). It will list a couple of processes. Look at the processes; some will be children of others. Pick the ID of the parent-most process and send it a SIGTERM:

kill pid

After some time you'll see that your application has terminated gracefully.

Also, with this approach you don't need to add those shutdown hooks in your code; just use the spark.streaming.stopGracefullyOnShutdown configuration to get a graceful shutdown.
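
A minimal sketch of the driver under that configuration (assuming ssc is an already-built StreamingContext):

ssc.start()
// No custom hook is needed: with spark.streaming.stopGracefullyOnShutdown=true,
// Spark's own shutdown hook stops the context gracefully on SIGTERM
ssc.awaitTermination()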

Answer 2 (1 vote)

You can stop the Spark Streaming application by invoking ssc.stop when a custom condition is triggered, instead of using awaitTermination. The following sketch polls for a marker file (the path is just an example):

import os
import time

ssc.start()
# Poll for an external stop condition instead of calling ssc.awaitTermination()
while True:
    time.sleep(10)  # check every 10 seconds
    if os.path.exists("/tmp/stop_spark"):  # example marker file
        ssc.stop(True, True)  # stop the SparkContext too, finish the current batch
        break
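
Since the question's code is in Scala, here is the same idea as a Scala sketch (the marker-file path is again just an assumption):

import java.nio.file.{Files, Paths}

ssc.start()
// Poll for an external stop condition instead of blocking on awaitTermination()
while (!Files.exists(Paths.get("/tmp/stop_spark"))) { // example marker file
  Thread.sleep(10000) // check every 10 seconds
}
// Stop the SparkContext too and let the current micro-batch finish
ssc.stop(stopSparkContext = true, stopGracefully = true)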