3
votes

I am trying to submit a Spark job via Livy using the REST API, but if I run the same script multiple times it starts multiple instances of the job, each with a different job ID. I am looking for a way to kill any Spark/YARN job running with the same name before starting a new one. The Livy documentation (https://github.com/cloudera/livy#batch) says to delete the batch job, but a Livy session doesn't return the application name; only the application ID is returned.

Is there another way to do this?


3 Answers

1
vote

You can use the LivyClient API to submit Spark jobs through the Livy server. LivyClient has a stop method that can be used to kill the job.

// stop(true) also shuts down the remote Spark context, killing the YARN application
client.stop(true);

1
vote

For Livy version 0.7.0, the following works, where the session ID you want to stop is 1:

  • python
import requests
headers = {'Content-Type': 'application/json'}
session_url = 'http://your-livy-server-ip:8998/sessions/1'
requests.delete(session_url, headers=headers)
  • shell
curl -X DELETE http://your-livy-server-ip:8998/sessions/1

See https://livy.incubator.apache.org/docs/latest/rest-api.html
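
For the batch jobs the question asks about, the same DELETE verb works against /batches/{id}. Below is a minimal Python sketch (the server address and YARN application ID are placeholders): GET /batches returns a JSON body whose sessions list carries an id and appId for each batch, so you can delete the batch matching a given application ID.

import requests

headers = {'Content-Type': 'application/json'}
livy_url = 'http://your-livy-server-ip:8998'        # placeholder host, as above
target_app_id = 'application_1234567890123_0001'    # placeholder YARN application ID

# GET /batches lists the active batch sessions; each entry has "id" and "appId".
batches = requests.get(livy_url + '/batches', headers=headers).json().get('sessions', [])
for batch in batches:
    if batch.get('appId') == target_app_id:
        # DELETE /batches/{id} kills the batch and its YARN application.
        requests.delete('{}/batches/{}'.format(livy_url, batch['id']), headers=headers)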

0
votes

Sessions that were active when the Livy server was stopped may need to be killed manually. Use the tools from your cluster manager to achieve that (for example, the yarn command line tool).

Run the following command to find the application IDs of the interactive jobs started through Livy.

yarn application -list

Run the following command to kill those jobs.

yarn application -kill <application-id>
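
To automate the lookup the question asks for (killing a running job by name before resubmitting), the two commands can be combined. Here's a hedged Python sketch, assuming the yarn CLI is on PATH and that -list prints tab-separated columns with the application ID first and the application name second:

import subprocess

def kill_yarn_apps_by_name(app_name):
    # "yarn application -list" shows RUNNING/ACCEPTED applications by default.
    output = subprocess.check_output(['yarn', 'application', '-list']).decode()
    for line in output.splitlines():
        fields = [f.strip() for f in line.split('\t')]
        # Skip header/summary lines; data rows start with an application ID.
        if len(fields) < 2 or not fields[0].startswith('application_'):
            continue
        app_id, name = fields[0], fields[1]
        if name == app_name:
            subprocess.run(['yarn', 'application', '-kill', app_id])

kill_yarn_apps_by_name('my-spark-job')  # placeholder job name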

See https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-apache-spark-known-issues#livy-leaks-interactive-session