1. Access the Spark API
a. Internal access to the Azure Databricks Spark API from the driver node:
import requests

# The driver host and Spark UI port are exposed through the Spark configuration.
driverIp = spark.conf.get("spark.driver.host")
port = spark.conf.get("spark.ui.port")

# Query the Spark REST API directly on the driver.
url = F"http://{driverIp}:{port}/api/v1/applications"
r = requests.get(url, timeout=3.0)
r.status_code, r.text
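The /api/v1/applications endpoint returns a JSON array of application objects. To turn the raw response into something readable you can iterate over it; a minimal sketch, assuming the request above succeeded and using only the standard id and name fields of the Spark REST API:
# Minimal sketch: print the id and name of each application returned above.
for app in r.json():
    print(app["id"], app["name"])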
If, for example, you receive this error message when calling from the public API:
PERMISSION_DENIED: Traffic on this port is not permitted
then use the external access method in (b) below.
b. External access to the Azure Databricks Spark API:
import requests
"""
Programmatic access to the Databricks Spark UI / Spark REST API.
Works externally to the Databricks environment or from within a notebook.
Requires a Personal Access Token. Treat it like a password and do not store it in a notebook; use the Secrets API instead.
Requires Python 3.6+ (f-string support).
"""
# https://<databricks-host>/driver-proxy-api/o/0/<cluster_id>/<port>/api/v1/applications/<application-id-from-master-spark-ui>/stages/<stage-id>
# Inside a notebook these can be read from the Spark configuration;
# when running outside Databricks, hard-code the port and cluster ID instead.
port = spark.conf.get("spark.ui.port")
clusterId = spark.conf.get("spark.databricks.clusterUsageTags.clusterId")
host = "eastus2.azuredatabricks.net"  # your workspace host
workspaceId = "999999999999111"  # follows the 'o=' in the Databricks URLs, or zero
token = "dapideedeadbeefdeadbeefdeadbeef68ee3"  # Personal Access Token
url = F"https://{host}/driver-proxy-api/o/{workspaceId}/{clusterId}/{port}/api/v1/applications/?status=running"
r = requests.get(url, auth=("token", token))
# print Application list response
print(r.status_code, r.text)
applicationId = r.json()[0]['id'] # assumes only one response
url = F"https://{host}/driver-proxy-api/o/{workspaceId}/{clusterId}/{port}/api/v1/applications/{applicationId}/jobs"
r = requests.get(url, auth=("token", token))
print(r.status_code, r.json())
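As the docstring says, do not hard-code the Personal Access Token. When running from a notebook, one option is to read it from a secret scope via the Secrets API; a minimal sketch, where the scope and key names are placeholders for ones you create yourself:
# Minimal sketch: read the PAT from a Databricks secret scope instead of hard-coding it.
# "my-scope" and "spark-api-token" are placeholder names for a scope/key you create.
token = dbutils.secrets.get(scope="my-scope", key="spark-api-token")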
2. Sorry, no, not at this time.
The cluster logs would be where you'd look, but the user identity is not there.
To vote for and track this idea: https://ideas.databricks.com/ideas/DBE-I-313
How to get to the Ideas portal: https://docs.databricks.com/ideas.html