If you have followed this doc to set up Jupyter access by enabling Component Gateway, then you can access the Web Interfaces as described here. The trick is that the endpoint URLs are included in the API response of the v1beta2 version.
The changes needed in the code are minimal (no additional requirements apart from the google-cloud-dataproc library). Just replace dataproc_v1 with dataproc_v1beta2 and access the endpoints with response.config.endpoint_config:
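from google.cloud import dataproc_v1beta2

# Fill in your own project ID and cluster name.
project_id, cluster_name = '', ''
region = 'europe-west4'
# Use the regional endpoint that matches the cluster's region.
client = dataproc_v1beta2.ClusterControllerClient(
    client_options={
        'api_endpoint': '{}-dataproc.googleapis.com:443'.format(region)
    }
)
# Pass the arguments by keyword; newer releases of the library no longer accept them positionally.
response = client.get_cluster(
    project_id=project_id, region=region, cluster_name=cluster_name
)
# endpoint_config holds the Component Gateway URLs.
print(response.config.endpoint_config)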
In my case I get:
http_ports {
key: "HDFS NameNode"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/hdfs/dfshealth.html"
}
http_ports {
key: "Jupyter"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jupyter/"
}
http_ports {
key: "JupyterLab"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jupyter/lab/"
}
http_ports {
key: "MapReduce Job History"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jobhistory/"
}
http_ports {
key: "Spark History Server"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/sparkhistory/"
}
http_ports {
key: "Tez"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/apphistory/tez-ui/"
}
http_ports {
key: "YARN Application Timeline"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/apphistory/"
}
http_ports {
key: "YARN ResourceManager"
value: "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/yarn/"
}
enable_http_port_access: true
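If you only need one of these URLs (say, JupyterLab), you can look it up by key instead of printing the whole config. The snippet below is a minimal sketch, not part of the library: find_web_interface is a hypothetical helper name, and it assumes http_ports behaves like a regular string-to-string mapping, which is how map fields are normally exposed in Python. It runs on sample data copied from the output above; with a live cluster you would pass response.config.endpoint_config.http_ports instead.

```python
from typing import Mapping, Optional


def find_web_interface(http_ports: Mapping[str, str], name: str) -> Optional[str]:
    """Return the URL whose key matches `name` case-insensitively, or None.

    Hypothetical helper for illustration; not part of google-cloud-dataproc.
    """
    wanted = name.casefold()
    for key, url in http_ports.items():
        if key.casefold() == wanted:
            return url
    return None


# Sample data copied from the output above (assumption: the real field
# exposes the same keys and supports .items() like a dict).
sample_ports = {
    "Jupyter": "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jupyter/",
    "JupyterLab": "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/jupyter/lab/",
    "YARN ResourceManager": "https://REDACTED-dot-europe-west4.dataproc.googleusercontent.com/yarn/",
}

print(find_web_interface(sample_ports, "jupyterlab"))
```

The lookup is case-insensitive so that "jupyterlab" and "JupyterLab" both match, and it returns None for a component that is not enabled on the cluster, so check for that before using the result.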