1
votes

I'm trying to run Spark with Kubernetes as the scheduler.

It works fine when run from outside the Kubernetes cluster using kubectl proxy:

spark-shell --master k8s://http://localhost:8001 --conf spark.kubernetes.container.image=abdoumediaoptimise/spark

But whenever we try running spark-shell or spark-submit directly from within a pod, it never works, even after following the RBAC setup from the Spark docs and passing --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark. We get an authorization exception:

io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: GET at: https://kubernetes/api/v1/namespaces/default/pods?labelSelector=spark-app-selector%3Dspark-application-1574714537374,spark-role%3Dexecutor. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods is forbidden: User "system:serviceaccount:default:default" cannot list resource "pods" in API group "" in the namespace "default"

Any idea how to launch Spark from within pods? As it stands, this makes using Spark's k8s:// master with notebooks impossible.

Spark RBAC YAML file

apiVersion: v1
kind: ServiceAccount
metadata:
  name:  spark
  namespace: default
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: spark
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: edit
subjects:
  - kind: ServiceAccount
    name: spark
    namespace: default
1
Make sure you ran the following commands from the documentation with kubectl to create the required service account and cluster role binding: kubectl create serviceaccount spark and kubectl create clusterrolebinding spark-role --clusterrole=edit --serviceaccount=default:spark --namespace=default - Philippe Haussmann
@PhilippeHaussmann I did, but it's still not working - Abba
Could you share the YAML manifests of the relevant clusterrolebinding and serviceaccount? - mario
@mario I applied them with the kubectl instructions from Philippe above - Abba
Do you get the same error message after applying it? The message you quoted points out the cause of the issue quite precisely and is rather self-explanatory. - mario

1 Answer

0
votes

spark.kubernetes.authenticate.driver.serviceAccountName is the name of the ServiceAccount that the Spark driver's Kubernetes client uses to authenticate to the Kubernetes API when requesting executors.

You are looking for the spark.kubernetes.authenticate.submission.* options, which configure the Kubernetes client of the SparkSubmit application so that it can authenticate to the Kubernetes API and request the Service, ConfigMap and driver Pod.

To make it work, run the Pod you submit from under the ServiceAccount of interest by setting spec.serviceAccountName: <your-SA>. The error in the question names system:serviceaccount:default:default, which indicates that the submitting Pod is currently running under the default ServiceAccount rather than spark. After that, use the files that Kubernetes mounts into the Pod at /var/run/secrets/kubernetes.io/serviceaccount to configure the spark.kubernetes.authenticate.submission.* options.
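
As a rough sketch of that last step, the submission options can point at the ca.crt and token files from the mounted directory. The master URL k8s://https://kubernetes.default.svc is an assumption (the usual in-cluster API server address), the image and the spark ServiceAccount are taken from the question, and <your-app> is a placeholder for your own application:

```shell
#!/usr/bin/env bash
# Directory where Kubernetes mounts the Pod's ServiceAccount credentials.
SA_DIR=/var/run/secrets/kubernetes.io/serviceaccount

# Assumption: the API server is reachable at kubernetes.default.svc from inside the cluster.
# The image name and the "spark" ServiceAccount come from the question.
SUBMIT_ARGS=(
  --master k8s://https://kubernetes.default.svc
  --deploy-mode cluster
  --conf spark.kubernetes.container.image=abdoumediaoptimise/spark
  --conf spark.kubernetes.namespace=default
  --conf "spark.kubernetes.authenticate.submission.caCertFile=${SA_DIR}/ca.crt"
  --conf "spark.kubernetes.authenticate.submission.oauthTokenFile=${SA_DIR}/token"
  --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark
)

# Print the full command for review; drop the leading "echo" to actually submit.
# <your-app> is a placeholder for the application jar or script and its arguments.
echo spark-submit "${SUBMIT_ARGS[@]}" "<your-app>"
```

This covers cluster mode. For client mode (which is how spark-shell and most notebook kernels run), the Spark documentation says to use spark.kubernetes.authenticate.caCertFile and spark.kubernetes.authenticate.oauthTokenFile instead of the submission.* variants, so the two credential lines would need to be adjusted accordingly.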