2
votes

The kubernetes master in one of my GKE clusters became unresponsive last night following the infrastructure issue in us-central1-a.

Whenever I run "kubectl get pods" in the default namespace I get the following error message: Error from server: an error on the server has prevented the request from succeeding

If I run "kubectl get pods --namespace=kube-system", I only see the kube-proxy and the fluentd-logging daemon.

I have tried scaling the cluster down to 0 and then scaling it back up. I have also tried downgrading and upgrading the cluster, but that seems to apply only to the nodes (not the master). Is there any GKE/K8S API command to issue a restart to the Kubernetes master?

2
Did you try kube-down.sh? - Tom K.
No. I am not sure how to invoke a shell script on the Kubernetes master, since I only have access via the command line tools. - kgx
You don't have access to execute kube-down.sh on managed GKE. - Offenso

2 Answers

4
votes

There is no command that will allow you to restart the Kubernetes master in GKE, since the master is considered part of the managed service. Automated infrastructure (backed by an on-call engineer from Google) is responsible for restarting the master if it is unhealthy.

In this particular case, restarting the master had no effect on restoring it to normal behavior, because Google Compute Engine Incident #16011 caused an outage on 2016-06-28 for GKE masters running in us-central1-a (even though that isn't indicated on the Google Cloud Status Dashboard). During the incident, many masters were unavailable.

If you had tried to create a GCE cluster using kube-up.sh during that time, you would have similarly seen that it would be unable to create a functional master VM due to the SSD Persistent disk latency issues.

1
votes

I try to always keep at least one version upgrade available for the master. If you upgrade the master, it will restart and be working again within a few minutes. Otherwise you may have to wait around 3 days for the Google team to reboot it. Contacting them by e-mail or phone won't help, and unless you have paid support (switching to which takes a few days), they won't pay any attention.
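
For what it's worth, here is a rough sketch of how such a master-only upgrade could be issued with gcloud. The helper name `upgrade_master_cmd` and the cluster name `my-cluster` are made up for illustration; only the zone comes from the question. The function just prints the command so you can check it before running it yourself.

```shell
#!/bin/sh
# Hypothetical helper (not part of gcloud): builds the command that upgrades
# only the master of a GKE cluster. Arguments are placeholders you supply:
#   $1 = cluster name, $2 = zone, $3 = optional target master version
upgrade_master_cmd() {
  cluster="$1"
  zone="$2"
  version="$3"
  # --master tells gcloud to upgrade the master instead of the nodes.
  cmd="gcloud container clusters upgrade $cluster --master --zone $zone"
  # Pin a specific version only if one was given.
  if [ -n "$version" ]; then
    cmd="$cmd --cluster-version $version"
  fi
  echo "$cmd"
}

# Print the command for an assumed cluster called "my-cluster".
upgrade_master_cmd my-cluster us-central1-a ""
```

Before running the printed command, `gcloud container get-server-config --zone us-central1-a` should list the master versions that are currently valid targets. Whether an upgrade actually revives a master during a zone-wide incident is not guaranteed.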