  1. Kubernetes version (use kubectl version):

    Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.4", GitCommit:"d6f433224538d4f9ca2f7ae19b252e6fcb66a3ae", GitTreeState:"clean", BuildDate:"2017-05-19T18:44:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/arm64"}
    Server Version: version.Info{Major:"1", Minor:"5", GitVersion:"v1.5.2", GitCommit:"08e099554f3c31f6e6f07b448ab3ed78d0520507", GitTreeState:"clean", BuildDate:"2017-01-12T04:52:34Z", GoVersion:"go1.7.4", Compiler:"gc", Platform:"linux/arm64"}
    
  2. Environment:

        OS (e.g. from /etc/os-release):
        NAME="Ubuntu"
        VERSION="16.04.2 LTS (Xenial Xerus)"
        ID=ubuntu
        ID_LIKE=debian
        PRETTY_NAME="Ubuntu 16.04.2 LTS"
        VERSION_ID="16.04"
        HOME_URL="http://www.ubuntu.com/"
        SUPPORT_URL="http://help.ubuntu.com/"
        BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
        VERSION_CODENAME=xenial
        UBUNTU_CODENAME=xenial
    
  3. Kernel (e.g. uname -a):

        Linux node4 4.11.0-rc6-next-20170411-00286-gcc55807 #0 SMP PREEMPT Mon Jun 5 18:56:20 CST 2017 aarch64 aarch64 aarch64 GNU/Linux
    
  4. What happened:

        I want to use kube-deploy/master.sh to set up a master on ARM64, but I encountered this error when visiting $myip:8080/ui:
    
        {
        "kind": "Status",
        "apiVersion": "v1",
        "metadata": {},
        "status": "Failure",
        "message": "no endpoints available for service "kubernetes-dashboard"",
        "reason": "ServiceUnavailable",
        "code": 503
        }
        My branch is from 2017-2-7 (c8d6fbfc…).
        By the way, it works on the x86-amd64 platform using the same installation steps.
    
  5. Anything else we need to know:

5.1 kubectl get pod --namespace=kube-system

        k8s-master-10.193.20.23 4/4 Running 17 1h
        k8s-proxy-v1-sk8vd 1/1 Running 0 1h
        kube-addon-manager-10.193.20.23 2/2 Running 2 1h
        kube-dns-3365905565-xvj7n 2/4 CrashLoopBackOff 65 1h
        kubernetes-dashboard-1416335539-lhlhz 0/1 CrashLoopBackOff 22 1h

5.2 kubectl describe pods kubernetes-dashboard-1416335539-lhlhz --namespace=kube-system

        Name:   kubernetes-dashboard-1416335539-lhlhz
        Namespace:  kube-system
        Node:   10.193.20.23/10.193.20.23
        Start Time: Mon, 12 Jun 2017 10:04:07 +0800
        Labels: k8s-app=kubernetes-dashboard
        pod-template-hash=1416335539
        Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"kubernetes-dashboard-1416335539","uid":"6ab170d2-4f13-11e7-a...
        scheduler.alpha.kubernetes.io/critical-pod=
        scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
        Status: Running
        IP: 10.1.70.2
        Controllers:    ReplicaSet/kubernetes-dashboard-1416335539
        Containers:
        kubernetes-dashboard:
        Container ID:   docker://fbdbe4c047803b0e98ca7412ca617031f1f31d881e3a5838298a1fda24a1ae18
        Image:  gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0
        Image ID:   docker-pullable://gcr.io/google_containers/kubernetes-dashboard-arm64@sha256:559d58ef0d8e9dbe78f80060401b97d6262462318c0b8e071937a73896ea1d3d
        Port:   9090/TCP
        State:  Running
        Started:    Mon, 12 Jun 2017 11:30:03 +0800
        Last State: Terminated
        Reason: Error
        Exit Code:  1
        Started:    Mon, 12 Jun 2017 11:24:28 +0800
        Finished:   Mon, 12 Jun 2017 11:24:59 +0800
        Ready:  True
        Restart Count:  23
        Limits:
        cpu:    100m
        memory: 50Mi
        Requests:
        cpu:    100m
        memory: 50Mi
        Liveness:   http-get http://:9090/ delay=30s timeout=30s period=10s #success=1 #failure=3
        Environment:    
        Mounts:
        /var/run/secrets/kubernetes.io/serviceaccount from default-token-0mnn8 (ro)
        Conditions:
        Type    Status
        Initialized True
        Ready True
        PodScheduled True
        Volumes:
        default-token-0mnn8:
        Type:   Secret (a volume populated by a Secret)
        SecretName: default-token-0mnn8
        Optional:   false
        QoS Class:  Guaranteed
        Node-Selectors: 
        Tolerations:    
        Events:
        FirstSeen   LastSeen    Count   From    SubObjectPath   Type    Reason  Message

        30m 30m 1   kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Killing Killing container with docker id b0562b3640ae: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
        18m 18m 1   kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Killing Killing container with docker id 477066c3a00f: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
        12m 12m 1   kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Killing Killing container with docker id 3e021d6df31f: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
        11m 11m 1   kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Killing Killing container with docker id 43fe3c37817d: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
        5m  5m  1   kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Killing Killing container with docker id 23cea72e1f45: pod "kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)" container "kubernetes-dashboard" is unhealthy, it will be killed and re-created.
        1h  5m  7   kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Warning Unhealthy   Liveness probe failed: Get http://10.1.70.2:9090/: dial tcp 10.1.70.2:9090: getsockopt: connection refused
        1h  38s 335 kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Warning BackOff Back-off restarting failed docker container
        1h  38s 307 kubelet, 10.193.20.23   Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "kubernetes-dashboard" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kubernetes-dashboard pod=kubernetes-dashboard-1416335539-lhlhz_kube-system(6ab54dba-4f13-11e7-a56b-6805ca369d7f)"

        1h  27s 24  kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Pulled  Container image "gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0" already present on machine
        59m 23s 15  kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Created (events with common reason combined)
        59m 22s 15  kubelet, 10.193.20.23   spec.containers{kubernetes-dashboard}   Normal  Started (events with common reason combined)

5.3 kubectl get svc,ep,rc,rs,deploy,pod -o wide --all-namespaces

    NAMESPACE     NAME                       CLUSTER-IP   EXTERNAL-IP   PORT(S)         AGE   SELECTOR
    default       svc/kubernetes             10.0.0.1                   443/TCP         16m
    kube-system   svc/kube-dns               10.0.0.10                  53/UDP,53/TCP   16m   k8s-app=kube-dns
    kube-system   svc/kubernetes-dashboard   10.0.0.95                  80/TCP          16m   k8s-app=kubernetes-dashboard

    NAMESPACE     NAME                         ENDPOINTS           AGE
    default       ep/kubernetes                10.193.20.23:6443   16m
    kube-system   ep/kube-controller-manager   <none>              11m
    kube-system   ep/kube-dns                                      16m
    kube-system   ep/kube-scheduler            <none>              11m
    kube-system   ep/kubernetes-dashboard                          16m

    NAMESPACE     NAME                                 DESIRED   CURRENT   READY     AGE       CONTAINER(S)                              IMAGE(S)                                                                                                                                                                                       SELECTOR
    kube-system   rs/kube-dns-3365905565               1         1         0         16m       kubedns,dnsmasq,dnsmasq-metrics,healthz   gcr.io/google_containers/kubedns-arm64:1.9,gcr.io/google_containers/kube-dnsmasq-arm64:1.4,gcr.io/google_containers/dnsmasq-metrics-arm64:1.0,gcr.io/google_containers/exechealthz-arm64:1.2   k8s-app=kube-dns,pod-template-hash=3365905565
    kube-system   rs/kubernetes-dashboard-1416335539   1         1         0         16m       kubernetes-dashboard                      gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0                                                                                                                                     k8s-app=kubernetes-dashboard,pod-template-hash=1416335539

    NAMESPACE     NAME                          DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE       CONTAINER(S)                              IMAGE(S)                                                                                                                                                                                       SELECTOR
    kube-system   deploy/kube-dns               1         1         1            0           16m       kubedns,dnsmasq,dnsmasq-metrics,healthz   gcr.io/google_containers/kubedns-arm64:1.9,gcr.io/google_containers/kube-dnsmasq-arm64:1.4,gcr.io/google_containers/dnsmasq-metrics-arm64:1.0,gcr.io/google_containers/exechealthz-arm64:1.2   k8s-app=kube-dns
    kube-system   deploy/kubernetes-dashboard   1         1         1            0           16m       kubernetes-dashboard                      gcr.io/google_containers/kubernetes-dashboard-arm64:v1.5.0                                                                                                                                     k8s-app=kubernetes-dashboard

    NAMESPACE     NAME                                       READY     STATUS             RESTARTS   AGE       IP             NODE
    kube-system   po/k8s-master-10.193.20.23                 4/4       Running            50         15m       10.193.20.23   10.193.20.23
    kube-system   po/k8s-proxy-v1-5b831                      1/1       Running            0          16m       10.193.20.23   10.193.20.23
    kube-system   po/kube-addon-manager-10.193.20.23         2/2       Running            6          15m       10.193.20.23   10.193.20.23
    kube-system   po/kube-dns-3365905565-jxg4f               1/4       CrashLoopBackOff   20         16m       10.1.5.3       10.193.20.23
    kube-system   po/kubernetes-dashboard-1416335539-frt3v   0/1       CrashLoopBackOff   7          16m       10.1.5.2       10.193.20.23



5.4 kubectl describe pods kube-dns-3365905565-lb0mq --namespace=kube-system
Name:       kube-dns-3365905565-lb0mq
Namespace:  kube-system
Node:       10.193.20.23/10.193.20.23
Start Time: Wed, 14 Jun 2017 10:43:46 +0800
Labels:     k8s-app=kube-dns
        pod-template-hash=3365905565
Annotations:    kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"kube-system","name":"kube-dns-3365905565","uid":"4870aec2-50ab-11e7-a420-6805ca36...
        scheduler.alpha.kubernetes.io/critical-pod=
        scheduler.alpha.kubernetes.io/tolerations=[{"key":"CriticalAddonsOnly", "operator":"Exists"}]
Status:     Running
IP:     10.1.20.3
Controllers:    ReplicaSet/kube-dns-3365905565
Containers:
  kubedns:
    Container ID:   docker://729562769b48be60a02b62692acd3d1e1c67ac2505f4cb41240067777f45fd77
    Image:      gcr.io/google_containers/kubedns-arm64:1.9
    Image ID:       docker-pullable://gcr.io/google_containers/kubedns-arm64@sha256:3c78a2c5b9b86c5aeacf9f5967f206dcf1e64362f3e7f274c1c078c954ecae38
    Ports:      10053/UDP, 10053/TCP, 10055/TCP
    Args:
      --domain=cluster.local.
      --dns-port=10053
      --config-map=kube-dns
      --v=0
    State:      Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Wed, 14 Jun 2017 10:56:29 +0800
      Finished:     Wed, 14 Jun 2017 10:58:06 +0800
    Ready:      False
    Restart Count:  6
    Limits:
      memory:   170Mi
    Requests:
      cpu:  100m
      memory:   70Mi
    Liveness:   http-get http://:8080/healthz-kubedns delay=60s timeout=5s period=10s #success=1 #failure=5
    Readiness:  http-get http://:8081/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment:
      PROMETHEUS_PORT:  10055
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
  dnsmasq:
    Container ID:   docker://b6d7e98a4af2715294764929f901947ab3b985be45d9f213245bd338ab8c3101
    Image:      gcr.io/google_containers/kube-dnsmasq-arm64:1.4
    Image ID:       docker-pullable://gcr.io/google_containers/kube-dnsmasq-arm64@sha256:dff5f9e2a521816aa314d469fd8ef961270fe43b4a74bab490385942103f3728
    Ports:      53/UDP, 53/TCP
    Args:
      --cache-size=1000
      --no-resolv
      --server=127.0.0.1#10053
      --log-facility=-
    State:      Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Error
      Exit Code:    137
      Started:      Wed, 14 Jun 2017 10:55:50 +0800
      Finished:     Wed, 14 Jun 2017 10:57:26 +0800
    Ready:      False
    Restart Count:  6
    Requests:
      cpu:      150m
      memory:       10Mi
    Liveness:       http-get http://:8080/healthz-dnsmasq delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
  dnsmasq-metrics:
    Container ID:   docker://51693aea0e732e488b631dcedc082f5a9e23b5b74857217cf005d1e947375367
    Image:      gcr.io/google_containers/dnsmasq-metrics-arm64:1.0
    Image ID:       docker-pullable://gcr.io/google_containers/dnsmasq-metrics-arm64@sha256:fc0e8b676a26ed0056b8c68611b74b9b5f3f00c608e5b11ef1608484ce55dd9a
    Port:       10054/TCP
    Args:
      --v=2
      --logtostderr
    State:      Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       ContainerCannotRun
      Exit Code:    128
      Started:      Wed, 14 Jun 2017 10:57:28 +0800
      Finished:     Wed, 14 Jun 2017 10:57:28 +0800
    Ready:      False
    Restart Count:  7
    Requests:
      memory:       10Mi
    Liveness:       http-get http://:10054/metrics delay=60s timeout=5s period=10s #success=1 #failure=5
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
  healthz:
    Container ID:   docker://fab7ef143a95ad4d2f6363d5fcdc162eba1522b92726665916462be765289327
    Image:      gcr.io/google_containers/exechealthz-arm64:1.2
    Image ID:       docker-pullable://gcr.io/google_containers/exechealthz-arm64@sha256:e8300fde6c36b454cc00b5fffc96d6985622db4d8eb42a9f98f24873e9535b5c
    Port:       8080/TCP
    Args:
      --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1 >/dev/null
      --url=/healthz-dnsmasq
      --cmd=nslookup kubernetes.default.svc.cluster.local 127.0.0.1:10053 >/dev/null
      --url=/healthz-kubedns
      --port=8080
      --quiet
    State:      Running
      Started:      Wed, 14 Jun 2017 10:44:31 +0800
    Ready:      True
    Restart Count:  0
    Limits:
      memory:   50Mi
    Requests:
      cpu:      10m
      memory:       50Mi
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-1t5v9 (ro)
Conditions:
  Type      Status
  Initialized   True 
  Ready     False 
  PodScheduled  True 
Volumes:
  default-token-1t5v9:
    Type:   Secret (a volume populated by a Secret)
    SecretName: default-token-1t5v9
    Optional:   false
QoS Class:  Burstable
Node-Selectors: <none>
Tolerations:    <none>
Events:
  FirstSeen LastSeen    Count   From            SubObjectPath               Type        Reason      Message
  --------- --------    -----   ----            -------------               --------    ------      -------
  15m       15m     1   default-scheduler                       Normal      Scheduled   Successfully assigned kube-dns-3365905565-lb0mq to 10.193.20.23
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{kubedns}        Normal      Created     Created container with docker id 2fef2db445e6; Security:[seccomp=unconfined]
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{kubedns}        Normal      Started     Started container with docker id 2fef2db445e6
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{dnsmasq}        Normal      Created     Created container with docker id 41ec998eeb76; Security:[seccomp=unconfined]
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{dnsmasq}        Normal      Started     Started container with docker id 41ec998eeb76
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{dnsmasq-metrics}    Normal      Created     Created container with docker id 676ef0e877c8; Security:[seccomp=unconfined]
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{healthz}        Normal      Pulled      Container image "gcr.io/google_containers/exechealthz-arm64:1.2" already present on machine
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{dnsmasq-metrics}    Warning     Failed      Failed to start container with docker id 676ef0e877c8 with error: Error response from daemon: {"message":"linux spec user: unable to find group nobody: no matching entries in group file"}
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{healthz}        Normal      Created     Created container with docker id fab7ef143a95; Security:[seccomp=unconfined]
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{healthz}        Normal      Started     Started container with docker id fab7ef143a95
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{dnsmasq-metrics}    Warning     Failed      Failed to start container with docker id 45f6bd7f1f3a with error: Error response from daemon: {"message":"linux spec user: unable to find group nobody: no matching entries in group file"}
  14m       14m     1   kubelet, 10.193.20.23   spec.containers{dnsmasq-metrics}    Normal      Created     Created container with docker id 45f6bd7f1f3a; Security:[seccomp=unconfined]
  14m       14m     1   kubelet, 10.193.20.23                       Warning     FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "dnsmasq-metrics" with CrashLoopBackOff: "Back-off 10s restarting failed container=dnsmasq-metrics pod=kube-dns-3365905565-lb0mq_kube-system(48845c1a-50ab-11e7-a420-6805ca369d7f)"

  14m   14m 1   kubelet, 10.193.20.23   spec.containers{dnsmasq-metrics}    Normal  Created     Created container with docker id 2d1e5adb97bb; Security:[seccomp=unconfined]
  14m   14m 1   kubelet, 10.193.20.23   spec.containers{dnsmasq-metrics}    Warning Failed      Failed to start container with docker id 2d1e5adb97bb with error: Error response from daemon: {"message":"linux spec user: unable to find group nobody: no matching entries in group file"}
  14m   14m 2   kubelet, 10.193.20.23                       Warning FailedSync  Error syncing pod, skipping: failed to "StartContainer" for "dnsmasq-metrics" with CrashLoopBackOff: "Back-off 20s restarting failed container=dnsmasq-metrics pod=kube-dns-3365905565-lb0mq_kube-system(48845c1a-50ab-11e7-a420-6805ca369d7f)"
Can you post the output of kubectl get svc,ep,rc,rs,deploy,pod -o wide --all-namespaces? – Janos Lenart
Thanks, this is the output. – hongbo wang
@Janos Lenart Hi Janos, I have posted the output. By the way, I chose k8s version v1.5.2 for compatibility; the latest version doesn't work in my demo. – hongbo wang

1 Answer


So it looks like you have hit one (or several) bugs in Kubernetes. I suggest you retry with a more recent version (and possibly another Docker version too). It would also be a good idea to report these bugs (https://github.com/kubernetes/dashboard/issues).

All in all, bear in mind that Kubernetes on ARM is an advanced topic; you should expect problems and be ready to debug and resolve them :/


There might be a problem with that Docker image (gcr.io/google_containers/dnsmasq-metrics-arm64). Non-amd64 stuff is not well tested.

Could you try running:

    kubectl set image --namespace=kube-system deployment/kube-dns dnsmasq-metrics=lenart/dnsmasq-metrics-arm64:1.0
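
If the image swap helps, the dnsmasq-metrics container should stop crash-looping. You can watch the pod, for example, using the k8s-app=kube-dns label from your output above:

    # Watch the kube-dns pod until its READY column reaches 4/4
    kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -w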

You can't reach the dashboard because the dashboard Pod is unhealthy: it keeps failing its liveness probe and getting restarted, so it is never Ready. Because it's not Ready it isn't considered for the dashboard Service, so the Service has no endpoints, which leads to the error message you reported.
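
You can see that relationship directly, for example:

    # A Pod only shows up under the Service's endpoints once it is Ready
    kubectl get endpoints kubernetes-dashboard --namespace=kube-system
    kubectl get pods --namespace=kube-system -l k8s-app=kubernetes-dashboard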

The dashboard is most likely unhealthy because kube-dns is not ready (1/4 containers in the Pod ready, should be 4/4).
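
To confirm that, look at the individual containers in the kube-dns pod, for example (pod and container names taken from your describe output above):

    # Check each container's logs in the failing kube-dns pod
    kubectl logs --namespace=kube-system kube-dns-3365905565-lb0mq -c kubedns
    kubectl logs --namespace=kube-system kube-dns-3365905565-lb0mq -c dnsmasq
    kubectl logs --namespace=kube-system kube-dns-3365905565-lb0mq -c dnsmasq-metrics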

kube-dns is most likely unhealthy because you have no pod networking (overlay network) deployed.
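
A quick sanity check for whether any network add-on is running at all (the grep pattern just lists a few common add-on names):

    # If this prints nothing, no overlay network (Weave, Flannel, Calico, ...) is deployed
    kubectl get pods --namespace=kube-system -o wide | grep -Ei 'weave|flannel|calico'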

Go to the add-ons page, pick a network add-on and deploy it. Weave has a 1.5-compatible version and requires no setup.
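
Deploying Weave Net on a 1.5 cluster is a single kubectl apply of its published manifest, something along these lines (double-check the exact manifest URL against the Weave Net docs for your version):

    # Deploy the Weave Net daemonset; verify the manifest URL in the Weave Net docs
    kubectl apply -f https://git.io/weave-kube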

After you have done that, give it a few minutes. If you are impatient, just delete the kubernetes-dashboard and kube-dns pods (not the Deployment/controller!). If this does not resolve your problem, please update your question with the new information.
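
Deleting only the Pods is safe because their Deployments recreate them, for example:

    # Delete just the pods; the Deployments/ReplicaSets recreate them automatically
    kubectl delete pod --namespace=kube-system -l k8s-app=kube-dns
    kubectl delete pod --namespace=kube-system -l k8s-app=kubernetes-dashboard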