2
votes

I've set up a Kubernetes cluster with three nodes. All of them report status Ready, but the scheduler does not seem to see one of them. How could this happen?

[root@master1 app]# kubectl get nodes
NAME          LABELS                                         STATUS    AGE
172.16.0.44   kubernetes.io/hostname=172.16.0.44,pxc=node1   Ready     8d
172.16.0.45   kubernetes.io/hostname=172.16.0.45             Ready     8d
172.16.0.46   kubernetes.io/hostname=172.16.0.46             Ready     8d

I use a nodeSelector in my RC file like this:

  nodeSelector:
    pxc: node1
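For context, the selector has to sit under the pod template's spec, not at the top level of the RC. A minimal sketch of the full manifest, reconstructed from the names in the question (the mountPath is an assumption, MongoDB's default data directory):

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: mongo-controller
  namespace: kube-system
spec:
  replicas: 1
  selector:
    k8s-app: mongo
  template:
    metadata:
      labels:
        k8s-app: mongo
    spec:
      nodeSelector:
        pxc: node1            # must match the node's label exactly
      containers:
      - name: mongo
        image: mongo
        volumeMounts:
        - name: mongo-persistent-storage
          mountPath: /data/db  # assumed; adjust to your image's data dir
      volumes:
      - name: mongo-persistent-storage
        hostPath:
          path: /k8s/mongodb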

Describe the RC:

Name:       mongo-controller
Namespace:  kube-system
Image(s):   mongo
Selector:   k8s-app=mongo
Labels:     k8s-app=mongo
Replicas:   1 current / 1 desired
Pods Status:    0 Running / 1 Waiting / 0 Succeeded / 0 Failed
Volumes:
  mongo-persistent-storage:
    Type:   HostPath (bare host directory volume)
    Path:   /k8s/mongodb
Events:
  FirstSeen LastSeen    Count   From                SubobjectPath   Reason          Message
  ───────── ────────    ─────   ────                ─────────────   ──────          ───────
  25m       25m     1   {replication-controller }           SuccessfulCreate    Created pod: mongo-controller-0wpwu

The pod stays Pending:

[root@master1 app]# kubectl get pods mongo-controller-0wpwu --namespace=kube-system
NAME                     READY     STATUS    RESTARTS   AGE
mongo-controller-0wpwu   0/1       Pending   0          27m

describe pod mongo-controller-0wpwu:

[root@master1 app]# kubectl describe pod mongo-controller-0wpwu --namespace=kube-system
Name:               mongo-controller-0wpwu
Namespace:          kube-system
Image(s):           mongo
Node:               /
Labels:             k8s-app=mongo
Status:             Pending
Reason:
Message:
IP:
Replication Controllers:    mongo-controller (1/1 replicas created)
Containers:
  mongo:
    Container ID:
    Image:      mongo
    Image ID:
    QoS Tier:
      cpu:      BestEffort
      memory:       BestEffort
    State:      Waiting
    Ready:      False
    Restart Count:  0
    Environment Variables:
Volumes:
  mongo-persistent-storage:
    Type:   HostPath (bare host directory volume)
    Path:   /k8s/mongodb
  default-token-7qjcu:
    Type:   Secret (a secret that should populate this volume)
    SecretName: default-token-7qjcu
Events:
  FirstSeen LastSeen    Count   From            SubobjectPath   Reason          Message
  ───────── ────────    ─────   ────            ─────────────   ──────          ───────
  22m       37s     12  {default-scheduler }            FailedScheduling    pod (mongo-controller-0wpwu) failed to fit in any node
fit failure on node (172.16.0.46): MatchNodeSelector
fit failure on node (172.16.0.45): MatchNodeSelector

  27m   9s  67  {default-scheduler }        FailedScheduling    pod (mongo-controller-0wpwu) failed to fit in any node
fit failure on node (172.16.0.45): MatchNodeSelector
fit failure on node (172.16.0.46): MatchNodeSelector

Looking at the IP list in the events, 172.16.0.44 does not seem to be seen by the scheduler at all. How could this happen?

Describe node 172.16.0.44:

[root@master1 app]# kubectl describe nodes --namespace=kube-system
Name:           172.16.0.44
Labels:         kubernetes.io/hostname=172.16.0.44,pxc=node1
CreationTimestamp:  Wed, 30 Mar 2016 15:58:47 +0800
Phase:
Conditions:
  Type      Status      LastHeartbeatTime           LastTransitionTime          Reason          Message
  ────      ──────      ─────────────────           ──────────────────          ──────          ───────
  Ready     True        Fri, 08 Apr 2016 12:18:01 +0800     Fri, 08 Apr 2016 11:18:52 +0800     KubeletReady        kubelet is posting ready status
  OutOfDisk     Unknown     Wed, 30 Mar 2016 15:58:47 +0800     Thu, 07 Apr 2016 17:38:50 +0800     NodeStatusNeverUpdated  Kubelet never posted node status.
Addresses:  172.16.0.44,172.16.0.44
Capacity:
 cpu:       2
 memory:    7748948Ki
 pods:      40
System Info:
 Machine ID:            45461f76679f48ee96e95da6cc798cc8
 System UUID:           2B850D4F-953C-4C20-B182-66E17D5F6461
 Boot ID:           40d2cd8d-2e46-4fef-92e1-5fba60f57965
 Kernel Version:        3.10.0-123.9.3.el7.x86_64
 OS Image:          CentOS Linux 7 (Core)
 Container Runtime Version: docker://1.10.1
 Kubelet Version:       v1.2.0
 Kube-Proxy Version:        v1.2.0
ExternalID:         172.16.0.44
Non-terminated Pods:        (1 in total)
  Namespace         Name                    CPU Requests    CPU Limits  Memory Requests Memory Limits
  ─────────         ────                    ────────────    ──────────  ─────────────── ─────────────
  kube-system           kube-registry-proxy-172.16.0.44     100m (5%)   100m (5%)   50Mi (0%)   50Mi (0%)
Allocated resources:
  (Total limits may be over 100%, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits  Memory Requests Memory Limits
  ────────────  ──────────  ─────────────── ─────────────
  100m (5%) 100m (5%)   50Mi (0%)   50Mi (0%)
Events:
  FirstSeen LastSeen    Count   From            SubobjectPath   Reason      Message
  ───────── ────────    ─────   ────            ─────────────   ──────      ───────
  59m       59m     1   {kubelet 172.16.0.44}           Starting    Starting kubelet.

After logging into .44 via SSH, I see that disk space is free (I also removed some Docker images and containers):

[root@iZ25dqhvvd0Z ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/xvda1       40G  2.6G   35G   7% /
devtmpfs        3.9G     0  3.9G   0% /dev
tmpfs           3.7G     0  3.7G   0% /dev/shm
tmpfs           3.7G  143M  3.6G   4% /run
tmpfs           3.7G     0  3.7G   0% /sys/fs/cgroup
/dev/xvdb        40G  361M   37G   1% /k8s

Still, `docker logs` on the scheduler (v1.3.0-alpha.1) shows this:

E0408 05:28:42.679448       1 factory.go:387] Error scheduling kube-system mongo-controller-0wpwu: pod (mongo-controller-0wpwu) failed to fit in any node
fit failure on node (172.16.0.45): MatchNodeSelector
fit failure on node (172.16.0.46): MatchNodeSelector
; retrying
I0408 05:28:42.679577       1 event.go:216] Event(api.ObjectReference{Kind:"Pod", Namespace:"kube-system", Name:"mongo-controller-0wpwu", UID:"2d0f0844-fd3c-11e5-b531-00163e000727", APIVersion:"v1", ResourceVersion:"634139", FieldPath:""}): type: 'Warning' reason: 'FailedScheduling' pod (mongo-controller-0wpwu) failed to fit in any node
fit failure on node (172.16.0.45): MatchNodeSelector
fit failure on node (172.16.0.46): MatchNodeSelector
3

This solution may help you: How to restart kubernetes nodes? (CHENJIAN)

3 Answers

2
votes

Thanks for your reply, Robert. I resolved this by doing the following:

kubectl delete rc
kubectl delete node 172.16.0.44
stop kubelet on 172.16.0.44
rm -rf /k8s/*
restart kubelet
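For anyone following along, the steps above roughly correspond to the commands below (the RC name is taken from the question; the systemd unit name for the kubelet is an assumption and may differ on your install):

```shell
# On the master: remove the RC and deregister the stuck node
kubectl delete rc mongo-controller
kubectl delete node 172.16.0.44

# On 172.16.0.44 itself: stop the kubelet, clear the data dir, restart
systemctl stop kubelet      # unit name assumed
rm -rf /k8s/*
systemctl start kubelet
```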

Now the node is Ready, and the OutOfDisk condition is gone.

Name:           172.16.0.44
Labels:         kubernetes.io/hostname=172.16.0.44,pxc=node1
CreationTimestamp:  Fri, 08 Apr 2016 15:14:51 +0800
Phase:
Conditions:
  Type      Status  LastHeartbeatTime           LastTransitionTime          Reason      Message
  ────      ──────  ─────────────────           ──────────────────          ──────      ───────
  Ready     True    Fri, 08 Apr 2016 15:25:33 +0800     Fri, 08 Apr 2016 15:14:50 +0800     KubeletReady    kubelet is posting ready status
Addresses:  172.16.0.44,172.16.0.44
Capacity:
 cpu:       2
 memory:    7748948Ki
 pods:      40
System Info:
 Machine ID:            45461f76679f48ee96e95da6cc798cc8
 System UUID:           2B850D4F-953C-4C20-B182-66E17D5F6461
 Boot ID:           40d2cd8d-2e46-4fef-92e1-5fba60f57965
 Kernel Version:        3.10.0-123.9.3.el7.x86_64
 OS Image:          CentOS Linux 7 (Core)

I found this: https://github.com/kubernetes/kubernetes/issues/4135, but I still don't know why the kubelet thinks the node is out of disk when the disk space is actually free...

1
votes

The scheduler failed because it couldn't fit the pod onto any node it considered schedulable. If you look at the conditions for your node, the OutOfDisk condition is Unknown. The scheduler is probably not willing to place a pod onto a node that it thinks may not have available disk space.
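A quick way to check this on each node (this assumes a kubectl with JSONPath filter support; the OutOfDisk condition type existed in releases of this era but was later removed in favour of eviction thresholds):

```shell
# Print each node's name with its OutOfDisk condition status.
# A status of Unknown means the kubelet never reported it, and the
# scheduler may refuse to place pods on that node.
kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="OutOfDisk")].status}{"\n"}{end}'
```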

0
votes

We had the same issue on AWS when they changed DNS from IP=DNS name to IP=IP in eu-central: nodes showed Ready but were not reachable via their name.