
I am trying to set up my first cluster using Kubernetes 1.13.1. The master initialized okay, but both of my worker nodes are NotReady. kubectl describe node shows that the kubelet stopped posting node status on both worker nodes. On one of the worker nodes I get log output like

> kubelet[3680]: E0107 20:37:21.196128    3680 kubelet.go:2266] node
> "xyz" not found.

Here are the full details:

I am using CentOS 7 and Kubernetes 1.13.1.

Initializing was done as follows:

[root@master ~]# kubeadm init --apiserver-advertise-address=10.142.0.4 --pod-network-cidr=10.142.0.0/24

Successfully initialized the cluster:

You can now join any number of machines by running the following on each node
as root:
`kubeadm join 10.142.0.4:6443 --token y0epoc.zan7yp35sow5rorw --discovery-token-ca-cert-hash sha256:f02d43311c2696e1a73e157bda583247b9faac4ffb368f737ee9345412c9dea4`

deployed the flannel CNI:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

The join command worked fine.

[kubelet-start] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "node01" as an annotation

This node has joined the cluster:

* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.
Run 'kubectl get nodes' on the master to see this node join the cluster.

Result of kubectl get nodes:

[root@master ~]# kubectl get nodes

NAME     STATUS     ROLES    AGE   VERSION
master   Ready      master   9h    v1.13.1
node01   NotReady   <none>   9h    v1.13.1
node02   NotReady   <none>   9h    v1.13.1

on both nodes:

[root@node01 ~]# service kubelet status
Redirecting to /bin/systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Tue 2019-01-08 04:49:20 UTC; 32s ago
     Docs: https://kubernetes.io/docs/
 Main PID: 4224 (kubelet)
   Memory: 31.3M
   CGroup: /system.slice/kubelet.service
           └─4224 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfi

`Jan 08 04:54:10 node01 kubelet[4224]: E0108 04:54:10.957115    4224 kubelet.go:2266] node "node01" not found`

I would appreciate your advice on how to troubleshoot this.

There are an overwhelming number of ways to set up a Kubernetes cluster: kubeadm, kubespray and kops, to name a few. It is virtually impossible to tell what you are doing wrong, since you are not telling us what you are doing. I suggest you double-check that you executed all the steps required by the installation procedure you are following and did not miss anything. A wild guess would be that you have not configured your pod network properly, but it really could be anything. - Andrew Savinykh
I reformatted this question to make it clearer which specific kubectl command you ran and to highlight the output. It would be helpful if you could edit this further to add details of what process you're using to set up the cluster. - David Maze
@AndrewSavinykh thanks for the help, edited with more info - Falcon
@Falcon, hey, did you double-check that you have not missed a step? It sounds like you have not configured your pod network properly. This is the kubeadm guide to check against. - Andrew Savinykh
Could you please provide the full kubelet logs from node01: `journalctl -u kubelet` - Prafull Ladha

3 Answers

1 vote

The previous answer sounds correct. You can verify that by running `kubectl describe node node01` on the master, or wherever kubectl is correctly configured.
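For example (a sketch; `node01` is the node name from the question, and the exact condition messages vary by version):

```shell
# On the master (or any machine with a working kubeconfig):
kubectl describe node node01

# In the output, check two places:
#  - the Conditions table: a NotReady node typically shows
#    Ready=False or Ready=Unknown with a reason such as
#    "Kubelet stopped posting node status"
#  - the Events list at the bottom, which usually names the
#    missing piece (CNI not initialized, cert problems, etc.)
```

The Conditions and Events sections are usually enough to tell whether the problem is the pod network, certificates, or the kubelet itself.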

0 votes

It seems like the reason for this error is an incorrect subnet. The Flannel documentation says you should use a /16, not a /24, for the pod network.

> NOTE: If kubeadm is used, then pass --pod-network-cidr=10.244.0.0/16 to kubeadm init to ensure that the podCIDR is set.

I tried running kubeadm with a /24, and although the nodes reached the Ready state, the flannel pods did not run properly, which resulted in some issues.

You can check whether your flannel pods are running properly with `kubectl get pods -n kube-system`; if the status is anything other than Running, something is wrong. In that case you can get details by running `kubectl describe pod PODNAME -n kube-system`. Try changing the subnet and let us know if that fixed the problem.
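A sketch of redoing the initialization with the CIDR flannel expects by default (this assumes you can afford to reset the cluster; `kubeadm reset` wipes the node's cluster state):

```shell
# On the master AND each worker: tear down the existing state.
kubeadm reset

# On the master: re-initialize with the /16 pod network CIDR
# from the flannel docs (the advertise address is from the question).
kubeadm init --apiserver-advertise-address=10.142.0.4 \
  --pod-network-cidr=10.244.0.0/16

# Re-apply flannel, then re-join each worker using the new
# `kubeadm join ...` command printed by init.
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
```

Note that the original `--pod-network-cidr=10.142.0.0/24` also overlaps the node's own address (10.142.0.4), which is another reason to pick a separate range for pods.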

0 votes

I ran into almost the same problem, and in the end found that the cause was a firewall that had not been turned off. You can try the following commands:

sudo ufw disable (on Ubuntu)

or

systemctl disable firewalld (on CentOS, which the question uses; add `systemctl stop firewalld` to stop it immediately)

or, to put SELinux into permissive mode (this is separate from the firewall, but the kubeadm install instructions call for it as well):

setenforce 0
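If disabling the firewall entirely is not an option, an alternative sketch is to open just the ports the kubeadm install docs list for worker nodes, plus flannel's VXLAN port, with firewalld:

```shell
# On each worker node, as root:
# kubelet API (needed for the control plane to reach the node)
firewall-cmd --permanent --add-port=10250/tcp
# NodePort services range
firewall-cmd --permanent --add-port=30000-32767/tcp
# flannel's default VXLAN backend uses UDP 8472
firewall-cmd --permanent --add-port=8472/udp
firewall-cmd --reload
```

The control-plane node needs additional ports (6443 for the API server, 2379-2380 for etcd, and so on); check the kubeadm installation guide for the full list for your version.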