
I am trying to create a new Kubernetes cluster with one master and one worker node. I have completed all the configuration on the master node using the kubeadm tool. All the control plane components are running on the master node, verified by checking the pod statuses:

coredns-6955765f44-xspkr           0/1     Pending   0          8d
etcd-master-1                      1/1     Running   1          8d
kube-apiserver-master-1            1/1     Running   1          8d
kube-controller-manager-master-1   1/1     Running   1          8d
kube-proxy-8z8qr                   1/1     Running   1          8d
kube-scheduler-master-1            1/1     Running   1          8d

After installing kubectl, kubeadm, kubelet and Docker on the worker node, I tried to add the node to the cluster by running the kubeadm join command with the token and discovery token hash, but I get the error below:

I0202 22:17:57.778406   28654 token.go:78] [discovery] Failed to request cluster info: [Get https://10.0.2.15:6443/api/v1/namespaces/kube-public/configmaps/cluster-info?timeout=10s: dial tcp 10.0.2.15:6443: connect: connection refused]

I can ping the master from the worker node. I also disabled the firewall, but even after that I am unable to join the cluster.
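For reference, ping only tests ICMP reachability; a TCP-level check of the API server port (6443, the kubeadm default) would show whether anything is actually listening there, which is what kubeadm join needs:

```shell
# Check TCP connectivity to the API server port from the worker node.
# ping succeeding only proves ICMP routing works; kubeadm join needs
# this TCP port to be reachable and have a listener behind it.
nc -vz 10.0.2.15 6443

# Alternatively, without netcat (the API server speaks TLS, hence -k):
curl -k https://10.0.2.15:6443/version
```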

Are there any prerequisites on the worker node apart from installing the components I mentioned above? Any help would be highly appreciated.

New Discovery

One interesting thing I just found is the enp0s3 IP address. Although I use the enp0s8 address to log in to the VMs, enp0s3 is the same on both the master and the worker node, which I suspect is causing the issue. When I generate a token using the kubeadm token create command on the master node, the join command it prints uses the enp0s3 IP (kubeadm join 10.0.2.15:6443), which is common to both the master and the worker node.

Master:

enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::b7:1fff:fe33:e924  prefixlen 64  scopeid 0x20<link>
        ether 02:b7:1f:33:e9:24  txqueuelen 1000  (Ethernet)
        RX packets 45801  bytes 50621300 (50.6 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12270  bytes 811968 (811.9 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

Worker:

enp0s3: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.0.2.15  netmask 255.255.255.0  broadcast 10.0.2.255
        inet6 fe80::b7:1fff:fe33:e924  prefixlen 64  scopeid 0x20<link>
        ether 02:b7:1f:33:e9:24  txqueuelen 1000  (Ethernet)
        RX packets 703  bytes 588444 (588.4 KB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 305  bytes 23784 (23.7 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0

I am not sure how these VMs got the same IP for enp0s3. Is there any way to overcome this issue?

"connection refused" means exactly that: the node could not connect to 10.0.2.15:6443. – zerkms

Did you run the kubeadm join command on the worker node immediately, or a few days after you created the cluster with kubeadm init on the master node? – Subramanian Manickam

1 Answer


It looks to me like you forgot to apply a network add-on before joining the worker node: the control plane components listed in your question do not include one, and CoreDNS stuck in Pending is exactly what you see until a Pod network is installed.

Follow these instructions to deploy a network add-on.

For example, to deploy Calico as the network add-on, do the following on your master node:

Calico is a networking and network policy provider. Calico supports a flexible set of networking options so you can choose the most efficient option for your situation, including non-overlay and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts, pods, and (if using Istio & Envoy) applications at the service mesh layer. Calico works on several architectures, including amd64, arm64, and ppc64le.

By default, Calico uses 192.168.0.0/16 as the Pod network CIDR, although this can be configured in the calico.yaml file. For Calico to work correctly, you need to pass the same CIDR to the kubeadm init command, either via the --pod-network-cidr=192.168.0.0/16 flag or in the kubeadm configuration.
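Assuming you end up re-initializing the control plane, the init invocation would look roughly like this (the master address is a placeholder for your reachable enp0s8 IP):

```shell
# Initialize the control plane with Calico's default Pod CIDR.
# --apiserver-advertise-address pins the API server to the address the
# worker can actually reach (substitute your master's enp0s8 address),
# so the printed join command does not use the shared enp0s3 IP.
sudo kubeadm init \
  --pod-network-cidr=192.168.0.0/16 \
  --apiserver-advertise-address=<master-enp0s8-ip>
```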

kubectl apply -f https://docs.projectcalico.org/v3.11/manifests/calico.yaml

Once a Pod network has been installed, you can confirm that it is working by checking that the CoreDNS Pod is Running in the output of kubectl get pods --all-namespaces. Once the CoreDNS Pod is up and running, you can continue by joining your nodes.
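Concretely, the check would be something like the following (k8s-app=kube-dns is the label CoreDNS pods carry in a kubeadm cluster):

```shell
# List all pods in all namespaces; CoreDNS should move from Pending to Running.
kubectl get pods --all-namespaces

# Or filter for the CoreDNS pods specifically:
kubectl -n kube-system get pods -l k8s-app=kube-dns
```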

If your network is not working or CoreDNS is not in the Running state, check out the Kubernetes troubleshooting docs.


Also remember to generate a new token for joining workers: tokens expire 24 hours after creation, and your master has been up for 8 days already.

As mentioned in the Kubernetes documentation, you can use the following command on your master node to generate a new token:

kubeadm token create
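A convenient variant prints the complete join command in one step, including the discovery token CA cert hash, so you don't have to assemble it by hand:

```shell
# Generate a fresh token and print the full `kubeadm join ...` command:
kubeadm token create --print-join-command

# List existing tokens with their expiry, to confirm the old one expired:
kubeadm token list
```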

Update:

When using VirtualBox VMs for Kubernetes, it is recommended to use bridged networking mode.

It is explained in detail in here.

There is an article about changing the networking mode to bridged here.

After changing the network mode, verify that the network interfaces of the VMs have updated as well; we don't want network address collisions.
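The verification is just a matter of comparing addresses on both machines; after the change, the two nodes must no longer share an IP:

```shell
# Run on both master and worker; the reported addresses must now differ.
ip -4 addr show

# Or inspect one interface specifically:
ip -4 addr show enp0s8
```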

After changing the network settings, the Kubernetes cluster needs its network configuration updated as well. If you haven't deployed anything in your cluster yet, this can be done by resetting the cluster and redoing the initialization. Note that this will delete everything that was set up on your cluster.

To do that, first run kubeadm reset, which performs a best-effort revert of the changes made to the host by kubeadm init or kubeadm join. Then run kubeadm init with the configuration adjusted for the new network settings.
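The sequence above would look roughly like this (the enp0s8 address is a placeholder for your master's new bridged IP):

```shell
# On the master: tear down the current cluster state...
sudo kubeadm reset

# ...then re-initialize, advertising the reachable address and the Pod
# CIDR expected by your network add-on (192.168.0.0/16 for Calico):
sudo kubeadm init \
  --apiserver-advertise-address=<master-enp0s8-ip> \
  --pod-network-cidr=192.168.0.0/16

# On the worker: reset as well, then rejoin using a freshly generated
# join command from the master (kubeadm token create --print-join-command).
sudo kubeadm reset
```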

Remember to install the Kubernetes network add-on before joining other worker nodes.

Hope this helps.