I'm just getting started with Kubernetes, and I cannot get pods running on different nodes to communicate with each other.

I set up a Kubernetes cluster with Calico networking on three AWS EC2 instances (one master and two workers, all with src/dest check disabled as described by the Calico website). Each instance uses the same Security Group, with all TCP/UDP/ICMP ports open for 10.0.0.0/8 and 192.168.0.0/16 to make sure there are no blocked ports inside my cluster.

I used a vanilla repo install:

~$ sudo apt-get install -y docker.io kubelet kubeadm kubectl
~$ sudo kubeadm init --pod-network-cidr=192.168.0.0/16  

and the basic Calico install:

~$ kubectl apply -f https://docs.projectcalico.org/manifests/calico.yaml

Then I joined the two worker nodes to the cluster:

~$ sudo kubeadm join <Master IP>:6443 --token <Token> --discovery-token-ca-cert-hash sha256:<cert hash>

Once up and running, I created three replicas for testing:

~$ kubectl run pingtest --image=busybox --replicas=3 -- sleep infinity

Two were scheduled on the first worker node and one on the second:

~$ kubectl get pod -l run=pingtest -o wide
NAME                        READY   STATUS    RESTARTS   AGE   IP               NODE              NOMINATED NODE   READINESS GATES
pingtest-7689dd958f-9mfgl   1/1     Running   0          15m   192.168.218.65   ip-10-78-31-198   <none>           <none>
pingtest-7689dd958f-l288v   1/1     Running   0          15m   192.168.218.66   ip-10-78-31-198   <none>           <none>
pingtest-7689dd958f-z2l97   1/1     Running   0          15m   192.168.237.65   ip-10-78-11-83    <none>           <none>

I opened a shell in the first pod:

~$ kubectl exec -ti pingtest-7689dd958f-9mfgl -- /bin/sh

When I ping a pod on the same node, everything works:

/ # ping 192.168.218.66 -c 2
PING 192.168.218.66 (192.168.218.66): 56 data bytes
64 bytes from 192.168.218.66: seq=0 ttl=63 time=0.105 ms
64 bytes from 192.168.218.66: seq=1 ttl=63 time=0.078 ms

But when I ping a pod on another node, there is no response:

/ # ping 192.168.237.65 -c 2
PING 192.168.237.65 (192.168.237.65): 56 data bytes

--- 192.168.237.65 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

What am I missing? What is preventing communication between the pods on different nodes?

Are the calico pods in the kube-system namespace running? - Arghya Sadhu
Yes. All four pods are running: calico-kube-controllers, calico-node-8nwjj, calico-node-cqmvt, calico-node-gw6qf. - Tim de Vries
Can you try calicoctl node status? It should provide a bit more detail about what might be wrong. Also check GitHub Issue #314. - Crou
calicoctl node status doesn't show any errors. It returns the nodes that are on the private AWS subnet 10.78.*.*, which seems correct. The link you provided seems to apply when the nodes have public IP addresses, which is not the case for me. Any other ideas? - Tim de Vries

1 Answer

I figured out the issue. It was with the AWS configuration and some extra work you have to do in that environment.
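
One way to see why AWS needs extra configuration is to check whether Calico is encapsulating pod-to-pod traffic in IP-in-IP (IP protocol 4), which TCP/UDP/ICMP Security Group rules do not cover. The sketch below assumes the mode is set through the CALICO_IPV4POOL_IPIP environment variable in the calico-node manifest, as I believe it is in the stock calico.yaml; the helper name ipip_mode is mine, not part of any tool.

```shell
# Print the value configured for CALICO_IPV4POOL_IPIP in a manifest read
# from stdin. Assumes the "value:" line follows the "name:" line, which is
# how the env entry is normally laid out.
ipip_mode() {
    awk '
        found && /value:/ { gsub(/"/, "", $2); print $2; exit }
        /name: CALICO_IPV4POOL_IPIP/ { found = 1 }
    '
}

# Example use against a live cluster (not run here):
#   kubectl -n kube-system get daemonset calico-node -o yaml | ipip_mode
```

If this prints something other than Never, traffic between nodes is probably leaving the instance as protocol 4 packets rather than plain TCP, UDP or ICMP.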

  1. All three AWS EC2 instances must have src/dest check disabled (as described by the Calico website).
  2. For the Security Group covering your AWS instances, you must add a rule of type Custom Protocol (not Custom TCP or Custom UDP), select 4 (IP in IP) in the protocol column, and choose the subnets covering your instances (e.g. 10.0.0.0/8, 192.168.0.0/16). After that, you can use curl to reach your Pod and Service IPs.
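
The same two steps can probably be scripted with the AWS CLI instead of the console. This is a sketch under assumptions: the Security Group ID and instance IDs are placeholders I made up, and the script only prints the commands unless DRY_RUN=0 is set. Check the flags against your AWS CLI version before running it for real.

```shell
#!/bin/sh
# Placeholders (not from my setup): replace with your own IDs.
SG_ID="${SG_ID:-sg-0123456789abcdef0}"
INSTANCE_IDS="${INSTANCE_IDS:-i-aaaa i-bbbb i-cccc}"
CIDRS="10.0.0.0/8 192.168.0.0/16"
# DRY_RUN=1 (the default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

aws_fix_calico() {
    # Step 1: disable the source/destination check on every instance.
    # (Word splitting on the ID list is intentional.)
    for id in $INSTANCE_IDS; do
        run aws ec2 modify-instance-attribute \
            --instance-id "$id" --no-source-dest-check
    done
    # Step 2: allow IP protocol 4 (IP in IP) from the cluster subnets.
    for cidr in $CIDRS; do
        run aws ec2 authorize-security-group-ingress \
            --group-id "$SG_ID" --protocol 4 --cidr "$cidr"
    done
}

aws_fix_calico
```

With the defaults this prints five aws commands, which makes it easy to review them before setting DRY_RUN=0.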