How do I troubleshoot this problem?
I have a manual setup of Kubernetes that uses coredns as the cluster-internal DNS. I deployed a busybox pod to run nslookup on kubernetes.default.
The lookup fails with the message nslookup: can't resolve 'kubernetes.default'. To get more insight into what happens during the lookup, I used tcpdump to capture the network traffic leaving my busybox pod. The capture shows that my pod reaches the coredns pod successfully, but the coredns pod fails to connect back:
10:25:53.328153 IP 10.200.0.29.49598 > 10.32.0.10.domain: 2+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:25:53.328393 IP 10.200.0.30.domain > 10.200.0.29.49598: 2* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:25:53.328410 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 49598 unreachable, length 129
10:25:58.328516 IP 10.200.0.29.50899 > 10.32.0.10.domain: 3+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:25:58.328738 IP 10.200.0.30.domain > 10.200.0.29.50899: 3* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:25:58.328752 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 50899 unreachable, length 129
10:25:58.343205 ARP, Request who-has 10.200.0.1 tell 10.200.0.29, length 28
10:25:58.343217 ARP, Reply 10.200.0.1 is-at 0a:58:0a:c8:00:01 (oui Unknown), length 28
10:25:58.351250 ARP, Request who-has 10.200.0.29 tell 10.200.0.30, length 28
10:25:58.351250 ARP, Request who-has 10.200.0.30 tell 10.200.0.29, length 28
10:25:58.351261 ARP, Reply 10.200.0.29 is-at 0a:58:0a:c8:00:1d (oui Unknown), length 28
10:25:58.351262 ARP, Reply 10.200.0.30 is-at 0a:58:0a:c8:00:1e (oui Unknown), length 28
10:26:03.331409 IP 10.200.0.29.45823 > 10.32.0.10.domain: 4+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:26:03.331618 IP 10.200.0.30.domain > 10.200.0.29.45823: 4* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:26:03.331631 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 45823 unreachable, length 129
10:26:08.348259 IP 10.200.0.29.43332 > 10.32.0.10.domain: 5+ PTR? 10.0.32.10.in-addr.arpa. (41)
10:26:08.348492 IP 10.200.0.30.domain > 10.200.0.29.43332: 5* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)
10:26:08.348506 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 43332 unreachable, length 129
10:26:13.353491 IP 10.200.0.29.55715 > 10.32.0.10.domain: 6+ AAAA? kubernetes.default. (36)
10:26:13.354955 IP 10.200.0.30.domain > 10.200.0.29.55715: 6 NXDomain* 0/0/0 (36)
10:26:13.354971 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 55715 unreachable, length 72
10:26:18.354285 IP 10.200.0.29.57421 > 10.32.0.10.domain: 7+ AAAA? kubernetes.default. (36)
10:26:18.355533 IP 10.200.0.30.domain > 10.200.0.29.57421: 7 NXDomain* 0/0/0 (36)
10:26:18.355550 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 57421 unreachable, length 72
10:26:23.359405 IP 10.200.0.29.44332 > 10.32.0.10.domain: 8+ AAAA? kubernetes.default. (36)
10:26:23.361155 IP 10.200.0.30.domain > 10.200.0.29.44332: 8 NXDomain* 0/0/0 (36)
10:26:23.361171 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 44332 unreachable, length 72
10:26:23.367220 ARP, Request who-has 10.200.0.30 tell 10.200.0.29, length 28
10:26:23.367232 ARP, Reply 10.200.0.30 is-at 0a:58:0a:c8:00:1e (oui Unknown), length 28
10:26:23.370352 ARP, Request who-has 10.200.0.1 tell 10.200.0.29, length 28
10:26:23.370363 ARP, Reply 10.200.0.1 is-at 0a:58:0a:c8:00:01 (oui Unknown), length 28
10:26:28.367698 IP 10.200.0.29.48446 > 10.32.0.10.domain: 9+ AAAA? kubernetes.default. (36)
10:26:28.369133 IP 10.200.0.30.domain > 10.200.0.29.48446: 9 NXDomain* 0/0/0 (36)
10:26:28.369149 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 48446 unreachable, length 72
10:26:33.381266 IP 10.200.0.29.50714 > 10.32.0.10.domain: 10+ A? kubernetes.default. (36)
10:26:33.382745 IP 10.200.0.30.domain > 10.200.0.29.50714: 10 NXDomain* 0/0/0 (36)
10:26:33.382762 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 50714 unreachable, length 72
10:26:38.386288 IP 10.200.0.29.39198 > 10.32.0.10.domain: 11+ A? kubernetes.default. (36)
10:26:38.388635 IP 10.200.0.30.domain > 10.200.0.29.39198: 11 NXDomain* 0/0/0 (36)
10:26:38.388658 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 39198 unreachable, length 72
10:26:38.395241 ARP, Request who-has 10.200.0.29 tell 10.200.0.30, length 28
10:26:38.395248 ARP, Reply 10.200.0.29 is-at 0a:58:0a:c8:00:1d (oui Unknown), length 28
10:26:43.389355 IP 10.200.0.29.46495 > 10.32.0.10.domain: 12+ A? kubernetes.default. (36)
10:26:43.391522 IP 10.200.0.30.domain > 10.200.0.29.46495: 12 NXDomain* 0/0/0 (36)
10:26:43.391539 IP 10.200.0.2
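To make the capture above easier to read, here is a minimal sketch that pairs each DNS query with its reply by client port and reports replies whose source address differs from the address the query was sent to. The function name find_mismatched_replies and the regular expression are my own illustration, not part of any tool, and they only cover the tcpdump line shape shown above. In this capture the queries are addressed to 10.32.0.10 while the replies arrive from 10.200.0.30, and each reply is followed by an ICMP "udp port unreachable" from the busybox pod.

```python
import re

# Matches tcpdump text lines shaped like "TIME IP SRC.PORT > DST.PORT: ...".
# ARP lines, ICMP lines (no port) and truncated lines do not match.
LINE = re.compile(
    r"^\S+ IP (?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\w+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\w+): "
)


def find_mismatched_replies(lines):
    """Return (client, queried_server, replying_server) tuples for every
    DNS reply that comes from a different address than the one queried."""
    pending = {}  # (client_ip, client_port) -> server address the query went to
    mismatches = []
    for line in lines:
        match = LINE.match(line)
        if match is None:
            continue
        src, sport = match.group("src"), match.group("sport")
        dst, dport = match.group("dst"), match.group("dport")
        if dport == "domain":
            # Outgoing query: remember which server address was asked.
            pending[(src, sport)] = dst
        elif sport == "domain":
            # Incoming reply: compare its source with the queried address.
            queried = pending.pop((dst, dport), None)
            if queried is not None and queried != src:
                mismatches.append((dst, queried, src))
    return mismatches


if __name__ == "__main__":
    # First three lines of the capture above.
    capture = [
        "10:25:53.328153 IP 10.200.0.29.49598 > 10.32.0.10.domain: 2+ PTR? 10.0.32.10.in-addr.arpa. (41)",
        "10:25:53.328393 IP 10.200.0.30.domain > 10.200.0.29.49598: 2* 1/0/0 PTR kube-dns.kube-system.svc.cluster.local. (93)",
        "10:25:53.328410 IP 10.200.0.29 > 10.200.0.30: ICMP 10.200.0.29 udp port 49598 unreachable, length 129",
    ]
    for client, queried, replied in find_mismatched_replies(capture):
        print(f"{client} asked {queried} but was answered by {replied}")
```

Run on the first three lines of the capture, this prints one line: 10.200.0.29 asked 10.32.0.10 but was answered by 10.200.0.30.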
Cluster Infrastructure
NAMESPACE     NAME             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
default       deploy/busybox   1         1         1            1           1h
kube-system   deploy/coredns   1         1         1            1           17h

NAMESPACE     NAME                    DESIRED   CURRENT   READY   AGE
default       rs/busybox-56db8bd9d7   1         1         1       1h
kube-system   rs/coredns-b8d4b46c8    1         1         1       17h

NAMESPACE     NAME             DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
default       deploy/busybox   1         1         1            1           1h
kube-system   deploy/coredns   1         1         1            1           17h

NAMESPACE     NAME                    DESIRED   CURRENT   READY   AGE
default       rs/busybox-56db8bd9d7   1         1         1       1h
kube-system   rs/coredns-b8d4b46c8    1         1         1       17h

NAMESPACE     NAME                          READY   STATUS    RESTARTS   AGE
default       po/busybox-56db8bd9d7-fv7np   1/1     Running   2          1h
kube-system   po/coredns-b8d4b46c8-6tg5d    1/1     Running   2          17h

NAMESPACE     NAME             TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
default       svc/kubernetes   ClusterIP   10.32.0.1    <none>        443/TCP                  22h
kube-system   svc/kube-dns     ClusterIP   10.32.0.10   <none>        53/UDP,53/TCP,9153/TCP   17h
Busybox IP
kubectl describe pod busybox-56db8bd9d7-fv7np | grep IP
IP: 10.200.0.29
Endpoints, to see the DNS IP and port
kubectl get endpoints --all-namespaces
NAMESPACE     NAME                      ENDPOINTS                                        AGE
default       kubernetes                192.168.0.218:6443                               22h
kube-system   kube-controller-manager   <none>                                           22h
kube-system   kube-dns                  10.200.0.30:9153,10.200.0.30:53,10.200.0.30:53   2h
kube-system   kube-scheduler            <none>                                           22h