I have installed Kubernetes (1.17.3) on a single server (not a VM) with flannel (v0.11.0-amd64) using kubeadm. I then installed Grafana and Prometheus and can access both via NodePort at http://<serverip>:31000

Now, when I try to access the Prometheus service from Grafana, it gives the error: Could not resolve host: prometheus-server

I started troubleshooting and performed the following steps:

  • Verified that the pod CIDR is configured

    kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}'
    10.244.0.0/24

  • Ran curl against the IPs and the DNS name of the service

    # curl 10.244.0.33:9090
    <a href="/prometheus/graph">Found</a>

    # curl 10.109.215.27:9090
    <a href="/prometheus/graph">Found</a>

    # curl http://prometheus-server:9090
    curl: (6) Could not resolve host: prometheus-server; Unknown error

  • My /etc/resolv.conf was empty, so I added the entry below, but still no success

    search cluster.local
    nameserver <IP of Server>

  • The following is the output of the CoreDNS logs

    kubectl logs -f coredns-6955765f44-cnhtz -n kube-system
    .:53
    [INFO] plugin/reload: Running configuration MD5 = 4e235fcc3696966e76816bcd9034ebc7
    CoreDNS-1.6.5
    linux/amd64, go1.13.4, c2fd1b2
    [ERROR] plugin/errors: 2 2339874627451903403.2757028323724952357. HINFO: read udp 10.244.0.13:38879->8.8.4.4:53: read: no route to host
    [ERROR] plugin/errors: 2 2339874627451903403.2757028323724952357. HINFO: read udp 10.244.0.13:53266->8.8.4.4:53: i/o timeout
    [ERROR] plugin/errors: 2 2339874627451903403.2757028323724952357. HINFO: read udp 10.244.0.13:37289->8.8.8.8:53: i/o timeout
    [ERROR] plugin/errors: 2 2339874627451903403.2757028323724952357. HINFO: read udp 10.244.0.13:44281->8.8.4.4:53: read: no route to host
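
For comparison with the /etc/resolv.conf entry above, here is a small sketch of the resolver configuration a pod normally receives under the default ClusterFirst DNS policy. The function name render_pod_resolv_conf is hypothetical, and 10.96.0.10 is only the usual kubeadm default for the cluster DNS Service IP, so treat both as assumptions; the real IP can be read with kubectl get svc kube-dns -n kube-system.

```shell
# Print the resolv.conf a pod in namespace $1 would typically get.
# $2: cluster DNS Service IP (assumed kubeadm default: 10.96.0.10)
# $3: cluster domain (assumed default: cluster.local)
render_pod_resolv_conf() {
    ns="$1"
    dns_ip="${2:-10.96.0.10}"
    domain="${3:-cluster.local}"
    printf 'nameserver %s\n' "$dns_ip"
    printf 'search %s.svc.%s svc.%s %s\n' "$ns" "$domain" "$domain" "$domain"
    printf 'options ndots:5\n'
}

# Namespace taken from the pod listing in this question.
render_pod_resolv_conf monitoring-logging
```

With this search list the short name prometheus-server expands to prometheus-server.monitoring-logging.svc.cluster.local. A bare "search cluster.local" would instead try prometheus-server.cluster.local, which is not a Service name, and the nameserver would have to be the cluster DNS Service rather than the node IP for the lookup to reach CoreDNS at all.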

Update1:

In response to @KoopaKiller: I ran curl http://prometheus-server:9090 from the host and from the grafana pod (from the grafana pod it does not even respond on the IPs). I installed Prometheus and Grafana from manifests, and both are in the same namespace.

kubectl get pods -A
NAMESPACE              NAME                                             READY   STATUS          
kube-system            coredns-6955765f44-cnhtz                         1/1     Running         
kube-system            coredns-6955765f44-d9wrj                         1/1     Running         
kube-system            kube-flannel-ds-amd64-rbsbv                      1/1     Running         
kube-system            kube-proxy-nblnq                                 1/1     Running         
monitoring-logging     grafana-b57ccddf9-p7w2q                          1/1     Running                 
monitoring-logging     prometheus-server-65d7dc7999-frd8k               2/2     Running 

One more thing I observed in the CoreDNS events is a missing file, "/run/flannel/subnet.env". The file is available, but it looks like it gets recreated on every reboot and CoreDNS finds it very late.

Events:
  Type     Reason                   Message
  ----     ------                   -------
  Warning  FailedCreatePodSandBox   Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "d69af6411310ae3c4865a3ddce0667a40092b0dcf55eb5f8ddb642e503dcc0c5" network for pod "coredns-6955765f44-d9wrj": networkPlugin cni failed to set up pod "coredns-6955765f44-d9wrj_kube-system" network: open /run/flannel/subnet.env: no such file or directory
  Warning  FailedCreatePodSandBox   Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "b6199b3ce4a769c0ccfef6f247763beb1ca0231de52f6309d2b2f122844746ee" network for pod "coredns-6955765f44-d9wrj": networkPlugin cni failed to set up pod "coredns-6955765f44-d9wrj_kube-system" network: open /run/flannel/subnet.env: no such file or directory
  Normal   SandboxChanged           Pod sandbox changed, it will be killed and re-created.
  Warning  FailedCreatePodSandBox   Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "097dbf97858d8ea3510e8337eb9b0bc8baf966ab51a2a56971e8ae54c5b516a6" network for pod "coredns-6955765f44-d9wrj": networkPlugin cni failed to set up pod "coredns-6955765f44-d9wrj_kube-system" network: open /run/flannel/subnet.env: no such file or directory
  Normal   Pulled                   Container image "k8s.gcr.io/coredns:1.6.5" already present on machine
  Normal   Created                  Created container coredns
  Normal   Started                  Started container coredns

Update2: I followed the link to debug DNS, and it returns results for

kubectl exec -ti dnsutils -- nslookup kubernetes.default
kubectl exec dnsutils cat /etc/resolv.conf
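
A minimal sketch of how the same check could be repeated for a Service under each of its name forms. The function name lookup_service and the KUBECTL override are hypothetical helpers, and the sketch assumes the dnsutils pod runs in the default namespace, as in the example manifest, so the short name is only expected to resolve when the Service lives in that namespace too.

```shell
# Command to call; overridable so the loop can be tried without a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Look up a Service under its short, namespace-qualified, and fully
# qualified names from inside the dnsutils pod.
# $1: Service name, $2: namespace, $3: cluster domain (assumed cluster.local)
lookup_service() {
    svc="$1"
    ns="$2"
    domain="${3:-cluster.local}"
    for name in "$svc" "$svc.$ns" "$svc.$ns.svc.$domain"; do
        echo "== $name"
        "$KUBECTL" exec dnsutils -- nslookup "$name" || echo "lookup failed: $name"
    done
}

# Example call (names taken from this question):
# lookup_service prometheus-server monitoring-logging
```

If all three forms fail while kubernetes.default resolves, the Service name or namespace is the likelier problem; if nothing resolves, the queries are probably not reaching CoreDNS.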

Then I added the log plugin to the CoreDNS configuration and realized that no DNS queries were being received by CoreDNS. I disabled firewalld and everything started working as expected. But why does it not work with firewalld? My open ports are the following, and they include the flannel ports too:

firewall-cmd --list-ports
6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 30000-32767/tcp 8080/tcp 8443/tcp 8285/udp 8472/udp 502/tcp
Did you pass --pod-network-cidr=10.244.0.0/16 while creating the cluster? - hoque
@hoque Yes, that's why I shared the output of kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' - ImranRazaKhan
From where are you trying to run the command curl http://prometheus-server:9090? How did you install Prometheus and Grafana? Are both in the same namespace or in different namespaces? Are your CoreDNS pods healthy? Please run kubectl get pods -A and see if any pod is not running. - Mr.KoopaKiller
@KoopaKiller I updated the question with Update1. - ImranRazaKhan
Let's try to debug your DNS resolution. Could you create a dnsutils pod with this command: kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml, and after that try to reach Prometheus and Grafana using the command kubectl exec -ti dnsutils -- nslookup <service>? Please post the results. See the documentation page here. - Mr.KoopaKiller

1 Answer

To make it work without disabling firewalld, I had to add the rule below, after which everything started working with DNS names:

firewall-cmd --add-masquerade --permanent
firewall-cmd --reload
systemctl restart firewalld
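
The same change can be wrapped so it is safe to re-run, for example from a provisioning script. This is only a sketch: ensure_masquerade and the FIREWALL_CMD override are hypothetical names, and it assumes the rule belongs in firewalld's default zone, as in the commands above.

```shell
# Command to call; overridable so the logic can be exercised without firewalld.
FIREWALL_CMD="${FIREWALL_CMD:-firewall-cmd}"

# Enable masquerading in the permanent configuration only if it is not
# already set, then reload so the runtime configuration picks it up.
ensure_masquerade() {
    if "$FIREWALL_CMD" --permanent --query-masquerade >/dev/null 2>&1; then
        echo "masquerade already enabled"
    else
        "$FIREWALL_CMD" --permanent --add-masquerade >/dev/null &&
            "$FIREWALL_CMD" --reload >/dev/null &&
            echo "masquerade enabled"
    fi
}

# Example call (needs root on the node):
# ensure_masquerade
```

The --query-masquerade option exits with status 0 when masquerading is enabled, which is what the if test relies on.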

I got the hint from the link below, but I will look into why it is needed in more detail.

How can I use Flannel without disabing firewalld (Kubernetes)