4
votes

I am trying to use the Kubernetes 1.7.12 fluentd-elasticsearch addon: https://github.com/kubernetes/kubernetes/tree/v1.7.12/cluster/addons/fluentd-elasticsearch

ElasticSearch starts up and can respond with:

{
 "name" : "0322714ad5b7",
 "cluster_name" : "kubernetes-logging",
 "cluster_uuid" : "_na_",
 "version" : {
   "number" : "2.4.1",
   "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
   "build_timestamp" : "2016-09-27T18:57:55Z",
   "build_snapshot" : false,
   "lucene_version" : "5.5.2"
 },
 "tagline" : "You Know, for Search"
}

But Kibana is still unable to connect to it. The connection error starts out with:

{"type":"log","@timestamp":"2018-01-23T07:42:06Z","tags":["warning","elasticsearch"],"pid":6,"message":"Unable to revive connection: http://elasticsearch-logging:9200/"}
{"type":"log","@timestamp":"2018-01-23T07:42:06Z","tags":["warning","elasticsearch"],"pid":6,"message":"No living connections"}

And after ElasticSearch is up, the error changes to:

{"type":"log","@timestamp":"2018-01-23T07:42:08Z","tags":["status","plugin:elasticsearch@1.0.0","error"],"pid":6,"state":"red","message":"Status changed from red to red - Service Unavailable","prevState":"red","prevMsg":"Unable to connect to Elasticsearch at http://elasticsearch-logging:9200."}

So it seems that Kibana is finally able to get a response from ElasticSearch, but it still cannot establish a usable connection.

This is what the Kibana dashboard looks like: (screenshot not included)

I tried to get the logs to output more information, but do not have enough knowledge about Kibana and ElasticSearch to know what else I can try next.

I am able to reproduce the error locally using this docker-compose.yml:

version: '2'
services:
 elasticsearch-logging:
   image: gcr.io/google_containers/elasticsearch:v2.4.1-2
   ports:
     - "9200:9200"
     - "9300:9300"

 kibana-logging:
   image: gcr.io/google_containers/kibana:v4.6.1-1
   ports:
     - "5601:5601"
   depends_on:
     - elasticsearch-logging
   environment:
     - ELASTICSEARCH_URL=http://elasticsearch-logging:9200

Judging by this question (Kibana on Docker cannot connect to Elasticsearch) and this blog post (https://gunith.github.io/docker-kibana-elasticsearch/), there shouldn't be much involved in getting the two to talk to each other.

But I can't figure out what I'm missing.

Any ideas what else I might be able to try?

Thank you for your time. :)

Update 1:

Curling http://elasticsearch-logging on the Kubernetes cluster resulted in the same output:

{
  "name" : "elasticsearch-logging-v1-68km4",
  "cluster_name" : "kubernetes-logging",
  "cluster_uuid" : "_na_",
  "version" : {
    "number" : "2.4.1",
    "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16",
    "build_timestamp" : "2016-09-27T18:57:55Z",
    "build_snapshot" : false,
    "lucene_version" : "5.5.2"
  },
  "tagline" : "You Know, for Search"
}

Curling http://elasticsearch-logging/_cat/indices?pretty on the Kubernetes cluster timed out because of a proxy rule. Using the docker-compose.yml and curling locally (e.g. curl localhost:9200/_cat/indices?pretty) results in:

{
  "error" : {
    "root_cause" : [ {
      "type" : "master_not_discovered_exception",
      "reason" : null
    } ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

The docker-compose logs show:

[2018-01-23 17:04:39,110][DEBUG][action.admin.cluster.state] [ac1f2a13a637] no known master node, scheduling a retry

[2018-01-23 17:05:09,112][DEBUG][action.admin.cluster.state] [ac1f2a13a637] timed out while retrying [cluster:monitor/state] after failure (timeout [30s])
[2018-01-23 17:05:09,116][WARN ][rest.suppressed          ] path: /_cat/indices, params: {pretty=}
MasterNotDiscoveredException[null]
     at org.elasticsearch.action.support.master.TransportMasterNodeAction$AsyncSingleAction$5.onTimeout(TransportMasterNodeAction.java:234)
     at org.elasticsearch.cluster.ClusterStateObserver$ObserverClusterStateListener.onTimeout(ClusterStateObserver.java:236)
     at org.elasticsearch.cluster.service.InternalClusterService$NotifyTimeout.run(InternalClusterService.java:804)
     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
     at java.lang.Thread.run(Thread.java:745)

Update 2: Running kubectl --namespace kube-system logs -c kubedns po/kube-dns-667321983-dt5lz --tail 50 --follow yields:

I0124 16:43:33.591112       5 dns.go:264] New service: kibana-logging
I0124 16:43:33.591225       5 dns.go:264] New service: nginx
I0124 16:43:33.591251       5 dns.go:264] New service: registry
I0124 16:43:33.591274       5 dns.go:264] New service: sudoe
I0124 16:43:33.591295       5 dns.go:264] New service: default-http-backend
I0124 16:43:33.591317       5 dns.go:264] New service: kube-dns
I0124 16:43:33.591344       5 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591369       5 dns.go:462] Added SRV record &{Host:kube-dns.kube-system.svc.cluster.local. Port:53 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591390       5 dns.go:264] New service: kubernetes
I0124 16:43:33.591409       5 dns.go:462] Added SRV record &{Host:kubernetes.default.svc.cluster.local. Port:443 Priority:10 Weight:10 Text: Mail:false Ttl:30 TargetStrip:0 Group: Key:}
I0124 16:43:33.591429       5 dns.go:264] New service: elasticsearch-logging

Update 3:

I'm still trying to get everything to work, but with the help of others, I am confident it is an RBAC issue. I'm not completely sure, but it looks like the Elasticsearch nodes were not able to connect to the master (which I never knew was even needed) because of missing permissions.

Here are some steps that helped, in case they help others starting out:

With RBAC on:

# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system logs po/elasticsearch-logging-v1-wkwcs
F0119 00:18:44.285773       9 elasticsearch_logging_discovery.go:60] kube-system namespace doesn't exist: User "system:serviceaccount:kube-system:default" cannot get namespaces in the namespace "kube-system". (get namespaces kube-system)
goroutine 1 [running]:
k8s.io/kubernetes/vendor/github.com/golang/glog.stacks(0x1f7f600, 0xc400000000, 0xee, 0x1b2)
        vendor/github.com/golang/glog/glog.go:766 +0xa5
k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).output(0x1f5f5c0, 0xc400000003, 0xc42006c300, 0x1ef20c8, 0x22, 0x3c, 0x0)
        vendor/github.com/golang/glog/glog.go:717 +0x337
k8s.io/kubernetes/vendor/github.com/golang/glog.(*loggingT).printf(0x1f5f5c0, 0xc400000003, 0x16949d6, 0x1e, 0xc420579ee8, 0x2, 0x2)
        vendor/github.com/golang/glog/glog.go:655 +0x14c
k8s.io/kubernetes/vendor/github.com/golang/glog.Fatalf(0x16949d6, 0x1e, 0xc420579ee8, 0x2, 0x2)
        vendor/github.com/golang/glog/glog.go:1145 +0x67
main.main()
        cluster/addons/fluentd-elasticsearch/es-image/elasticsearch_logging_discovery.go:60 +0xb53
[2018-01-19 00:18:45,273][INFO ][node                     ] [elasticsearch-logging-v1-wkwcs] version[2.4.1], pid[5], build[c67dc32/2016-09-27T18:57:55Z]
[2018-01-19 00:18:45,275][INFO ][node                     ] [elasticsearch-logging-v1-wkwcs] initializing ...
# kubectl --kubeconfig kubeconfig.yaml --namespace kube-system exec kibana-logging-2104905774-69wgv curl elasticsearch-logging.kube-system:9200/_cat/indices?pretty

{
  "error" : {
    "root_cause" : [ {
      "type" : "master_not_discovered_exception",
      "reason" : null
    } ],
    "type" : "master_not_discovered_exception",
    "reason" : null
  },
  "status" : 503
}

With RBAC off:

#  kubectl --kubeconfig kubeconfig.yaml --namespace kube-system log elasticsearch-logging-v1-7shgk
[2018-01-26 01:19:52,294][INFO ][node                     ] [elasticsearch-logging-v1-7shgk] version[2.4.1], pid[5], build[c67dc32/2016-09-27T18:57:55Z]
[2018-01-26 01:19:52,294][INFO ][node                     ] [elasticsearch-logging-v1-7shgk] initializing ...
[2018-01-26 01:19:53,077][INFO ][plugins                  ] [elasticsearch-logging-v1-7shgk] modules [reindex, lang-expression, lang-groovy], plugins [], sites []
#  kubectl --kubeconfig kubeconfig.yaml --namespace kube-system exec elasticsearch-logging-v1-7shgk curl http://elasticsearch-logging:9200/_cat/indices?pretty
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    40  100    40    0     0      2      0  0:00:20  0:00:15  0:00:05    10
green open .kibana 1 1 1 0 6.2kb 3.1kb 
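
The RBAC-on log above names the subject that was denied ("system:serviceaccount:kube-system:default") and the action it was denied (get namespaces). As a rough sketch only, the helpers below pull that subject out of such a log line and print kubectl commands that could be used to check and grant the permission. The role name es-logging-discovery is made up for illustration, and the list of resources (namespaces, services, endpoints) is an assumption about what the discovery helper reads, not something confirmed by the logs beyond "namespaces".

```shell
# Extract the denied subject from an RBAC 'User "..." cannot ...' log line on stdin.
rbac_denied_subject() {
  sed -n 's/.*User "\([^"]*\)" cannot .*/\1/p' | head -n 1
}

# Print (do not run) kubectl commands to verify and grant read access for a
# subject of the form system:serviceaccount:<namespace>:<name>.
# "es-logging-discovery" is a hypothetical name; the resource list is an assumption.
suggest_rbac_fix() {
  subject=$1
  rest=${subject#system:serviceaccount:}   # -> <namespace>:<name>
  ns=${rest%%:*}                           # -> <namespace>
  sa=${rest#*:}                            # -> <name>
  echo "kubectl auth can-i get namespaces --as=$subject"
  echo "kubectl create clusterrole es-logging-discovery --verb=get --resource=namespaces,services,endpoints"
  echo "kubectl create clusterrolebinding es-logging-discovery --clusterrole=es-logging-discovery --serviceaccount=$ns:$sa"
}
```

For example, piping the pod log through rbac_denied_subject and passing the result to suggest_rbac_fix prints the three commands for review before anything is applied. Binding the role to a dedicated service account rather than to default would be tighter, but that requires changing the pod spec as well.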

Thanks everyone for your help :)

Are elasticsearch and kibana deployed in the same namespace? Could you access the kibana container via a command line and launch some debugging commands? - whites11

@whites11, yes they are deployed to the same namespace, kube-system. I can do something like kubectl exec -it po/podname. Is that what you mean? What kind of debugging commands can I run? - Zhao Li

Yeah, that's what I mean. Try running curl http://elasticsearch-logging:9200 from the kibana pod. - whites11

I'll try it on the kubernetes cluster tomorrow, but when I run it in the container using the docker-compose.yml, I get this: { "name" : "0322714ad5b7", "cluster_name" : "kubernetes-logging", "cluster_uuid" : "_na_", "version" : { "number" : "2.4.1", "build_hash" : "c67dc32e24162035d18d6fe1e952c4cbcbe79d16", "build_timestamp" : "2016-09-27T18:57:55Z", "build_snapshot" : false, "lucene_version" : "5.5.2" }, "tagline" : "You Know, for Search" } - Zhao Li

OK, and what does curl http://elasticsearch-logging:9200/_cat/indices?pretty say? - whites11

2 Answers

3
votes

A few troubleshooting tips:

1) Ensure ElasticSearch is running.

Enter the container running elasticsearch and run:

curl localhost:9200

You should get a JSON response with some information about the ElasticSearch node.

2) Ensure ElasticSearch is reachable from the Kibana container.

Enter the kibana container and run:

curl <elasticsearch_service_name>:9200

You should get the same output as above.

3) Ensure your ES indices are fine.

Run the following command from the elasticsearch container:

curl localhost:9200/_cat/indices?pretty

You should get a table listing all indices in your ES cluster and their status (which should be green, or yellow if you only have one ES replica).

If any of the above checks fails, look in the logs of your ES container for error messages and resolve them.
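
The three checks above can be scripted. This is only a sketch: classify_es_response and check_es are hypothetical helper names, and the labels they print are arbitrary. The classifier looks for the strings that appear in the outputs quoted in the question (the "tagline" banner, master_not_discovered_exception, and the " open " column of the _cat/indices table). The 40 second curl timeout is chosen to outlast the 30 second master-discovery timeout seen in the ES log.

```shell
# Classify an Elasticsearch HTTP response body read from stdin.
classify_es_response() {
  body=$(cat)
  case $body in
    *master_not_discovered_exception*) echo "no-master" ;;      # node is up, no master elected
    *'"tagline"'*)                     echo "reachable" ;;      # root banner JSON
    *" open "*|*" close "*)            echo "indices-listed" ;; # _cat/indices table
    *)                                 echo "unknown" ;;        # empty or unexpected body
  esac
}

# Fetch one URL and classify the response.
check_es() {
  curl -s --max-time 40 "$1" | classify_es_response
}
```

Run check_es localhost:9200 and check_es "localhost:9200/_cat/indices?pretty" inside the ES container, and the same with the service name from inside the Kibana container. A result of "reachable" for the first and "no-master" for the second matches the situation described in the question.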

1
votes

This exception points to one of two misconfigurations:

1. The Kubernetes DNS addon is not working properly. Check your DNS addon logs.
2. Pod-to-pod communication is not working properly. This depends on your underlying SDN/CNI addon (for example Flannel or Calico).

You can check by pinging one pod from another pod. If that does not work, then check your networking configuration, especially the kube-proxy component.
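
For the DNS check, it can help to resolve the fully qualified service name rather than the short one. The sketch below builds that name in the same <service>.<namespace>.svc.cluster.local form that appears in the kube-dns log in the question. service_fqdn is a hypothetical helper, and cluster.local is assumed as the cluster domain unless a third argument overrides it.

```shell
# Build the in-cluster DNS name for a service.
# Usage: service_fqdn <service> <namespace> [cluster-domain]
service_fqdn() {
  echo "$1.$2.svc.${3:-cluster.local}"
}

# Example (not run here; assumes nslookup exists in the Kibana image):
#   kubectl --namespace kube-system exec <kibana-pod> -- \
#     nslookup "$(service_fqdn elasticsearch-logging kube-system)"
```

If the fully qualified name resolves but the short name does not, the problem is more likely the pod's DNS search path than the DNS addon itself.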