I've been trying to use the lovely ansible-elasticsearch project to set up a nine-node Elasticsearch cluster.
Each node is up and running... but they are not communcating with each other. The master nodes think there are zero data nodes. The data nodes are not connecting to the master nodes.
They all have the same cluster.name. I have tried with multicast enabled (discovery.zen.ping.multicast.enabled: true) and disabled (previous setting to false, and discovery.zen.ping.unicast.hosts:["host1","host2",..."host9"]) but in either case the nodes are not communicating.
They have network connectivity to one another - verified via telnet over port 9300.
Sample output:
$ curl host1:9200/_cluster/health
{"error":{"root_cause":[{"type":"master_not_discovered_exception","reason":"waited for [30s]"}],"type":"master_not_discovered_exception","reason":"waited for [30s]"},"status":503}
I cannot think of any more reasons why they wouldn't connect - looking for any more ideas of what to try.
Edit: I finally resolved this issue. The settings that worked were publish_host to "_non_loopback:ipv4_" and unicast with discovery.zen.ping.unicast.hosts set to ["host1:9300","host2:9300","host3:9300"] - listing only the dedicated master nodes. I have a minimum master node count of 2.