My objective is to create a elasticsearch cluster in AWS using EC2 discovery.
I have 3 instances each running elasticsearch. I have provided each instance a IAM role which allows them to describe ec2 data. Each instance is inside the security group "sec-group-elasticsearch"
The nodes start but do not find each other (logs below).
I can telnet from one node to another using private dns and port 9300.
Reference
eg. telnet from node A->B works and B->A works.
telnet ip-xxx-xxx-xx-xxx.vpc.fakedomain.com 9300
iam role for each instance
{
"Statement": [
{
"Action": [
"ec2:DescribeInstances"
],
"Effect": "Allow",
"Resource": [
"*"
]
}
],
"Version": "2012-10-17"
}
sec group rules
Inbound
Custom TCP Rule TCP 9200 - 9400 0.0.0.0/0
Outbound
All traffic allowed
elasticsearch.yml
bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-ec2-elasticsearch
discovery: ec2
discovery.ec2.groups: sec-group-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
http.port: 9200
network.host: _ec2:privateDns_
node.data: false
node.master: true
transport.tcp.port: 9300
On startup each instance logs like so:
[2016-03-02 03:13:48,128][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] version[2.1.0], pid[26976], build[72cd1f1/2015-11-18T22:40:03Z]
[2016-03-02 03:13:48,129][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initializing ...
[2016-03-02 03:13:48,592][INFO ][plugins ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] loaded [cloud-aws], sites [head]
[2016-03-02 03:13:48,620][INFO ][env ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.4gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] initialized
[2016-03-02 03:13:50,928][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] starting ...
[2016-03-02 03:13:51,065][INFO ][transport ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9300}, bound_addresses {xxx-xxx-xx-xxx:9300}
[2016-03-02 03:13:51,074][INFO ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] my-ec2-elasticsearch/xVOkfK4TT-GWaPln59wGxw
[2016-03-02 03:14:21,075][WARN ][discovery ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] waited for 30s and no initial state was set by the discovery
[2016-03-02 03:14:21,084][INFO ][http ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com/xxx-xxx-xx-xxx:9200}, bound_addresses {xxx-xxx-xx-xxx:9200}
[2016-03-02 03:14:21,085][INFO ][node ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] started
TRACE LOGGING ON FOR DISCOVERY:
2016-03-02 04:25:27,753][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at org.elasticsearch.transport.netty.NettyTransport.connectToChannelsLight(NettyTransport.java:916)
at ..............
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] sending to {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] received response from {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}: [ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[143], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[145], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[147], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[149], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[151], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[153], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}, ping_response{node [{ip-xxx-xxx-xx-xxx.vpc.fakedomain.com}{jtq31eB_Td-GpnxREFytLg}{xxx-xxx-xx-xxx}{ip-xxx-xxx-xx-xxx.vpc.team.getgoing.com/xxx-xxx-xx-xxx:9300}{data=false, master=true}], id[154], master [null], hasJoinedOnce [false], cluster_name[my-ec2-elasticsearch]}]
[2016-03-02 04:25:29,253][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] connecting (light) to {#zen_unicast_2#}{::1}{[::1]:9300}
[2016-03-02 04:25:29,254][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_1#}{127.0.0.1}{127.0.0.1:9300}
ConnectTransportException[[][127.0.0.1:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /127.0.0.1:9300];
at ...........
[2016-03-02 04:25:29,255][TRACE][discovery.zen.ping.unicast] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com] [26] failed to connect to {#zen_unicast_2#}{::1}{[::1]:9300}
ConnectTransportException[[][[::1]:9300] connect_timeout[30s]]; nested: ConnectException[Connection refused: /0:0:0:0:0:0:0:1:9300];
at
discovery.zen.ping.unicast.hosts: ["a.b.c.d"]
in your configuration. List all of your nodes in that list. – Vallogging.yml
file in order to increase the logging level for the EC2 plugin withdiscovery: TRACE
. It should print out a few more info on the discovery process. – Val