1
votes

My objective is to run a 6 node cluster on three instances in EC2. I am placing one master-only and one data-only node on each instance (using the elastic ansible playbook).

The master nodes from each of the three instances all find each other without issue using EC2 discovery and form a cluster of three and elect a master. The data nodes from the same instances fail on startup with the error below.

What have I tried
- switching data nodes to explicit zen.unicast discovery via hostnames works
- I can telnet on port 9301 from instance A->B without issue

REFERENCE:
java version - OpenJDK Runtime Environment (IcedTea 2.5.6) (7u79-2.5.6-0ubuntu1.14.04.1) es version - 2.1.0

data node elasticseach.yml

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9201
network.host: _ec2:privateDns_
node.data: true
node.master: false
transport.tcp.port: 9301
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1

master node elasticsearch.yml

bootstrap.mlockall: false
cloud.aws.region: us-east
cluster.name: my-cluster
discovery.ec2.groups: stage-elasticsearch
discovery.ec2.host_type: private_dns
discovery.ec2.ping_timeout: 30s
discovery.type: ec2
discovery.zen.minimum_master_nodes: 2
discovery.zen.ping.multicast.enabled: false
gateway.expected_nodes: 4
http.port: 9200
network.host: _ec2:privateDns_
node.data: false 
node.master: true 
transport.tcp.port: 9300
node.name: ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-master

Errors from datanode startup:

[2016-03-02 15:45:06,246][INFO ][node                     ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initializing ...
[2016-03-02 15:45:06,679][INFO ][plugins                  ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] loaded [cloud-aws], sites [head]
[2016-03-02 15:45:06,710][INFO ][env                      ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] using [1] data paths, mounts [[/ (/dev/xvda1)]], net usable_space [11.5gb], net total_space [14.6gb], spins? [no], types [ext4]
[2016-03-02 15:45:09,597][INFO ][node                     ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] initialized
[2016-03-02 15:45:09,597][INFO ][node                     ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] starting ...
[2016-03-02 15:45:09,678][INFO ][transport                ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9301}, bound_addresses {xxx-xxx-xx-xxx:9301}
[2016-03-02 15:45:09,687][INFO ][discovery                ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] my-cluster/PNI6WAmzSYGgZcX2HsqenA
[2016-03-02 15:45:09,701][WARN ][com.amazonaws.jmx.SdkMBeanRegistrySupport]
java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
    at java.security.AccessController.checkPermission(AccessController.java:559)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
    at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
    at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
    at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
    at com.amazonaws.jmx.MBeans.registerMBean(MBeans.java:50)
    at com.amazonaws.jmx.SdkMBeanRegistrySupport.registerMetricAdminMBean(SdkMBeanRegistrySupport.java:27)
    at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:355)
    at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
    at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
    at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
    at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
    at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
    at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
    at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
    at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
    at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
    at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:09,703][WARN ][com.amazonaws.metrics.AwsSdkMetrics]
java.security.AccessControlException: access denied ("javax.management.MBeanServerPermission" "findMBeanServer")
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:372)
    at java.security.AccessController.checkPermission(AccessController.java:559)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
    at javax.management.MBeanServerFactory.checkPermission(MBeanServerFactory.java:413)
    at javax.management.MBeanServerFactory.findMBeanServer(MBeanServerFactory.java:361)
    at com.amazonaws.jmx.MBeans.getMBeanServer(MBeans.java:111)
    at com.amazonaws.jmx.MBeans.isRegistered(MBeans.java:98)
    at com.amazonaws.jmx.SdkMBeanRegistrySupport.isMBeanRegistered(SdkMBeanRegistrySupport.java:46)
    at com.amazonaws.metrics.AwsSdkMetrics.registerMetricAdminMBean(AwsSdkMetrics.java:361)
    at com.amazonaws.metrics.AwsSdkMetrics.<clinit>(AwsSdkMetrics.java:316)
    at com.amazonaws.AmazonWebServiceClient.requestMetricCollector(AmazonWebServiceClient.java:563)
    at com.amazonaws.AmazonWebServiceClient.isRMCEnabledAtClientOrSdkLevel(AmazonWebServiceClient.java:504)
    at com.amazonaws.AmazonWebServiceClient.isRequestMetricsEnabled(AmazonWebServiceClient.java:496)
    at com.amazonaws.AmazonWebServiceClient.createExecutionContext(AmazonWebServiceClient.java:457)
    at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5924)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:118)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:230)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:215)
    at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:55)
    at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:104)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:335)
    at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.ping(UnicastZenPing.java:240)
    at org.elasticsearch.discovery.zen.ping.ZenPingService.ping(ZenPingService.java:106)
    at org.elasticsearch.discovery.zen.ping.ZenPingService.pingAndWait(ZenPingService.java:84)
    at org.elasticsearch.discovery.zen.ZenDiscovery.findMaster(ZenDiscovery.java:879)
    at org.elasticsearch.discovery.zen.ZenDiscovery.innerJoinCluster(ZenDiscovery.java:335)
    at org.elasticsearch.discovery.zen.ZenDiscovery.access$5000(ZenDiscovery.java:75)
    at org.elasticsearch.discovery.zen.ZenDiscovery$JoinThreadControl$1.run(ZenDiscovery.java:1236)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
[2016-03-02 15:45:39,688][WARN ][discovery                ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] waited for 30s and no initial state was set by the discovery
[2016-03-02 15:45:39,698][INFO ][http                     ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] publish_address {ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1/xxx-xxx-xx-xxx:9201}, bound_addresses {xxx-xxx-xx-xxx:9201}
[2016-03-02 15:45:39,699][INFO ][node                     ] [ip-xxx-xxx-xx-xxx.vpc.fakedomain.com-data1] started
2

2 Answers

2
votes

I fixed this by removing the explicit setting of transport.tcp.port: 9300 and using the default of letting it pick any ports in the range 9300-9399.

The warnings from the AwsSdkMetrics remain but are NOT an issue as Val stated.

1
votes

This is not actually an error.

See this issue where this has been reported. It just seems the plugin is logging too much.

If you modify your logging.yml config file as suggested in that issue with this, then you'll be fine:

# aws will try to do some sketchy JMX stuff, but its not needed.
com.amazonaws.jmx.SdkMBeanRegistrySupport: ERROR
com.amazonaws.metrics.AwsSdkMetrics: ERROR