I'm trying to set up a two-node Ignite cluster, using this default-config.xml on both nodes:
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.springframework.org/schema/beans
        http://www.springframework.org/schema/beans/spring-beans.xsd">
    <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
        <property name="workDirectory" value="/mnt/e/apache-ignite"/>
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="ipFinder">
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.multicast.TcpDiscoveryMulticastIpFinder">
                        <property name="addresses">
                            <list>
                                <value>node1:47500..47509</value>
                                <value>node2:47500..47509</value>
                            </list>
                        </property>
                    </bean>
                </property>
            </bean>
        </property>
        <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="localPort" value="47100"/>
            </bean>
        </property>
        <property name="fileSystemConfiguration">
            <list>
                <bean class="org.apache.ignite.configuration.FileSystemConfiguration">
                    <property name="name" value="igfs"/>
                    <property name="ipcEndpointConfiguration">
                        <bean class="org.apache.ignite.igfs.IgfsIpcEndpointConfiguration">
                            <property name="type" value="TCP"/>
                            <property name="host" value="0.0.0.0"/>
                            <property name="port" value="10500"/>
                        </bean>
                    </property>
                    <property name="secondaryFileSystem">
                        <bean class="org.apache.ignite.hadoop.fs.IgniteHadoopIgfsSecondaryFileSystem">
                            <property name="fileSystemFactory">
                                <bean class="org.apache.ignite.hadoop.fs.CachingHadoopFileSystemFactory">
                                    <property name="uri" value="hdfs://node1:9000/"/>
                                    <property name="configPaths">
                                        <list>
                                            <value>/mnt/e/hadoop/etc/hadoop/core-site.xml</value>
                                        </list>
                                    </property>
                                </bean>
                            </property>
                        </bean>
                    </property>
                </bean>
            </list>
        </property>
    </bean>
</beans>
I can start each node on its own with ignite.sh without any problems. But when I try to join the two nodes, I keep getting the following error:
class org.apache.ignite.IgniteException: Failed to start manager: GridManagerAdapter [enabled=true, name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
at org.apache.ignite.internal.util.IgniteUtils.convertException(IgniteUtils.java:1067)
at org.apache.ignite.Ignition.start(Ignition.java:349)
at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:300)
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start manager: GridManagerAdapter [enabled=true, name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1965)
at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1276)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2045)
at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1703)
at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1117)
at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1035)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:921)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:820)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:690)
at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:659)
at org.apache.ignite.Ignition.start(Ignition.java:346)
... 1 more
Caused by: class org.apache.ignite.IgniteCheckedException: Failed to start SPI: TcpDiscoverySpi [addrRslvr=null, sockTimeout=5000, ackTimeout=5000, marsh=JdkMarshaller [clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@1b5c3e5f], reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=5, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false]
at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:302)
at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:943)
at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1960)
... 11 more
Caused by: class org.apache.ignite.spi.IgniteSpiException: Impossible to continue join, check if local discovery and communication ports are not blocked with firewall [addr=OmUsVdiDist0221/192.168.175.221:47500, req=TcpDiscoveryJoinRequestMessage [node=TcpDiscoveryNode [id=98e6971f-b477-4518-a25d-1d8ff8a33c46, consistentId=0:0:0:0:0:0:0:1%lo,127.0.0.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1%lo, 127.0.0.1], sockAddrs=HashSet [/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=0, intOrder=0, lastExchangeTime=1602083346717, loc=true, ver=2.8.1#20200521-sha1:86422096, isClient=false], dataPacket=org.apache.ignite.spi.discovery.tcp.internal.DiscoveryDataPacket@2b59501e, super=TcpDiscoveryAbstractMessage [sndNodeId=null, id=3a6cb930571-98e6971f-b477-4518-a25d-1d8ff8a33c46, verifierNodeId=null, topVer=0, pendingIdx=0, failedNodes=null, isClient=false]], discoLocalPort=47500, discoLocalPortRange=100]
at org.apache.ignite.spi.discovery.tcp.ServerImpl.sendJoinRequestMessage(ServerImpl.java:1292)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1032)
at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:427)
at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2099)
at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:299)
... 13 more
Failed to start grid: Failed to start manager: GridManagerAdapter [enabled=true, name=org.apache.ignite.internal.managers.discovery.GridDiscoveryManager]
I'm using the latest version of Ignite (2.8.1) on two Windows machines, with Ignite running under WSL. I don't think a firewall is blocking the discovery or communication ports, because telnet connects to both ports on the other machine while its node is running:
administrator@node1:~$ telnet <node2-ip> 47100
Trying <node2-ip>...
Connected to <node2-ip>.
Escape character is '^]'.
0[GOi#^CConnection closed by foreign host.
administrator@node1:~$ telnet <node2-ip> 47500
Trying <node2-ip>...
Connected to <node2-ip>.
Escape character is '^]'.
Connection closed by foreign host.
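The telnet check above can also be scripted so it is easy to repeat from both machines. This is only a minimal sketch, assuming Python 3 is available inside WSL; `can_connect` is a hypothetical helper name, and the host names and ports are the ones from the config above. A successful connect only shows that the port is reachable from the machine running the script, so it would need to be run in both directions.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and opens the socket;
        # the with-block closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    # Hosts and ports taken from the config above (assumption: node1 and
    # node2 resolve from the machine where this script runs).
    for host in ("node1", "node2"):
        for port in (47500, 47100):  # discovery port, communication port
            state = "open" if can_connect(host, port) else "unreachable"
            print(f"{host}:{port} is {state}")
```

Running it once on node1 and once on node2 gives the same information as the telnet sessions, for both ports and both directions.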
I'm a little lost here. Maybe I'm doing something wrong in my configuration?
EDIT
When I run ignite.bat from PowerShell on the second node, the node is added to the topology without any problems.