I'm trying to set up an Ignite cluster with SSL encryption in my Spring application. My goal is a replicated cache across several nodes.
We deploy our application to Tomcat 8 and set environment variables for our keystore and truststore when Tomcat starts.
I want to start Ignite embedded in my Spring application, so I create a bean that returns a CacheManager.
@Bean
public SpringCacheManager replicatedCache() {
    // Cache events that should be recorded and delivered to the local listener.
    int[] eventTypes = new int[] {
        EventType.EVT_CACHE_ENTRY_EVICTED,
        EventType.EVT_CACHE_OBJECT_REMOVED,
        EventType.EVT_CACHE_ENTRY_DESTROYED,
        EventType.EVT_CACHE_OBJECT_EXPIRED
    };

    SpringCacheManager cacheManager = new SpringCacheManager();
    IgniteConfiguration configuration = new IgniteConfiguration();
    configuration.setIncludeEventTypes(eventTypes);
    configuration.setGridName("igniteCluster");

    Slf4jLogger logger = new Slf4jLogger(LoggerFactory.getLogger(IGNITE_CACHE_LOGGER_NAME));
    configuration.setGridLogger(logger);

    // Replicated, transactional cache.
    CacheConfiguration cacheConfiguration1 = new CacheConfiguration();
    cacheConfiguration1.setName("replicatedCache");
    cacheConfiguration1.setCacheMode(CacheMode.REPLICATED);
    cacheConfiguration1.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);
    configuration.setCacheConfiguration(cacheConfiguration1);

    // SSL: use the JVM's default SSLContext.
    configuration.setSslContextFactory(() -> {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new WA3InternalErrorException("Could not create SSLContext", e);
        }
    });

    configuration.setLocalHost(env.getProperty("caching.localBind", "0.0.0.0"));

    // Discovery: static list of node addresses from the "caching.nodes" property.
    TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
    List<String> nodes = Arrays.stream(env.getRequiredProperty("caching.nodes").split(",")).collect(Collectors.toList());
    ipFinder.setAddresses(nodes);
    TcpDiscoverySpi spi = new TcpDiscoverySpi();
    spi.setIpFinder(ipFinder);
    configuration.setDiscoverySpi(spi);

    // Communication: local port taken from the "caching.localPort" property.
    TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
    communicationSpi.setLocalPort(env.getRequiredProperty("caching.localPort", Integer.class));
    communicationSpi.setConnectTimeout(100000); // Line added in first edit
    configuration.setCommunicationSpi(communicationSpi);

    // Local listener that prints every recorded cache event.
    IgnitePredicate<? extends CacheEvent> localEvent = event -> {
        System.out.println(event);
        return true;
    };
    Map<IgnitePredicate<? extends Event>, int[]> ignitePredicateIntegerMap = Collections.singletonMap(localEvent, eventTypes);
    configuration.setLocalEventListeners(ignitePredicateIntegerMap);

    cacheManager.setConfiguration(configuration);
    return cacheManager;
}
As you can see, I also configure Ignite here.
I bind to the server's IP address and set a port (47100, the same as the default) on the CommunicationSpi.
I am using SSLContext.getDefault() here, so it uses the default keystore and truststore.
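Since the default SSLContext is normally driven by the JVM's `javax.net.ssl.*` system properties rather than by anything in the Ignite configuration, it may help to log what the JVM actually sees before Ignite starts. The following is only a minimal sketch using the JDK alone. The class name `SslDefaultsCheck` and its `describe` helper are hypothetical, and it assumes the stores are meant to reach the JVM as system properties (for example `-Djavax.net.ssl.keyStore=...`), which may not match how this Tomcat is actually started.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: reports which standard JSSE store properties are set.
class SslDefaultsCheck {

    // Standard JSSE system property names read when the default SSLContext is built.
    static final String[] KEYS = {
        "javax.net.ssl.keyStore",
        "javax.net.ssl.keyStorePassword",
        "javax.net.ssl.trustStore",
        "javax.net.ssl.trustStorePassword"
    };

    // Returns one entry per property; password values are masked.
    static Map<String, String> describe(Properties props) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String key : KEYS) {
            String value = props.getProperty(key);
            if (value == null) {
                result.put(key, "<not set>");
            } else if (key.endsWith("Password")) {
                result.put(key, "<set>");
            } else {
                result.put(key, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        describe(System.getProperties())
            .forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

If `javax.net.ssl.keyStore` is reported as not set, the default context would typically have no key material to present to the other node, which would be worth ruling out first.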
Everything works when SSL is disabled (that is, when I don't set the SslContextFactory). But as soon as I set the factory, the nodes can still discover each other but can't communicate with each other.
The metrics log looks fine, 2 nodes as expected:
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
^-- Node [id=41687971, name=igniteCluster, uptime=00:54:00:302]
^-- H/N/C [hosts=2, nodes=2, CPUs=4]
^-- CPU [cur=33.5%, avg=36.96%, GC=0%]
^-- Heap [used=193MB, free=85.51%, comm=627MB]
^-- Non heap [used=125MB, free=-1%, comm=127MB]
^-- Public thread pool [active=0, idle=2, qSize=0]
^-- System thread pool [active=0, idle=7, qSize=0]
^-- Outbound messages queue [size=0]
What I can see so far is that Ignite tries to connect on a port, fails, increments the port, and tries again.
2017-05-02T08:15:35,154 [] [] [grid-nio-worker-tcp-comm-1-#18%igniteCluster%] WARN org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi [warning():104] [] - Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.30.0.106:53603, writeTimeout=2000]
2017-05-02T08:15:39,192 [] [] [grid-nio-worker-tcp-comm-2-#19%igniteCluster%] WARN org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi [warning():104] [] - Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.30.0.106:53604, writeTimeout=2000]
I don't know what port that is. I have restarted all nodes several times, and each time it seems to start at a random port between 30000 and 50000.
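To separate a plain network problem from an SSL problem, one option is to check from each node whether the other node's configured communication port accepts TCP connections at all. This is a minimal sketch under stated assumptions: the class `PortProbe` and its method are hypothetical, the default port 47100 is taken from the configuration described above, and the probe only tests TCP reachability. It does not attempt an SSL handshake and does not involve Ignite.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: checks whether a TCP connection to host:port can be opened.
class PortProbe {

    // Returns true if the connection is established within timeoutMillis.
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumed defaults for illustration; pass the remote node's host and port as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 47100;
        System.out.println(host + ":" + port + " reachable=" + canConnect(host, port, 2000));
    }
}
```

If the port is reachable with this probe but the nodes still cannot exchange messages once the SslContextFactory is set, that would point toward the SSL setup rather than the network.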
My final questions are: What am I missing here? Why does my SSL connection not work?
Regards
Edit: I have increased the connect timeout, as Valentin suggested, but I still have problems with my cluster.
2017-05-03T12:19:29,429 [] [] [localhost-startStop-1] WARN org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager [warning():104] [] - Failed to wait for initial partition map exchange. Possible reasons are:
^-- Transactions in deadlock.
^-- Long running transactions (ignore if this is the case).
^-- Unreleased explicit locks.
I get these log messages on the node that tries to join the cluster.