I'm trying to set up an Ignite cluster with SSL encryption in my Spring application. My goal is a replicated cache across several nodes.

We deploy our application to Tomcat 8 and set environment variables for our key store and trust store when Tomcat starts.

I want to start Ignite embedded in my Spring application, so I create a bean that returns a CacheManager.

@Bean
public SpringCacheManager replicatedCache() {

    int[] eventTypes = new int[] {EventType.EVT_CACHE_ENTRY_EVICTED, EventType.EVT_CACHE_OBJECT_REMOVED, EventType.EVT_CACHE_ENTRY_DESTROYED, EventType.EVT_CACHE_OBJECT_EXPIRED};

    SpringCacheManager cacheManager = new SpringCacheManager();

    IgniteConfiguration configuration = new IgniteConfiguration();
    configuration.setIncludeEventTypes(eventTypes);
    configuration.setGridName("igniteCluster");

    Slf4jLogger logger = new Slf4jLogger(LoggerFactory.getLogger(IGNITE_CACHE_LOGGER_NAME));
    configuration.setGridLogger(logger);

    CacheConfiguration cacheConfiguration1 = new CacheConfiguration();
    cacheConfiguration1.setName("replicatedCache");
    cacheConfiguration1.setCacheMode(CacheMode.REPLICATED);
    cacheConfiguration1.setAtomicityMode(CacheAtomicityMode.TRANSACTIONAL);

    configuration.setCacheConfiguration(cacheConfiguration1);

    configuration.setSslContextFactory(() -> {
        try {
            return SSLContext.getDefault();
        } catch (NoSuchAlgorithmException e) {
            throw new WA3InternalErrorException("Could not create SSLContext", e);
        }
    });
    configuration.setLocalHost(env.getProperty("caching.localBind", "0.0.0.0"));

    TcpDiscoveryVmIpFinder ipFinder = new TcpDiscoveryVmIpFinder();
    List<String> nodes = Arrays.stream(env.getRequiredProperty("caching.nodes").split(",")).collect(Collectors.toList());
    ipFinder.setAddresses(nodes);
    TcpDiscoverySpi spi = new TcpDiscoverySpi();
    spi.setIpFinder(ipFinder);
    configuration.setDiscoverySpi(spi);

    TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
    communicationSpi.setLocalPort(env.getRequiredProperty("caching.localPort", Integer.class));
    communicationSpi.setConnectTimeout(100000); // Line added in first edit
    configuration.setCommunicationSpi(communicationSpi);

    IgnitePredicate<? extends CacheEvent> localEvent = event -> {
        System.out.println(event);
        return true;
    };

    Map<IgnitePredicate<? extends Event>, int[]> ignitePredicateIntegerMap = Collections.singletonMap(localEvent, eventTypes);
    configuration.setLocalEventListeners(ignitePredicateIntegerMap);

    cacheManager.setConfiguration(configuration);

    return cacheManager;
}

As you can see, I also configure Ignite here: I bind it to the IP address of the server and set a port on the CommunicationSpi (47100, the same as the default port). I am using SSLContext.getDefault() here, so it uses the default key store and trust store.
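For reference, a minimal sketch of how to check what SSLContext.getDefault() will actually see. It assumes the "environment variables" mentioned above are the standard JSSE system properties (javax.net.ssl.keyStore and friends, typically passed to Tomcat as -D options); the JVM's default context is built from those system properties, not from OS environment variables. The class name DefaultSslStores and the describe() helper are made up for this illustration.

```java
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.net.ssl.SSLContext;

// Hypothetical helper: reports which JSSE store properties are visible to the JVM.
final class DefaultSslStores {

    // Standard JSSE system properties consulted when the default SSLContext is built.
    static final String[] STORE_PROPERTIES = {
        "javax.net.ssl.keyStore",
        "javax.net.ssl.keyStorePassword",
        "javax.net.ssl.trustStore",
        "javax.net.ssl.trustStorePassword"
    };

    // Returns each property with its value, masking passwords.
    static Map<String, String> describe() {
        Map<String, String> result = new LinkedHashMap<>();
        for (String name : STORE_PROPERTIES) {
            String value = System.getProperty(name);
            if (value == null) {
                result.put(name, "<not set>");
            } else if (name.endsWith("Password")) {
                result.put(name, "<set>");
            } else {
                result.put(name, value);
            }
        }
        return result;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        describe().forEach((name, value) -> System.out.println(name + " = " + value));

        // Same call as in the bean above; it only succeeds if the stores can be loaded.
        SSLContext context = SSLContext.getDefault();
        System.out.println("Default SSLContext protocol: " + context.getProtocol());
    }
}
```

If javax.net.ssl.keyStore shows up as not set inside the Tomcat JVM, the default context has no key material to present to the other node, which would be worth ruling out before tuning timeouts.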

Everything works when SSL is disabled (that is, when I don't set the SSLContextFactory). But as soon as I set the factory, the nodes can still find each other but can't communicate.

The metrics log looks fine, 2 nodes as expected:

    Metrics for local node (to disable set 'metricsLogFrequency' to 0)
        ^-- Node [id=41687971, name=igniteCluster, uptime=00:54:00:302]
        ^-- H/N/C [hosts=2, nodes=2, CPUs=4]
        ^-- CPU [cur=33.5%, avg=36.96%, GC=0%]
        ^-- Heap [used=193MB, free=85.51%, comm=627MB]
        ^-- Non heap [used=125MB, free=-1%, comm=127MB]
        ^-- Public thread pool [active=0, idle=2, qSize=0]
        ^-- System thread pool [active=0, idle=7, qSize=0]
        ^-- Outbound messages queue [size=0]

What I can see so far is that Ignite tries to connect on a port, fails, increments the port and tries again.

2017-05-02T08:15:35,154 []  [] [grid-nio-worker-tcp-comm-1-#18%igniteCluster%] WARN  org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi [warning():104] [] - Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.30.0.106:53603, writeTimeout=2000]
2017-05-02T08:15:39,192 []  [] [grid-nio-worker-tcp-comm-2-#19%igniteCluster%] WARN  org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi [warning():104] [] - Communication SPI session write timed out (consider increasing 'socketWriteTimeout' configuration property) [remoteAddr=/10.30.0.106:53604, writeTimeout=2000]

I don't know what port that is. I have restarted all nodes several times and it looks like it is starting at a random port between 30000 and 50000.

My final questions are: What am I missing here? Why does my SSL connection not work?

Regards


I have increased the timeout, as Valentin suggested, but I still have problems with my cluster.

    2017-05-03T12:19:29,429 []  [] [localhost-startStop-1] WARN  org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager [warning():104] [] - Failed to wait for initial partition map exchange. Possible reasons are:
      ^-- Transactions in deadlock.
      ^-- Long running transactions (ignore if this is the case).
      ^-- Unreleased explicit locks.

I get these log messages on the node which tries to connect to the cluster.

1 Answer

Try increasing socketWriteTimeout, as the warning message suggests. An SSL connection is slower, and the default value (writeTimeout=2000 in your log) may not be enough for it on your network.
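A minimal sketch of that change, assuming the same TcpCommunicationSpi setup as in the question and the Ignite libraries on the classpath. The 10000 ms value is only an illustrative guess, not a recommended setting, and the class and method names (CommunicationTimeouts, withLongerWriteTimeout) are made up for this example.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

// Hypothetical helper showing where the write timeout is configured.
final class CommunicationTimeouts {

    static IgniteConfiguration withLongerWriteTimeout(IgniteConfiguration configuration, int localPort) {
        TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
        communicationSpi.setLocalPort(localPort);

        // Connect timeout as already set in the question.
        communicationSpi.setConnectTimeout(100000);

        // The warning reports writeTimeout=2000; raise it (illustrative value, in milliseconds).
        communicationSpi.setSocketWriteTimeout(10000L);

        configuration.setCommunicationSpi(communicationSpi);
        return configuration;
    }
}
```

Note that connectTimeout and socketWriteTimeout are separate settings, so raising only the connect timeout leaves the 2000 ms write timeout from the warning unchanged.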