2
votes

I am trying to use Hazelcast v3.2.4 (same version on server and client). The server, as simple an implementation as I could put in place, is running on a remote machine. The client tries to connect to it; the server logs the authentication request, but the client logs the exceptions shown below. Any ideas on what I can do differently? I have copied both the log output and the config file. I am connecting via TCP/IP and checked network connectivity; I could not see anything blocking the connection.

Line of Code mentioned in stack:

final ClientConfig config= new XmlClientConfigBuilder("config/hazelcast.xml").build();
HazelcastInstance hcast = HazelcastClient.newHazelcastClient(config);   //this is mentioned in stack trace

Config

<hazelcast-client xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-client-config-3.1.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <group>
        <name>dev</name> 
        <password>dev-pass</password> 
    </group>
    <management-center enabled="false">http://localhost:8080/mancenter</management-center>
    <network>
       <cluster-members>
            <address>xxx.xxx.xxx.xxx</address>
        </cluster-members>
        <smart-routing>true</smart-routing>
        <redo-operation>true</redo-operation>
        <connection-pool-size>30</connection-pool-size>

        <port auto-increment="true" port-count="100">5701</port>
        <outbound-ports>
            <ports>0</ports>
        </outbound-ports>
        <join>
            <multicast enabled="false">
                <multicast-group>224.2.2.3</multicast-group>
                <multicast-port>54327</multicast-port>
            </multicast>
            <tcp-ip enabled="false">
                <interface>xxx.xxx.xxx.xxx</interface>
            </tcp-ip>
            <aws enabled="false">
                <access-key>my-access-key</access-key>
                <secret-key>my-secret-key</secret-key>
                <region>us-west-1</region>
                <host-header>ec2.amazonaws.com</host-header>
                <security-group-name>hazelcast-sg</security-group-name>
                <tag-key>type</tag-key>
                <tag-value>hz-nodes</tag-value>
            </aws>
        </join>
        <interfaces enabled="false">
            <interface>10.10.1.*</interface>
        </interfaces>
        <ssl enabled="false" />
        <socket-interceptor enabled="false" />
        <symmetric-encryption enabled="false">
            <algorithm>PBEWithMD5AndDES</algorithm>
            <salt>thesalt</salt>
            <password>thepass</password>
            <iteration-count>19</iteration-count>
        </symmetric-encryption>
    </network>

Log output

Sep 05, 2014 4:06:02 PM com.hazelcast.core.LifecycleService
INFO: HazelcastClient[hz.client_0_dev][3.2.4] is STARTING
Sep 05, 2014 4:06:02 PM com.hazelcast.core.LifecycleService
INFO: HazelcastClient[hz.client_0_dev][3.2.4] is STARTED
Sep 05, 2014 4:06:02 PM com.hazelcast.core.LifecycleService
INFO: HazelcastClient[hz.client_0_dev][3.2.4] is CLIENT_CONNECTED
Sep 05, 2014 4:06:02 PM com.hazelcast.client.spi.ClientClusterService
INFO: 

Members [1] {
    Member [127.0.0.1]:5701
}

Sep 05, 2014 4:06:22 PM com.hazelcast.client.spi.ClientPartitionService
SEVERE: Error while fetching cluster partition table!
com.hazelcast.spi.exception.RetryableIOException: java.util.concurrent.ExecutionException: com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl.getOrConnect(ClientConnectionManagerImpl.java:319)
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl.tryToConnect(ClientConnectionManagerImpl.java:261)
    at com.hazelcast.client.spi.impl.ClientInvocationServiceImpl.send(ClientInvocationServiceImpl.java:149)
    at com.hazelcast.client.spi.impl.ClientInvocationServiceImpl.invokeOnTarget(ClientInvocationServiceImpl.java:59)
    at com.hazelcast.client.spi.impl.ClientPartitionServiceImpl.getPartitionsFrom(ClientPartitionServiceImpl.java:105)
    at com.hazelcast.client.spi.impl.ClientPartitionServiceImpl.getInitialPartitions(ClientPartitionServiceImpl.java:94)
    at com.hazelcast.client.spi.impl.ClientPartitionServiceImpl.start(ClientPartitionServiceImpl.java:60)
    at com.hazelcast.client.HazelcastClient.start(HazelcastClient.java:223)
    at com.hazelcast.client.HazelcastClient.newHazelcastClient(HazelcastClient.java:186)
    at com.xxx.test.HCastClientAccessor.getHCastInstance(HCastClientAccessor.java:55)
    at com.xxx.test.HCastTest.<clinit>(HCastTest.java:12)
Caused by: java.util.concurrent.ExecutionException: com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
    at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:262)
    at java.util.concurrent.FutureTask.get(FutureTask.java:119)
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl.getOrConnect(ClientConnectionManagerImpl.java:316)
    ... 10 more
Caused by: com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
    at com.hazelcast.util.ExceptionUtil.rethrow(ExceptionUtil.java:45)
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl$ConnectionProcessor.call(ClientConnectionManagerImpl.java:384)
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl$ConnectionProcessor.call(ClientConnectionManagerImpl.java:332)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at com.hazelcast.util.executor.CompletableFutureTask.run(CompletableFutureTask.java:57)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
    at com.hazelcast.util.executor.PoolExecutorThreadFactory$ManagedThread.run(PoolExecutorThreadFactory.java:59)
Caused by: java.net.ConnectException: Connection refused: no further information
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:708)
    at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:115)
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl$ConnectionProcessor.call(ClientConnectionManagerImpl.java:365)
    ... 11 more

Server output

INFO: [127.0.0.1]:5701 [dev] [3.2.4] Accepting socket connection from /xxx.xxx.xxx.xxx:49705
Sep 05, 2014 4:05:57 PM com.hazelcast.nio.TcpIpConnectionManager
INFO: [127.0.0.1]:5701 [dev] [3.2.4] 5701 accepted socket connection from /xxx.xxx.xxx.xxx:49705
Sep 05, 2014 4:05:57 PM com.hazelcast.client.AuthenticationRequest
INFO: [127.0.0.1]:5701 [dev] [3.2.4] Received auth from Connection [/xxx.xxx.xxx.xxx:49705 -> null] live=true, client=true, type=JAVA_CLIENT, successfully authenticated
Sep 05, 2014 4:09:43 PM com.hazelcast.nio.TcpIpConnection
INFO: [127.0.0.1]:5701 [dev] [3.2.4] Connection [Address[xxx.xxx.xxx.xxx]:49705] lost. Reason: java.io.IOException[Connection reset by peer]
Sep 05, 2014 4:09:43 PM com.hazelcast.client.ClientEngine
INFO: [127.0.0.1]:5701 [dev] [3.2.4] Destroying ClientEndpoint{conn=Connection [/xxx.xxx.xxx.xxx:49705 -> Address[xxx.xxx.xxx.xxx]:49705] live=false, client=true, type=JAVA_CLIENT, uuid='70afcf60-96e0-444d-8981-3aa983530514', firstConnection=true, authenticated=true}
Sep 05, 2014 4:09:43 PM com.hazelcast.nio.ReadHandler
WARNING: [127.0.0.1]:5701 [dev] [3.2.4] hz._hzInstance_1_dev.IO.thread-in-0 Closing socket to endpoint Address[192.168.101.106]:49705, Cause:java.io.IOException: Connection reset by peer

Update:

I switched to this client config but I still get an exception on the client end. I copied the server and client output - the server receives the connection request but then on the client end, I see the same error as mentioned above "SEVERE: Error while fetching cluster partition table!" - same trace as above.

New client config

<hazelcast-client xsi:schemaLocation="http://www.hazelcast.com/schema/config hazelcast-client-config-3.2.4.xsd"
           xmlns="http://www.hazelcast.com/schema/config"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <group>
        <name>dev</name> 
        <password>dev-pass</password> 
    </group>
    <management-center enabled="false">http://localhost:8080/mancenter</management-center>
    <network>
       <cluster-members>
            <address>xxx.xxx.xxx.xxx</address>
        </cluster-members>
        <smart-routing>true</smart-routing>
        <redo-operation>true</redo-operation>
        <connection-timeout>60000</connection-timeout>
        <connection-attempt-limit>10</connection-attempt-limit>
        <connection-pool-size>30</connection-pool-size>
    </network>
        <executor-pool-size>40</executor-pool-size> <!-- added -->

</hazelcast-client>

Server output:

Sep 07, 2014 5:57:01 PM com.hazelcast.nio.SocketAcceptor
INFO: [127.0.0.1]:5701 [dev] [3.2.4] Accepting socket connection from /xxx.xxx.xxx.xxx:58521
Sep 07, 2014 5:57:01 PM com.hazelcast.nio.TcpIpConnectionManager
INFO: [127.0.0.1]:5701 [dev] [3.2.4] 5701 accepted socket connection from /xxx.xxx.xxx.xxx:58521
Sep 07, 2014 5:57:03 PM com.hazelcast.client.AuthenticationRequest
INFO: [127.0.0.1]:5701 [dev] [3.2.4] Received auth from Connection [/xxx.xxx.xxx.xxx:58521 -> null] live=true, client=true, type=JAVA_CLIENT, successfully authenticated

Client output

Sep 07, 2014 5:58:04 PM com.hazelcast.client.spi.ClientPartitionService
SEVERE: Error while fetching cluster partition table!
com.hazelcast.spi.exception.RetryableIOException: java.util.concurrent.ExecutionException: com.hazelcast.core.HazelcastException: java.net.ConnectException: Connection refused: no further information
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl.getOrConnect(ClientConnectionManagerImpl.java:319)
    at com.hazelcast.client.connection.nio.ClientConnectionManagerImpl.tryToConnect(ClientConnectionManagerImpl.java:261)
    at com.hazelcast.client.spi.impl.ClientInvocationServiceImpl.send(ClientInvocationServiceImpl.java:149)
    at com.hazelcast.client.spi.impl.ClientInvocationServiceImpl.invokeOnTarget(ClientInvocationServiceImpl.java:59)
    at com.hazelcast.client.spi.impl.ClientPartitionServiceImpl.getPartitionsFrom(ClientPartitionServiceImpl.java:105)
    at com.hazelcast.client.spi.impl.ClientPartitionServiceImpl.getInitialPartitions(ClientPartitionServiceImpl.java:94)
    at com.hazelcast.client.spi.impl.ClientPartitionServiceImpl.start(ClientPartitionServiceImpl.java:60)
    at com.hazelcast.client.HazelcastClient.start(HazelcastClient.java:223)
    at com.hazelcast.client.HazelcastClient.newHazelcastClient(HazelcastClient.java:186)

Update

I tried this for client and server and I got the same exception as above - what am I doing wrong:

Client

ClientConfig clientConfig = new ClientConfig().addAddress("xxx.xxx.xxx.xxx");
HazelcastInstance client = HazelcastClient.newHazelcastClient(clientConfig);

Server

HazelcastInstance hcast = Hazelcast.newHazelcastInstance();

Update

In short: make sure the configs for both server and client are correct (do not mix tags from one into the other). On the server, ensure that Hazelcast is listening on the external IP address (not the loopback address) and that no firewall settings block the connection (on the client, on the server, or in between). Thanks to Peter, I'm back to using Hazelcast and enjoying the tool. Strongly recommended!
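To find out which address the server could listen on instead of the loopback address, the machine's network interfaces can be listed with the JDK alone. The following is a minimal sketch, not Hazelcast API: the class name BindAddressCheck and its methods are made up for illustration, and which of the reported addresses belongs in the server's interface configuration depends on your network.

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper (not part of Hazelcast): inspects local addresses.
class BindAddressCheck {

    /** Returns true if the literal IP address is a loopback address such as 127.0.0.1 or ::1. */
    static boolean isLoopback(String ipLiteral) {
        try {
            return InetAddress.getByName(ipLiteral).isLoopbackAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid address: " + ipLiteral, e);
        }
    }

    /** Lists the non-loopback addresses of all interfaces that are up on this machine. */
    static List<String> externalAddresses() {
        List<String> result = new ArrayList<String>();
        try {
            Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
            if (nics == null) {
                return result; // no interfaces reported
            }
            for (NetworkInterface nic : Collections.list(nics)) {
                if (!nic.isUp() || nic.isLoopback()) {
                    continue;
                }
                for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                    if (!addr.isLoopbackAddress()) {
                        result.add(addr.getHostAddress());
                    }
                }
            }
        } catch (SocketException e) {
            // Interfaces could not be enumerated; return what was collected so far.
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("127.0.0.1 is loopback: " + isLoopback("127.0.0.1"));
        System.out.println("Candidate listen addresses: " + externalAddresses());
    }
}
```

Run on the server machine, this prints the addresses a remote client could reach in principle; a member that reports itself as 127.0.0.1, as in the log output above, is not one of them.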

3
The Hazelcast client configuration isn't valid. It is a mixture of client and server configuration; for example, the <network><join> section doesn't exist for the client. Can you enable schema validation in your IDE? - pveentjer
Thanks to pveentjer. Besides the config and firewall changes, the interface needed to be the external IP address on which the server was supposed to listen (as opposed to the local loopback address). - user3813256

3 Answers

1
votes

Apart from the XML problems in the client config, I don't see anything obviously wrong.

My usual approach is the following: try to run the server and the client in the same JVM; then at least you have made sure that the basic setup itself works. I always verify the basics before I spend a second on network problems.

Once you have verified that, check the following sections. My gut feeling is that the firewall is involved. The first section contains configuration information for iptables; the second describes a way to test the network connection.

iptables

If you are using iptables, the following rule can be added to allow outbound traffic to ports 31000-33000:

iptables -A OUTPUT -p TCP --dport 31000:33000 -m state --state NEW -j ACCEPT

To allow incoming traffic from any address to port 5701:

iptables -A INPUT -p tcp -d 0/0 -s 0/0 --dport 5701 -j ACCEPT

And to allow incoming multicast traffic:

iptables -A INPUT -m pkttype --pkt-type multicast -j ACCEPT

Connectivity test

If you are having trouble because machines won't join a cluster, check the network connectivity between the two machines. You can use a tool called iperf for that. On one machine, execute:

iperf -s -p 5701

This listens on port 5701.

On the other machine, execute the following command:

iperf -c 192.168.1.107 -d -p 5701

Replace '192.168.1.107' with the IP address of the first machine. If you run the command and get output like this:

Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)

Client connecting to 192.168.1.107, TCP port 5701
TCP window size: 59.4 KByte (default)

[  5] local 192.168.1.105 port 40524 connected with 192.168.1.107 port 5701
[  4] local 192.168.1.105 port 5701 connected with 192.168.1.107 port 33641
[ ID] Interval       Transfer     Bandwidth
[  4]  0.0-10.2 sec  55.8 MBytes  45.7 Mbits/sec
[  5]  0.0-10.3 sec  6.25 MBytes  5.07 Mbits/sec

then you know the two machines can connect to each other. However, if you see something like this:

Server listening on TCP port 5701
TCP window size: 85.3 KByte (default)

connect failed: No route to host

then you probably have a network connection problem on your hands.
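Where iperf is not installed, a similar TCP reachability check can be sketched in plain Java. This is an illustration using only the JDK, not a Hazelcast facility: the class name PortCheck and the 3000 ms timeout are arbitrary choices. Unlike the iperf test, it expects the Hazelcast member to be running and listening on the port being probed.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper (not part of Hazelcast): probes a TCP port.
class PortCheck {

    /**
     * Tries to open a TCP connection to host:port and reports whether it succeeded.
     * A refused connection, an unreachable host, or a timeout all count as failure.
     */
    static boolean canConnect(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            System.err.println("Cannot reach " + host + ":" + port + " (" + e + ")");
            return false;
        }
    }

    public static void main(String[] args) {
        // Defaults are placeholders; pass the member's host and port as arguments.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 5701;
        System.out.println(host + ":" + port + " reachable: " + canConnect(host, port, 3000));
    }
}
```

Run it from the client machine against every address the client will use: the address in the client config and each address shown in the client's member list. A "Connection refused" for one of them points at the listen address or a firewall rather than at the client code.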

0
votes

The error you mention appears to come from ClientPartitionServiceImpl (below), where getPartitionsFrom sends a GetPartitionsRequest to the address passed in. Can you check which address is being passed in (this may require building Hazelcast from source for the version you're using), or check the interfaces/config file being used in more detail?

public ClientPartitionServiceImpl(HazelcastClient client) {
    this.client = client;
}

public void start() {
    getInitialPartitions();
    client.getClientExecutionService().scheduleWithFixedDelay(new RefreshTask(), INITIAL_DELAY, PERIOD, TimeUnit.SECONDS);
}

public void refreshPartitions() {
    try {
        client.getClientExecutionService().execute(new RefreshTask());
    } catch (RejectedExecutionException ignored) {
        EmptyStatement.ignore(ignored);
    }
}

private class RefreshTask implements Runnable {
    public void run() {
        if (updating.compareAndSet(false, true)) {
            try {
                final ClientClusterService clusterService = client.getClientClusterService();
                final Address master = clusterService.getMasterAddress();
                final PartitionsResponse response = getPartitionsFrom(master);
                if (response != null) {
                    processPartitionResponse(response);
                }
            } catch (HazelcastInstanceNotActiveException ignored) {
                EmptyStatement.ignore(ignored);
            } finally {
                updating.set(false);
            }
        }
    }
}

private void getInitialPartitions() {
    final ClientClusterService clusterService = client.getClientClusterService();
    final Collection<MemberImpl> memberList = clusterService.getMemberList();
    for (MemberImpl member : memberList) {
        final Address target = member.getAddress();
        PartitionsResponse response = getPartitionsFrom(target);
        if (response != null) {
            processPartitionResponse(response);
            return;
        }
    }
    throw new IllegalStateException("Cannot get initial partitions!");
}

private PartitionsResponse getPartitionsFrom(Address address) {
    try {
        final Future<PartitionsResponse> future =
                client.getInvocationService().invokeOnTarget(new GetPartitionsRequest(), address);
        return client.getSerializationService().toObject(future.get());
    } catch (Exception e) {
        LOGGER.severe("Error while fetching cluster partition table!", e);
    }
    return null;
}

GetPartitionsRequest

public final class GetPartitionsRequest extends CallableClientRequest implements Portable, RetryableRequest {

    @Override
    public Object call() throws Exception {
        InternalPartitionService service = getService();
        service.firstArrangement();
        ClusterService clusterService = getClientEngine().getClusterService();
        Collection<MemberImpl> memberList = clusterService.getMemberList();
        Address[] addresses = new Address[memberList.size()];
        Map<Address, Integer> addressMap = new HashMap<Address, Integer>(memberList.size());
        int k = 0;
        for (MemberImpl member : memberList) {
            Address address = member.getAddress();
            addresses[k] = address;
            addressMap.put(address, k);
            k++;
        }
        InternalPartition[] partitions = service.getPartitions();
        int[] indexes = new int[partitions.length];
        for (int i = 0; i < indexes.length; i++) {
            Address owner = partitions[i].getOwnerOrNull();
            int index = -1;
            if (owner != null) {
                final Integer idx = addressMap.get(owner);
                if (idx != null) {
                    index = idx;
                }

            }
            indexes[i] = index;
        }
        return new PartitionsResponse(addresses, indexes);
    }

    @Override
    public String getServiceName() {
        return InternalPartitionService.SERVICE_NAME;
    }

    @Override
    public int getFactoryId() {
        return ClientPortableHook.ID;
    }

    @Override
    public int getClassId() {
        return ClientPortableHook.GET_PARTITIONS;
    }

    @Override
    public Permission getRequiredPermission() {
        return null;
    }
}
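
Rebuilding Hazelcast may not be necessary to see which address is used: getInitialPartitions above takes its targets from the member list, and the client log in the question prints that list. The sketch below pulls the member addresses out of such log text. It uses only the JDK; MemberListParser is a made-up name, and the "Member [host]:port" line format is assumed from the log output in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Hazelcast): reads member addresses from client log text.
class MemberListParser {

    // Matches lines such as "Member [127.0.0.1]:5701"; the format is assumed from the question's log.
    private static final Pattern MEMBER = Pattern.compile("Member \\[([^\\]]+)\\]:(\\d+)");

    /** Returns every member address found in the log text as "host:port". */
    static List<String> memberAddresses(String logText) {
        List<String> result = new ArrayList<String>();
        Matcher m = MEMBER.matcher(logText);
        while (m.find()) {
            result.add(m.group(1) + ":" + m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        String log = "Members [1] {\n    Member [127.0.0.1]:5701\n}";
        System.out.println(memberAddresses(log)); // prints [127.0.0.1:5701]
    }
}
```

For the log in the question the only target is 127.0.0.1:5701, which on the client machine points back at the client itself; that would be consistent with the "Connection refused" in the stack trace.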
0
votes

I'll just add some situations I ran into with "Error while fetching cluster partition table":

  • for ipv6: check the http://docs.hazelcast.org/docs/latest/manual/html/ipv6.html page. If you use ipv6, don't forget to set "hazelcast.prefer.ipv4.stack" to true on the server.
  • for ipv6: for some reason the client still doesn't work if you don't set "". For ipv4 it works with "enabled=false". I use a single machine and don't need a cluster, but I do need clients to connect. Magic: a client can connect to the server with "tcp-ip enabled=false" when using ipv4, but not when using ipv6.