2 votes

I have a two-datacenter Cassandra cluster with a replication factor of 1 in each datacenter; each datacenter contains a single node, and the datacenters sit on separate physical servers on the network. If one datacenter crashes, the other should continue to be available for reads and writes. I started up my Java application on a 3rd server and everything is running OK: it's reading and writing to Cassandra.

Next I disconnected the 2nd datacenter's server from the network by pulling its network cable. I expected the application to continue running against the 1st datacenter with no exceptions, but that was not the case.

The following exception started to occur in the application:

me.prettyprint.hector.api.exceptions.HUnavailableException: : May not be enough replicas present to handle consistency level.
        at me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:60)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$9.execute(KeyspaceServiceImpl.java:354)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl$9.execute(KeyspaceServiceImpl.java:343)
        at me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:101)
        at me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:232)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.operateWithFailover(KeyspaceServiceImpl.java:131)
        at me.prettyprint.cassandra.service.KeyspaceServiceImpl.getSuperColumn(KeyspaceServiceImpl.java:360)
        at me.prettyprint.cassandra.model.thrift.ThriftSuperColumnQuery$1.doInKeyspace(ThriftSuperColumnQuery.java:51)
        at me.prettyprint.cassandra.model.thrift.ThriftSuperColumnQuery$1.doInKeyspace(ThriftSuperColumnQuery.java:45)
        at me.prettyprint.cassandra.model.KeyspaceOperationCallback.doInKeyspaceAndMeasure(KeyspaceOperationCallback.java:20)
        at me.prettyprint.cassandra.model.ExecutingKeyspace.doExecute(ExecutingKeyspace.java:85)
        at me.prettyprint.cassandra.model.thrift.ThriftSuperColumnQuery.execute(ThriftSuperColumnQuery.java:44)

Once I reconnected the network cable to the 2nd server, the error stopped.

Here are more details (this is Cassandra 1.0.10):

1) Here's the keyspace description from Cassandra on both datacenters:

Keyspace: AdvancedAds:
Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
Durable Writes: true
Options: [DC2:1, DC1:1]

2) I ran nodetool ring against each instance:

./nodetool -h 111.111.111.111 -p 11000 ring
Address          DC   Rack  Status  State   Load     Owns     Token
                                                              1
111.111.111.111  DC1  RAC1  Up      Normal  1.07 GB  100.00%  0    # <-- us
111.111.111.222  DC2  RAC1  Up      Normal  1.1 GB   0.00%    1

./nodetool -h 111.111.111.222 ring -port 11000
Address          DC   Rack  Status  State   Load     Owns     Token
                                                              1
111.111.111.111  DC1  RAC1  Up      Normal  1.07 GB  100.00%  0
111.111.111.222  DC2  RAC1  Up      Normal  1.1 GB   0.00%    1    # <-- us

3) I checked the cassandra.yaml

The seeds are 111.111.111.111, 111.111.111.222.

4) I checked the cassandra-topology.properties

111.111.111.111

    # Cassandra Node IP=Data Center:Rack

    # datacenter 1
    111.111.111.111=DC1:RAC1 # <-- us

    # datacenter 2
    111.111.111.222=DC2:RAC1

    default=DC1:r1

111.111.111.222

    # Cassandra Node IP=Data Center:Rack

    # datacenter 1
    111.111.111.111=DC1:RAC1

    # datacenter 2
    111.111.111.222=DC2:RAC1 # <-- us

    default=DC1:r1

5) We set the consistency level to LOCAL_QUORUM in our Java application as follows:

public Keyspace getKeyspace(final String keyspaceName, final String serverAddresses)
{        
    Keyspace ks = null;
    Cluster c = clusterMap.get(serverAddresses);
    if (c != null)
    {            
        // consistencyLevel is a field set elsewhere; in our case it is LOCAL_QUORUM
        ConfigurableConsistencyLevel policy = new ConfigurableConsistencyLevel();
        policy.setDefaultReadConsistencyLevel(consistencyLevel);
        policy.setDefaultWriteConsistencyLevel(consistencyLevel);

        // Create Keyspace
        ks = HFactory.createKeyspace(keyspaceName, c, policy);
    }        
    return ks;
}
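
For completeness, the Cluster instances in clusterMap are created elsewhere with Hector's HFactory; roughly like this (simplified sketch, the cluster name and host string below are just placeholders for our real values):

    import me.prettyprint.cassandra.service.CassandraHostConfigurator;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.factory.HFactory;

    // Simplified sketch of how the Cluster entries in clusterMap are built.
    // "MyCluster" and the host string are placeholders.
    String serverAddresses = "111.111.111.111:11000,111.111.111.222:11000";
    Cluster cluster = HFactory.getOrCreateCluster("MyCluster",
            new CassandraHostConfigurator(serverAddresses));
    clusterMap.put(serverAddresses, cluster);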

I was told this configuration would work, but maybe I'm missing something.

Thanks for any insight

3 Answers

1 vote

Hector is known to return spurious unavailable errors. The native protocol Java driver does not have this problem: https://github.com/datastax/java-driver
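
For reference, connecting and doing a LOCAL_QUORUM read with the DataStax driver looks roughly like this (untested sketch; the native protocol requires Cassandra 1.2+, and the table in the query is made up):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;

    // Rough sketch with the DataStax Java driver (CQL over the native protocol).
    Cluster cluster = Cluster.builder()
            .addContactPoint("111.111.111.111")   // any reachable node
            .build();
    Session session = cluster.connect("AdvancedAds");

    // Consistency is set per statement (or as a cluster-wide default).
    SimpleStatement stmt = new SimpleStatement("SELECT * FROM ads WHERE id = 1");
    stmt.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
    session.execute(stmt);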

0 votes

If you have only two nodes and your data would be placed on the node that is actually down, you may not be able to achieve full write availability whenever that consistency is required. Cassandra would normally compensate with hinted handoff, but at the QUORUM consistency level the UnavailableException is thrown anyway.

The same is true when requesting data belonging to the down node.

However, it seems your cluster is not well balanced: node 111.111.111.111 owns 100% while 111.111.111.222 owns 0%, and looking at your tokens they appear to be the reason for that.

Check out how to set the initial token here: http://www.datastax.com/docs/0.8/install/cluster_init#token-gen-cassandra
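
For the record, the RandomPartitioner formula from that page is token_i = i * (2^127 / N); a quick way to compute it (throwaway sketch, class name is arbitrary):

    import java.math.BigInteger;

    // Throwaway sketch: balanced RandomPartitioner tokens, token_i = i * (2^127 / N).
    // For N = 2 this prints 0 and 85070591730234615865843651857942052864.
    public class TokenGen {
        public static void main(String[] args) {
            int nodes = 2;
            BigInteger range = BigInteger.valueOf(2).pow(127);
            for (int i = 0; i < nodes; i++) {
                System.out.println("node " + i + ": "
                        + range.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(nodes)));
            }
        }
    }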

Additionally, you may want to check Another Question, which contains an answer with more reasons why a situation like this may happen.

0 votes

LOCAL_QUORUM won't work if you configure NetworkTopologyStrategy like this:

Options: [DC2:1, DC1:1] # this will make LOCAL_QUORUM and QUORUM always fail

LOCAL_QUORUM and (in my experience) QUORUM require data centers to have at least 2 replicas up. If you want a quorum spanning your data centers, you have to set the consistency level to the data-center-agnostic TWO.

More examples:

Options: [DC2:3, DC1:1] # LOCAL_QUORUM for clients in DC2 works, QUORUM fails

Options: [DC2:2, DC1:1] # LOCAL_QUORUM in DC2 works, but down after 1 node failure
                        # QUORUM fails, TWO works.
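
If you stay on Hector, switching the default policy to TWO is the same ConfigurableConsistencyLevel setup as in the question; roughly like this (untested sketch, where cluster is the existing Hector Cluster):

    import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
    import me.prettyprint.hector.api.HConsistencyLevel;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;

    // Sketch: data-center-agnostic TWO set as the Hector default,
    // mirroring the getKeyspace() method from the question.
    ConfigurableConsistencyLevel policy = new ConfigurableConsistencyLevel();
    policy.setDefaultReadConsistencyLevel(HConsistencyLevel.TWO);
    policy.setDefaultWriteConsistencyLevel(HConsistencyLevel.TWO);
    Keyspace ks = HFactory.createKeyspace("AdvancedAds", cluster, policy);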