I'm adding a second node to a single-node Cassandra cluster and getting a stack trace on the second node:

ERROR 18:13:42,841 Exception encountered during startup
java.lang.RuntimeException: Unable to gossip with any seeds
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:446)
        at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:611)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:504)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
java.lang.RuntimeException: Unable to gossip with any seeds
        at org.apache.cassandra.gms.Gossiper.doShadowRound(Gossiper.java:1193)
        at org.apache.cassandra.service.StorageService.checkForEndpointCollision(StorageService.java:446)
        at org.apache.cassandra.service.StorageService.prepareToJoin(StorageService.java:655)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:611)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:504)
        at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:378)
        at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:496)
        at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:585)
Exception encountered during startup: Unable to gossip with any seeds
ERROR 18:13:42,885 Exception in thread Thread[StorageServiceShutdownHook,5,main]
java.lang.NullPointerException
        at org.apache.cassandra.gms.Gossiper.stop(Gossiper.java:1270)
        at org.apache.cassandra.service.StorageService$1.runMayThrow(StorageService.java:572)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.lang.Thread.run(Thread.java:744)

There are other SO questions with this same issue, but none of the answers have worked for me:

Apache Cassandra: Unable to gossip with any seeds

new cassandra node can't gossip with seed

Datastax Enterprise is crashing with Unable to gossip with any seeds error

I'm running Cassandra 2.0.8 and JDK 1.7.0_51 on both nodes. One node is hosted at DigitalOcean, the other at Linode. I've tried configuring them as the same datacenter and as different datacenters in cassandra-rackdc.properties; it makes no difference. I've tried listen_address and broadcast_address both blank and hardcoded; that makes no difference either. I did limit the list of cipher suites to stop a flood of log messages about missing cipher suites.

Each box has a firewall, but I've also tried with the firewalls disabled. I've tried internode_encryption: none as well, and the result is the same. I've used telnet and netcat to confirm that each host can connect to the other's ports 7000 and 7001.

From the stock cassandra.yaml, I've changed the following entries (excluding entries related to concurrent writes and compaction). For the sake of this question, wherever there's a hardcoded IP address in the config, I've replaced it with a placeholder (shown as <host1> below).

on the original host:

- seeds: "<host1>"
listen_address:
broadcast_address:
endpoint_snitch: GossipingPropertyFileSnitch
internode_encryption: all
cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA]

on the new host:

- seeds: "<host1>"
listen_address:
broadcast_address:
endpoint_snitch: GossipingPropertyFileSnitch
internode_encryption: all
cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA]

Edit:

Also, using netstat I can see that the new server successfully establishes a tcp connection to port 7001 of the original server.

Edit:

Okay, next day. I've upgraded to Java 1.7.0_60 on both machines. Gossip now works with internode_encryption: none. I very much doubt the new result is related to the change in JDK; it's more likely related to some carelessness in scrubbing directories or the like.

I've commented out the line in each config file that lists ciphers. Gossip still fails in the same way with internode_encryption: all. The seed node's logs are clean, but the other node repeatedly logs "Filtering out TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it isnt supported by the socket" until gossip fails. I think the log entry is related to the failure. Why one node logs this and the other doesn't, I don't know. Both are Debian machines running the same JDK version.

Edit:

Installing the JCE on both nodes made the filtering warning go away. Still no encrypted internode communication at this point.

Edit:

With debug turned on, the seed node logs:

DEBUG 22:44:57,409 Error reading the socket d862c40[SSL_NULL_WITH_NULL_NULL: Socket[addr=/10.128.139.94,port=60611,localport=7001]]
javax.net.ssl.SSLHandshakeException: no cipher suites in common

I've pretty carefully created the certs for both servers, following the instructions at http://www.datastax.com/documentation/cassandra/2.0/cassandra/security/secureSSLCertificates_t.html?scroll=task_ds_c14_xjy_2k.

1 Answer

It's now working with both unencrypted and encrypted communications. Encrypted communication started working after I installed the JCE extensions on both servers and changed how the certificates are generated. The DataStax instructions for preparing server certificates for Cassandra 2.0 drop a parameter that was present in their Cassandra 1.2 instructions, and including it seemed to make the difference. The additional parameter is -keyalg RSA:

Seed server:
keytool -genkey -alias prod01 -keystore .keystore -keyalg RSA
keytool -export -alias prod01 -file prod01.cer -keystore .keystore -keyalg RSA

Other server:
keytool -genkey -alias prod00 -keystore .keystore -keyalg RSA
keytool -export -alias prod00 -file prod00.cer -keystore .keystore -keyalg RSA
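
One likely reason -keyalg RSA matters, offered as an assumption rather than something the logs above prove: on Java 7, keytool -genkey creates a DSA key pair unless -keyalg says otherwise, while every suite in the cipher_suites list from the question authenticates the server with an RSA certificate. A DSA-only keystore would then leave the two sides with "no cipher suites in common". The sketch below uses a hypothetical helper, suite_key_type (not part of Cassandra or the JDK), that reads the required key type off a suite name:

```shell
# Hypothetical helper: print the certificate key type that a TLS cipher
# suite name implies. Covers only the common TLS_* naming patterns.
suite_key_type() {
  case "$1" in
    TLS_ECDHE_ECDSA_*|TLS_ECDH_ECDSA_*)           echo EC ;;
    TLS_DHE_DSS_*)                                echo DSA ;;
    TLS_RSA_WITH_*|TLS_DHE_RSA_*|TLS_ECDHE_RSA_*) echo RSA ;;
    *)                                            echo unknown ;;
  esac
}

# The three suites from the cassandra.yaml in the question.
for suite in TLS_RSA_WITH_AES_128_CBC_SHA \
             TLS_DHE_RSA_WITH_AES_128_CBC_SHA \
             TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA; do
  printf '%s needs a %s key\n' "$suite" "$(suite_key_type "$suite")"
done
```

All three suites map to RSA, so a keystore that holds only a DSA key cannot satisfy any of them.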

Then, make sure both servers have both certs, and use them to create a trust store using these commands on both servers:

keytool -import -v -trustcacerts -alias prod00 -file prod00.cer -keystore .truststore
keytool -import -v -trustcacerts -alias prod01 -file prod01.cer -keystore .truststore
chmod go-rwx .keystore
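
To confirm which algorithm each keystore or truststore entry actually uses, keytool -list -v prints the details of every alias. A minimal sketch follows; key_algorithms is a hypothetical helper, and it assumes keytool's English-language output, in which each entry has an "Alias name:" line followed later by a "Signature algorithm name:" line (possibly indented).

```shell
# Hypothetical helper: read `keytool -list -v` output on stdin and print
# "<alias> <signature algorithm>" for every certificate found.
key_algorithms() {
  awk -F': ' '
    /^Alias name:/                        { alias = $2 }
    /^[ \t]*Signature algorithm name:/    { print alias, $2 }
  '
}

# Usage (keytool asks for the keystore password):
#   keytool -list -v -keystore .keystore   | key_algorithms
#   keytool -list -v -keystore .truststore | key_algorithms
```

For these self-signed certificates, an algorithm name ending in withDSA would suggest the key pair was generated without -keyalg RSA; entries generated with the commands above should end in withRSA.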