3 votes

We're getting strange behaviour from a cassandra cluster (1.0.10).

We're running a 3-node cluster.

If I create a keyspace without setting the replication factor, then I get errors when trying to input data:

[default@unknown] create keyspace foo;
ae639ba0-d4b8-11e1-0000-424d3d43a8df
Waiting for schema agreement...
Warning: unreachable nodes 10.227.65.172, 10.51.62.63... schemas agree across the cluster
[default@unknown] use foo;
Authenticated to keyspace: foo
[default@foo] create column family User with comparator = UTF8Type;
b4608180-d4b8-11e1-0000-424d3d43a8df
Waiting for schema agreement...
Warning: unreachable nodes 10.227.65.172, 10.51.62.63... schemas agree across the cluster
[default@foo] update column family User with
...             column_metadata =
...             [
...             {column_name: first, validation_class: UTF8Type},
...             {column_name: last, validation_class: UTF8Type},
...             {column_name: age, validation_class: UTF8Type, index_type: KEYS}
...             ];
b70562c0-d4b8-11e1-0000-424d3d43a8df
Waiting for schema agreement...
Warning: unreachable nodes 10.227.65.172, 10.51.62.63... schemas agree across the cluster
[default@foo] set User['jsmith']['first'] = 'John';
null
UnavailableException()
        at org.apache.cassandra.thrift.Cassandra$insert_result.read(Cassandra.java:15206)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_insert(Cassandra.java:858)
        at org.apache.cassandra.thrift.Cassandra$Client.insert(Cassandra.java:830)
        at org.apache.cassandra.cli.CliClient.executeSet(CliClient.java:901)
        at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:218)
        at org.apache.cassandra.cli.CliMain.processStatementInteractive(CliMain.java:220)
        at org.apache.cassandra.cli.CliMain.main(CliMain.java:348)

(The warnings about unreachable nodes should not be an issue, as discussed here.)

However, if I create the keyspace and specify a replication factor (1, 2 or 3), then it works fine.
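For reference, specifying the replication factor explicitly in cassandra-cli looks roughly like this (1.0-era syntax; the exact strategy_options form may vary slightly by version):

[default@unknown] create keyspace foo
    with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
    and strategy_options = {replication_factor:3};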

If creating a keyspace without specifying the replication factor is a problem, shouldn't an exception be thrown at creation time instead? What is the expected behaviour if you don't specify a replication factor on a multi-node cluster?

1
It does sound like a bug. You can file a bug report at: issues.apache.org/jira/browse/CASSANDRA - psanford

1 Answer

3 votes

The default replication strategy when creating a keyspace from inside cassandra-cli is NetworkTopologyStrategy (NTS), which doesn't actually have a concept of a single replication_factor. Replicas for NTS are configured on a per-datacenter basis. The default replication options when using NTS are "{datacenter1:1}", meaning one replica should be put in the "datacenter1" replica group. If you don't have a particular snitch configured, then most likely all nodes are being assigned to "datacenter1".
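If all of your nodes really are in "datacenter1", one way to raise the effective replication factor under NTS is to set the per-datacenter replica count explicitly when creating the keyspace. In 1.0-era cassandra-cli that looks roughly like this (keyspace name and replica count are illustrative):

[default@unknown] create keyspace foo
    with placement_strategy = 'NetworkTopologyStrategy'
    and strategy_options = {datacenter1:3};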

I'm confused about how you were setting the replication factor to 1, 2, or 3, because cassandra-cli should not let you specify replication_factor without also specifying a placement_strategy of SimpleStrategy, and if you were doing that, I'd expect you to be aware of the difference.

Anyway, since your effective replication factor in the default case is 1, I expect that your problem really is the down nodes from the warning messages. Are they really zombie nodes, as are discussed in the mail you cited, or are they real nodes which are still in the ring and which are unreachable? The output of nodetool ring should help diagnose why Cassandra doesn't think it can store your records successfully.
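For example, run it against any node you can reach (the host here is a placeholder) and check whether the two "unreachable" addresses show up as Down but still owning token ranges:

nodetool -h localhost ring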

Finally, I should point out that you'll find this sort of work a lot easier with the cqlsh tool than with cassandra-cli. In this case, it would at least have forced you to give an explicit replication strategy and strategy options.
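In the cqlsh shipped with 1.0 (CQL 2), keyspace creation makes both the strategy and its options explicit, roughly:

cqlsh> CREATE KEYSPACE foo
   ... WITH strategy_class = 'SimpleStrategy'
   ... AND strategy_options:replication_factor = 1;

With that form there is no ambiguity about which strategy you are getting or how many replicas each write needs.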