4 votes

We have a Cassandra cluster running with 3 nodes and a replication factor of 2 (maybe we should have selected 3 from the start, but that is not the case here).

Our quorum is therefore floor(RF / 2) + 1 = floor(2 / 2) + 1 = 2.
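In code form, that arithmetic is just integer division (a minimal sketch):

```python
# Cassandra's quorum formula: floor(RF / 2) + 1.
def quorum(replication_factor: int) -> int:
    return replication_factor // 2 + 1

print(quorum(2))  # 2 -> every replica must respond; no replica may be down
print(quorum(3))  # 2 -> one replica may be down and quorum still holds
```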

Let's say we lose one node, so now only two Cassandra nodes are online.

We can still read from the cluster if we set our consistency level to ONE, so this is not a problem.
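For example, a read like this still succeeds against the two surviving nodes (a minimal sketch with the DataStax Python driver; the addresses, keyspace, and table names are hypothetical):

```python
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1", "10.0.0.2"])   # the two surviving nodes
session = cluster.connect("my_keyspace")

# CL ONE: a single live replica is enough to serve the read.
stmt = SimpleStatement(
    "SELECT * FROM my_table WHERE id = %s",
    consistency_level=ConsistencyLevel.ONE,
)
rows = session.execute(stmt, (42,))
```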

The thing I do not understand is the following.

We still have two nodes running, so why is it not possible to do a serial (lightweight transaction) insert into our keyspace? With two nodes up, shouldn't it be possible to get a quorum of 2 when trying to insert?

Is it because one of the row's replicas already lives on the missing node?
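For concreteness, here is the kind of statement that fails, as a minimal sketch with the DataStax Python driver (addresses, keyspace, table, and columns are hypothetical):

```python
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel, Unavailable
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1", "10.0.0.2"])   # the two surviving nodes
session = cluster.connect("my_keyspace")

# IF NOT EXISTS makes this a lightweight transaction: the Paxos rounds
# behind it need a quorum of the replicas *for this partition* (2 of 2 here).
stmt = SimpleStatement(
    "INSERT INTO my_table (id, value) VALUES (%s, %s) IF NOT EXISTS",
    serial_consistency_level=ConsistencyLevel.SERIAL,
)
try:
    session.execute(stmt, (42, "hello"))
except Unavailable:
    # Raised when too few replicas of the target partition are alive.
    print("not enough live replicas for this partition")
```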


2 Answers

4 votes

When you insert data, the row is placed according to its token value (computed by the configured partitioner) and replicated to the following nodes around the ring.

For example, suppose you insert a row X into a keyspace with a replication factor of 2 in a 3-node cluster: Node1 (owning token A), Node2 (owning token B), and Node3 (owning token C). If X's partition key hashes to token B, Cassandra stores the row on Node2 and Node3 (walking the ring until all replicas are placed). If it hashes to token C, Cassandra stores it on Node3 and Node1.
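A toy model of that placement rule (SimpleStrategy-style: start at the token's owner and walk clockwise until RF distinct nodes are chosen; the labels mirror the example above):

```python
# Ring of (token, owner) pairs, sorted by token.
ring = [("A", "Node1"), ("B", "Node2"), ("C", "Node3")]

def replicas_for(token_index, rf=2):
    # Start at the owning node and walk clockwise until rf replicas are picked.
    return [ring[(token_index + i) % len(ring)][1] for i in range(rf)]

print(replicas_for(1))  # token B -> ['Node2', 'Node3']
print(replicas_for(2))  # token C -> ['Node3', 'Node1']
```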

Setting a consistency level of 2 (QUORUM here) means the write must be acknowledged by 2 replicas. In your case, even though two nodes are up, Node1 (token A) and Node2 (token B), with Node3 (token C) down, a row that hashes to token B must be written to Node2 and Node3, and you get a consistency (unavailable) error because Node3 cannot acknowledge the write.

So to insert, you must either increase the replication factor to 3 or decrease the consistency level to ONE.
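Both fixes in code, reusing the hypothetical names from the sketches above (after raising RF, a `nodetool repair` is needed so existing rows gain their new replica):

```python
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

cluster = Cluster(["10.0.0.1", "10.0.0.2"])
session = cluster.connect("my_keyspace")

# Option 1: raise the replication factor to 3, so every partition has a
# replica on every node (then run `nodetool repair` to stream the new copies).
session.execute(
    "ALTER KEYSPACE my_keyspace "
    "WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}"
)

# Option 2: for plain (non-LWT) writes, drop the consistency level to ONE,
# so a single live replica is enough to acknowledge the write.
stmt = SimpleStatement(
    "INSERT INTO my_table (id, value) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.ONE,
)
session.execute(stmt, (7, "world"))
```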

To learn more about consistency, see the DataStax documentation: https://docs.datastax.com/en/cassandra/2.1/cassandra/dml/dml_config_consistency_c.html

2 votes

Lightweight transactions require a QUORUM consistency level, which cannot be reached when the unavailable node is a replica of the affected key. What matters here is the number of available replicas, not the number of nodes in the cluster.
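Continuing the toy ring model from the first answer, a quick check of which partitions stay LWT-writable once Node3 is down (RF = 2, so Paxos needs quorum(2) = 2 live replicas per partition):

```python
# Ring owners of tokens A, B, C, in clockwise order; Node3 is down.
ring = ["Node1", "Node2", "Node3"]
live = {"Node1", "Node2"}

for idx, token in enumerate(["A", "B", "C"]):
    # Replicas: the owner plus the next node clockwise (RF = 2).
    replicas = [ring[(idx + i) % len(ring)] for i in range(2)]
    ok = sum(node in live for node in replicas) >= 2  # quorum of RF=2 is 2
    print(f"token {token}: replicas {replicas} -> LWT {'ok' if ok else 'unavailable'}")
```

Only token A's partitions (replicated on Node1 and Node2) can still complete a lightweight transaction; anything replicated on Node3 cannot, even though two of the three cluster nodes are up.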