0
votes

Setup: I have a 4-node Cassandra cluster (single datacenter). The replication factor is 3 and the write consistency level is set to ALL.

As I understand it, Cassandra doesn't have a master node, so I can write data to any node I want. Let's say I have three nodes A, B and C. I write record 123 with value 4 to node A.

Question 1: Will the execute() method of the Session object block until the data has been replicated to all replicas?

Another situation: let's say record 123 with value 5 is also written to node B, 100 milliseconds after the request to insert record 123 with value 4 arrived at node A.

Question 2: When B is a replica of A, how does Cassandra handle this situation in its architecture? Do Cassandra nodes use their internal clocks to decide which node received the record first? Or do all replicas share the same lock for writing data?

Question 3: When B is not a replica of A and read consistency is set to ALL, if I query the value of record 123 on node A or B at random, how does Cassandra handle this situation?

I'm new to Cassandra, so any answer or help is highly appreciated.

Thank you very much.

1 Answer

3
votes

Will the execute() method of the Session object block until the data has been replicated to all replicas?

The execute() call blocks until N acknowledgements of your mutation(s) are received, where N depends on the chosen consistency level. In your case, since you're using ALL, the client blocks until acknowledgements are received from all replicas of that partition (three nodes with a replication factor of 3, not all four nodes in the cluster).
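As a rough illustration of how N follows from the consistency level, here is a minimal sketch. The `RequiredAcks` class and its `Level` enum are hypothetical helpers written for this answer, not part of the Cassandra driver API, and only three levels are covered.

```java
// Hypothetical helper for illustration only; not part of the Cassandra driver.
final class RequiredAcks {
    // A small subset of consistency levels, named after the real ones.
    enum Level { ONE, QUORUM, ALL }

    // Number of replica acknowledgements the coordinator waits for
    // before it answers the client.
    static int requiredAcks(Level level, int replicationFactor) {
        switch (level) {
            case ONE:
                return 1;
            case QUORUM:
                return replicationFactor / 2 + 1; // majority of the replicas
            case ALL:
                return replicationFactor;         // every replica must answer
            default:
                throw new IllegalArgumentException("unknown level: " + level);
        }
    }

    public static void main(String[] args) {
        int replicationFactor = 3; // value taken from the question
        for (Level level : Level.values()) {
            System.out.println(level + " -> " + requiredAcks(level, replicationFactor));
        }
    }
}
```

With a replication factor of 3, ALL therefore means three acknowledgements, regardless of how many nodes the cluster has.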

When B is a replica of A, how does Cassandra handle this situation in its architecture? Do Cassandra nodes use their internal clocks to decide which node received the record first? Or do all replicas share the same lock for writing data?

The coordinator node (the one that receives the request) dispatches the write, in parallel, to all replicas. With modern drivers such as the Java driver, the coordinator is usually chosen to be a replica for the partition being written (token-aware routing), which avoids one extra network hop.

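The coordinator also sets a timestamp on each column of your write, unless the client supplied one. The same timestamp is sent to all replicas. Replicas do not share a lock: when two writes target the same column, the one with the higher timestamp wins. In your example, value 5 replaces value 4 as long as its timestamp is the later one, which assumes the clocks of the nodes are reasonably synchronized.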
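To show how a per-write timestamp lets replicas agree without any locking, here is a minimal sketch of timestamp-based reconciliation (last write wins). The `LastWriteWins` and `Cell` names are hypothetical, the timestamps are made-up microsecond values, and the tie-break on the larger value is a simplification chosen only to keep the result deterministic.

```java
// Simplified model for illustration; not Cassandra's actual storage classes.
final class LastWriteWins {
    // Stand-in for a stored column: a value plus its write timestamp.
    static final class Cell {
        final int value;
        final long timestampMicros;

        Cell(int value, long timestampMicros) {
            this.value = value;
            this.timestampMicros = timestampMicros;
        }
    }

    // Returns the cell that survives: the one with the higher timestamp.
    // The result does not depend on the order in which the writes arrive.
    static Cell reconcile(Cell a, Cell b) {
        if (a.timestampMicros != b.timestampMicros) {
            return a.timestampMicros > b.timestampMicros ? a : b;
        }
        // Equal timestamps: assumed tie-break on the larger value.
        return a.value >= b.value ? a : b;
    }

    public static void main(String[] args) {
        Cell first = new Cell(4, 1_000_000L);  // write of value 4
        Cell second = new Cell(5, 1_100_000L); // write of value 5, 100 ms later
        System.out.println(reconcile(first, second).value);
        System.out.println(reconcile(second, first).value);
    }
}
```

Because every replica applies the same rule to the same timestamps, they all converge on value 5 even if the two writes reach them in different orders.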

When B is not a replica of A and read consistency is set to ALL, if I query the value of record 123 on node A or B at random, how does Cassandra handle this situation?

In this case, the node that receives the request, called the coordinator, acts as a proxy: it forwards the request to the appropriate replica(s) and returns their response(s) to the client. With read consistency ALL, it waits for every replica to answer and returns the most recent value.

Each node knows the topology of the whole cluster (token ranges, IP addresses), so any node can play the role of coordinator at any time.
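A minimal sketch of how a coordinator could look up the replicas for a partition from that topology is shown below. It assumes one token per node and a simple clockwise placement with no rack or datacenter awareness; the `TokenRing` class, the node names and the token values are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model of a token ring; real clusters typically use many tokens per node.
final class TokenRing {
    // Maps each node's token to the node name, sorted by token.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    void addNode(long token, String node) {
        ring.put(token, node);
    }

    // Starts at the first node whose token is >= the partition token,
    // then walks clockwise until enough replicas are collected.
    List<String> replicasFor(long partitionToken, int replicationFactor) {
        List<String> replicas = new ArrayList<>();
        if (ring.isEmpty()) {
            return replicas;
        }
        int wanted = Math.min(replicationFactor, ring.size());
        Long token = ring.ceilingKey(partitionToken);
        if (token == null) {
            token = ring.firstKey(); // wrap around the end of the ring
        }
        while (replicas.size() < wanted) {
            replicas.add(ring.get(token));
            token = ring.higherKey(token);
            if (token == null) {
                token = ring.firstKey();
            }
        }
        return replicas;
    }

    public static void main(String[] args) {
        TokenRing ring = new TokenRing();
        ring.addNode(0L, "A");
        ring.addNode(100L, "B");
        ring.addNode(200L, "C");
        ring.addNode(300L, "D");
        // Partition token 150 falls in C's range; with RF=3 the replicas are C, D, A.
        System.out.println(ring.replicasFor(150L, 3));
    }
}
```

In this toy ring, node B is not a replica for token 150, so a request for that partition arriving at B would be forwarded to C, D and A.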

More details about how data distribution is handled in Cassandra are available here: http://www.slideshare.net/doanduyhai/cassandra-introduction-apache-con-2014-budapest/18