I have a four-node, two-data-center Cassandra 1.1.1 cluster. My keyspace uses RF 2 per data center, giving me a complete copy of the data on each node. The cluster is for a vendor product, which uses a read/write consistency level of QUORUM. With this config I can only handle the loss of one node. How can I tweak it to handle the loss of a data center?
1 Answer
Unless your data centers are in the same physical location, your network overhead is going to be terrible with this configuration. The reason is that QUORUM consistency does not pay attention to data center boundaries when it counts replicas. With four replicas in total, a quorum is three, so a read or write has to wait on at least one node in the remote data center before it is acknowledged. Switching to LOCAL_QUORUM would solve the latency issue, but with RF 2 a local quorum is both replicas, so losing a single node would effectively take that whole data center down. However, as long as both nodes in the other DC are up (and your app can handle failing over to it properly), you will still be up and running.
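To make the replica counting concrete, here is a minimal sketch of the quorum arithmetic for the setup in the question. It is plain Python, not Cassandra code, and the data center names `DC1` and `DC2` are placeholders I made up. It only assumes the usual rule that a quorum is a majority of the replicas being counted.

```python
def quorum(replicas: int) -> int:
    """Majority of the replicas being counted: floor(n / 2) + 1."""
    return replicas // 2 + 1


# Replication factor per data center, as described in the question.
# The names DC1/DC2 are hypothetical.
rf_per_dc = {"DC1": 2, "DC2": 2}

# QUORUM counts replicas across all data centers.
total_rf = sum(rf_per_dc.values())
global_quorum = quorum(total_rf)
tolerated_with_quorum = total_rf - global_quorum

# LOCAL_QUORUM counts only the replicas in the coordinator's data center.
local_quorum = {dc: quorum(rf) for dc, rf in rf_per_dc.items()}
tolerated_with_local = {dc: rf - local_quorum[dc] for dc, rf in rf_per_dc.items()}

print("QUORUM needs", global_quorum, "of", total_rf, "replicas")
print("Replicas that may be down under QUORUM:", tolerated_with_quorum)
print("LOCAL_QUORUM needs per DC:", local_quorum)
print("Replicas that may be down per DC under LOCAL_QUORUM:", tolerated_with_local)
```

With RF 2 per DC, QUORUM needs more replicas than a single data center holds, which is why it has to cross data centers, and LOCAL_QUORUM leaves no room for a node failure inside a DC.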
Having said that, the general rule of thumb is that three nodes is the bare minimum per data center. If you add a node to each data center, raise the replication factor to 3 per DC to match (with RF 2, a local quorum would still need both replicas), and switch to LOCAL_QUORUM for reads and writes, you can lose one node in each DC and still have that DC operational, or you can lose an entire DC with the other remaining operational.
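As a rough check of that claim, the sketch below tests a few failure scenarios for a three-node-per-DC layout. It assumes RF 3 in each data center, so every node holds every row and "replicas up" equals "nodes up" in that DC. The function names and the `DC1`/`DC2` labels are hypothetical, and this is only arithmetic, not a call into Cassandra.

```python
def quorum(replicas: int) -> int:
    """Majority of the replicas being counted: floor(n / 2) + 1."""
    return replicas // 2 + 1


def local_quorum_ok(rf: int, replicas_up: int) -> bool:
    """LOCAL_QUORUM succeeds if a majority of the local replicas are up."""
    return replicas_up >= quorum(rf)


def global_quorum_ok(rf_per_dc: dict, up_per_dc: dict) -> bool:
    """QUORUM succeeds if a majority of all replicas, in any DC, are up."""
    return sum(up_per_dc.values()) >= quorum(sum(rf_per_dc.values()))


# Assumed layout: three nodes and RF 3 in each data center.
rf_per_dc = {"DC1": 3, "DC2": 3}

scenarios = {
    "all nodes up": {"DC1": 3, "DC2": 3},
    "one node down in each DC": {"DC1": 2, "DC2": 2},
    "DC2 lost entirely": {"DC1": 3, "DC2": 0},
}

for name, up in scenarios.items():
    local = {dc: local_quorum_ok(rf_per_dc[dc], up[dc]) for dc in rf_per_dc}
    print(name, "| LOCAL_QUORUM per DC:", local,
          "| QUORUM:", global_quorum_ok(rf_per_dc, up))
```

Note the last scenario: the surviving DC can still serve LOCAL_QUORUM, but plain QUORUM would fail because only three of six replicas are reachable. That is why the consistency level has to change along with the node count.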