Say I have 2 datacenters - DC1 and DC2. DC1 has 3 nodes with replication 3 (fully replicated) and DC2 has 1 node with replication 1 (fully replicated).

Say the lone node in DC2 is up, all nodes in DC1 are down, and my read/write consistency is at LOCAL_QUORUM everywhere.

I try to do a transaction on DC2, but it fails with an UnavailableException, which of course means not enough nodes are online. But why? Does the LOCAL part of LOCAL_QUORUM get ignored because I only have one node in that datacenter?

The lone node in DC2 has 100% of the data, so why can't I do anything unless 2 nodes are also up in DC1, regardless of the read/write consistency settings?
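
For reference, here is the per-DC quorum math as I understand it (just a sketch; LOCAL_QUORUM counts only replicas in the coordinator's local DC, using floor(RF/2) + 1, and the DC names and RFs are the ones from my setup):

    # Sketch of how LOCAL_QUORUM counts replicas, using the topology above.
    # Only replicas in the coordinator's local datacenter are considered.

    def local_quorum(rf_local):
        """Replicas that must respond for LOCAL_QUORUM: floor(RF/2) + 1."""
        return rf_local // 2 + 1

    replication = {"DC1": 3, "DC2": 1}

    for dc, rf in replication.items():
        print(f"{dc}: RF={rf}, LOCAL_QUORUM needs {local_quorum(rf)} replica(s)")

    # DC1: RF=3, LOCAL_QUORUM needs 2 replica(s)
    # DC2: RF=1, LOCAL_QUORUM needs 1 replica(s)
    # So a LOCAL_QUORUM request coordinated in DC2 should only need the single
    # DC2 node, which is why the UnavailableException surprises me.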

2 Answers

With your settings, 2 replicas need to acknowledge a write for it to succeed. The partition of the failed write may belong to the nodes that are down, because the hash of the partition key decides which nodes it goes to. Once you decommission those nodes, the ring gets re-adjusted and writes work fine again.

But as long as they are simply down, some writes will succeed and some will fail. You can tell which writes succeed and which fail by checking the partition key hashes against the ring tokens.

e.g.: Imagine a request arrives for the node owning token range 41-50, and according to the replication strategy the next replicas should go to the nodes owning ranges 1-10 and 11-20. If those nodes are down, LOCAL_QUORUM is not satisfied, so your write fails.

Cassandra 2 DC 6 Node example
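
To make that concrete, here is a toy model of the token-range reasoning above. The ring layout, the set of down nodes, and the simple clockwise replica placement (instead of a real NetworkTopologyStrategy) are all made-up illustration values, not taken from an actual cluster:

    # Toy ring: (end_of_token_range, node). A partition's token falls into
    # exactly one range; replicas are the owner plus the next RF-1 nodes
    # clockwise (a simplification of real replica placement).
    RING = [(10, "n1"), (20, "n2"), (30, "n3"), (40, "n4"), (50, "n5"), (60, "n6")]
    DOWN = {"n1", "n2", "n3"}   # hypothetical: these nodes are down
    RF = 3
    QUORUM = RF // 2 + 1        # 2

    def replica_nodes(token):
        start = next(i for i, (end, _) in enumerate(RING) if token <= end)
        return [RING[(start + i) % len(RING)][1] for i in range(RF)]

    def write_succeeds(token):
        alive = [n for n in replica_nodes(token) if n not in DOWN]
        return len(alive) >= QUORUM

    print(write_succeeds(45))  # True:  replicas n5, n6, n1 -> 2 alive >= quorum
    print(write_succeeds(5))   # False: replicas n1, n2, n3 -> 0 alive, write fails

Depending on where the partition key hashes, the same outage can therefore fail some writes and spare others.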

From https://groups.google.com/forum/#!topic/aureliusgraphs/fJYH1de5wBw:

"Titan uses an internal consistency level for locking and ID allocation, and the level it uses is QUORUM. As a result, no matter what I do, Titan will always access both DCs."