We are facing an unexpected problem in Cassandra. We have a cluster of 6 nodes split across two datacenters (see image 1 below): http://s9.postimg.org/vyiykbosf/Cassandra_normal.png Unfortunately, we recently lost 3 nodes (see image 2 below) and were not able to keep the cluster fully available: http://postimg.org/image/yy3o6w10r/
In each datacenter we use a read consistency of ONE and a write consistency of LOCAL_QUORUM. The problem is that we lost two nodes in the same datacenter, and when the coordinator was the only remaining node in that datacenter, LOCAL_QUORUM could not be satisfied for writes.
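To make the arithmetic behind the failure concrete, here is a minimal sketch. It assumes a replication factor of 3 in each datacenter, which is an illustrative value and not something stated above. The class and method names are hypothetical and are not part of any driver API. It only applies the quorum formula floor(RF / 2) + 1 to the number of local replicas still alive.

```java
public class LocalQuorumCheck {

    // Quorum size for a given replication factor: floor(RF / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // A LOCAL_QUORUM write can only succeed if at least a quorum of the
    // replicas in the coordinator's own datacenter are alive.
    static boolean localQuorumReachable(int localReplicationFactor, int aliveLocalReplicas) {
        return aliveLocalReplicas >= quorum(localReplicationFactor);
    }

    public static void main(String[] args) {
        int localRf = 3;        // ASSUMPTION: replication factor of 3 per datacenter
        int aliveReplicas = 1;  // two of the three local nodes are down

        System.out.println("replicas required = " + quorum(localRf));
        System.out.println("LOCAL_QUORUM reachable = "
                + localQuorumReachable(localRf, aliveReplicas));
    }
}
```

Under that assumption, a single surviving node in the datacenter cannot reach the required number of local replicas, whichever node acts as coordinator there.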
We know about the onWriteTimeout method, but we do not want to lower the consistency level. Is it therefore possible to switch the coordinator when LOCAL_QUORUM cannot be reached? (i.e., when the coordinator is in Datacenter II and the write is not possible, a retry would switch the coordinator to an available node in Datacenter I.)
We found the class DCAwareRoundRobinPolicy, but I'm not sure how it really works or whether it fits our needs. Does anyone know how the hosts in the remote datacenter are chosen? Where is the list of those hosts set?
Regards,