2
votes

I would like to understand how to choose a load balancing policy for a heavy batch workload in Cassandra using the DataStax Java driver. I have two datacenters, and I would like to write to the cluster at consistency level ONE as quickly and as reliably as possible.

How do I go about choosing among the load balancing options? I see TokenAwarePolicy, LatencyAwarePolicy and DCAwareRoundRobinPolicy. Can I just use all of them?

Thanks Srivatsan


2 Answers

1
votes

If you are willing to use consistency level ONE, you do not care which data centre handles the write, so there is no need to use DCAwareRoundRobinPolicy. To make writes as quick as possible you want to minimise latency, so you ought to use LatencyAwarePolicy; in practice it will normally select a node in the local data centre, but it will use a remote node when that is likely to perform better, such as when a local node is overloaded. You also want to minimise the number of network hops, which means the coordinator for a write should be one of the replicas that will store the data, so you should use TokenAwarePolicy. You can chain policies together by passing one to the constructor of another.

Unfortunately, the driver does not provide any directly useful base policy to serve as the child policy of LatencyAwarePolicy or TokenAwarePolicy; the choices are DCAwareRoundRobinPolicy, RoundRobinPolicy and WhiteListPolicy. However, if you use RoundRobinPolicy as the child policy, LatencyAwarePolicy should acquire the latency information it needs after the first few queries.
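
To make the chaining idea concrete, here is a small self-contained sketch. It does not use the driver at all: `Policy`, `RoundRobin`, `TokenAware` and `LatencyAware` are hypothetical stand-ins written for this illustration, and the host names, replica map and latency figures are made up. It only models the general idea that each wrapper reorders the plan produced by its child (replicas first, then unusually slow hosts pushed to the end); the real policies are more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PolicyChainSketch {

    /** Hypothetical stand-in for a policy: returns hosts in the order they should be tried. */
    interface Policy {
        List<String> queryPlan(String partitionKey);
    }

    /** Base policy: starts from a different host on every call. */
    static final class RoundRobin implements Policy {
        private final List<String> hosts;
        private int next = 0;

        RoundRobin(List<String> hosts) { this.hosts = new ArrayList<>(hosts); }

        public List<String> queryPlan(String partitionKey) {
            List<String> plan = new ArrayList<>();
            for (int i = 0; i < hosts.size(); i++) {
                plan.add(hosts.get((next + i) % hosts.size()));
            }
            next = (next + 1) % hosts.size();
            return plan;
        }
    }

    /** Wrapper: puts the replicas for the key first, then the rest of the child's plan. */
    static final class TokenAware implements Policy {
        private final Policy child;
        private final Map<String, List<String>> replicasByKey;

        TokenAware(Policy child, Map<String, List<String>> replicasByKey) {
            this.child = child;
            this.replicasByKey = replicasByKey;
        }

        public List<String> queryPlan(String partitionKey) {
            List<String> childPlan = child.queryPlan(partitionKey);
            List<String> replicas =
                replicasByKey.getOrDefault(partitionKey, Collections.<String>emptyList());
            List<String> plan = new ArrayList<>();
            for (String h : childPlan) if (replicas.contains(h)) plan.add(h);
            for (String h : childPlan) if (!replicas.contains(h)) plan.add(h);
            return plan;
        }
    }

    /** Wrapper: moves hosts slower than threshold x the best known latency to the end. */
    static final class LatencyAware implements Policy {
        private final Policy child;
        private final Map<String, Double> avgLatencyMs;
        private final double exclusionThreshold;

        LatencyAware(Policy child, Map<String, Double> avgLatencyMs, double exclusionThreshold) {
            this.child = child;
            this.avgLatencyMs = avgLatencyMs;
            this.exclusionThreshold = exclusionThreshold;
        }

        public List<String> queryPlan(String partitionKey) {
            List<String> childPlan = child.queryPlan(partitionKey);
            double best = Double.MAX_VALUE;
            for (String h : childPlan) {
                Double l = avgLatencyMs.get(h);
                if (l != null) best = Math.min(best, l);
            }
            List<String> fast = new ArrayList<>();
            List<String> slow = new ArrayList<>();
            for (String h : childPlan) {
                Double l = avgLatencyMs.get(h);
                if (l != null && l > exclusionThreshold * best) slow.add(h); else fast.add(h);
            }
            fast.addAll(slow); // slow hosts stay in the plan, but only as a last resort
            return fast;
        }
    }

    /** Chains the three sketches by passing each one to the constructor of the next. */
    static List<String> demo() {
        Map<String, List<String>> replicas = new HashMap<>();
        replicas.put("k1", Arrays.asList("c", "d")); // made-up replica placement
        Map<String, Double> latency = new HashMap<>();
        latency.put("a", 1.0);
        latency.put("b", 1.2);
        latency.put("c", 9.0); // "c" is a replica but currently slow
        latency.put("d", 1.1);
        Policy base = new RoundRobin(Arrays.asList("a", "b", "c", "d"));
        Policy policy = new LatencyAware(new TokenAware(base, replicas), latency, 2.0);
        return policy.queryPlan("k1");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [d, a, b, c]
    }
}
```

In this toy run the healthy replica `d` is tried first, and the overloaded replica `c` is demoted behind the non-replica hosts, which is the kind of trade-off described above.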

3
votes

The default LoadBalancingPolicy in the java-driver should be perfect for this scenario. The default LoadBalancingPolicy is defined as (from Policies):

public static LoadBalancingPolicy defaultLoadBalancingPolicy() {
    return new TokenAwarePolicy(new DCAwareRoundRobinPolicy());
}

This keeps all requests local to the datacenter of the contact points you provide, and directs each request to the replicas that hold the data you are reading or inserting, using round robin to balance among them.

You can nest LoadBalancingPolicies, so if you would like to use all three of these policies you can simply do:

LoadBalancingPolicy policy = LatencyAwarePolicy
  .builder(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
  .build();
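
For completeness, a configuration sketch of how such a policy might be handed to the driver together with a default consistency level of ONE. The contact point `127.0.0.1` and the class name `ClusterSetupSketch` are placeholders, and it assumes a driver version in which the no-argument `DCAwareRoundRobinPolicy` constructor shown above is still available (newer versions construct it through a builder instead), so check it against the version you use:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.LatencyAwarePolicy;
import com.datastax.driver.core.policies.LoadBalancingPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class ClusterSetupSketch {
    public static void main(String[] args) {
        // Same nesting as above: latency-aware around token-aware around DC-aware.
        LoadBalancingPolicy policy = LatencyAwarePolicy
            .builder(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
            .build();

        Cluster cluster = Cluster.builder()
            .addContactPoint("127.0.0.1") // placeholder: a node in your local datacenter
            .withLoadBalancingPolicy(policy)
            // Default consistency level for statements that do not set their own.
            .withQueryOptions(new QueryOptions().setConsistencyLevel(ConsistencyLevel.ONE))
            .build();

        cluster.close();
    }
}
```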