2
votes

I have a two node setup in Azure and I am trying to get failover working when connecting with the C# driver. My nodes seem to be communicating fine when working with cqlsh and within OpsCenter.

// Contact point: the public IP of the first node
var contact = "publicipforfirstnode";
_cluster = Cassandra.Cluster.Builder().AddContactPoint(contact).Build();
// "demo" is the keyspace
_session = _cluster.Connect("demo");

I initially connect with the public IP of the first node, and this works fine. In the Cassandra configuration, however, I use the internal IPs assigned by my virtual network (10.1.0.4, 10.1.0.5, etc.). I set them as the listen_address and broadcast_rpc_address for each node. Even though the configuration uses internal IPs, I can still connect through the public IP, because a firewall rule allows connections to the public IP from one specific machine. To avoid firewall rules for inter-node communication, I put the nodes on the same virtual network, so no extra work is required there.

This seems great until my first node goes down. The driver then tries the second node using its internal IP.

I get an error: All Hosts tried for query (Public IP of First Node), (Internal IP of Second Node)

Since I am connecting from a machine outside the virtual network, it can't reach this internal IP. My application won't be in the internal network either, so this is a real problem.

Not using internal IPs would force me to set up authentication and/or special firewall rules, which I'd rather avoid. Is there any way to make the C# driver use public IPs while the nodes keep communicating with each other on internal IPs? Using internal IPs seems to be the recommended best practice unless you have multiple regions.


2 Answers

2
votes

The IP configured as broadcast_rpc_address in each node's cassandra.yaml file is the address the drivers use to connect to that node.

In your case, if you want the driver to connect using the public IP addresses, you should set broadcast_rpc_address to the public IP address of each node.

You can enable tracing in the driver to see what is happening under the hood:

// Requires: using System.Diagnostics;
// Specify the minimum trace level you want to see
Cassandra.Diagnostics.CassandraTraceSwitch.Level = TraceLevel.Info;
// Add a standard .NET trace listener that writes to the console
Trace.Listeners.Add(new ConsoleTraceListener());

From the docs:

  • listen_address: The IP address or hostname that Cassandra binds to for connecting to other Cassandra nodes.
  • broadcast_rpc_address: RPC address to broadcast to drivers and other Cassandra nodes. This cannot be set to 0.0.0.0. If blank, it is set to the value of the rpc_address or rpc_interface. If rpc_address or rpc_interface is set to 0.0.0.0, this property must be set.

1
vote

I think it is important to understand what broadcast_address and broadcast_rpc_address mean when your Cassandra cluster is behind a NAT device (like a firewall or gateway).

broadcast_address is the address that other nodes connect to. By default, this is the same as listen_address (usually you want this because the nodes are in the same network).

In the case where your cluster is across two networks and NAT takes place, you must set it to a value that nodes on both networks can access (like a public IP if you do multi-region deployment in AWS). This means that traffic within a network, as well as across networks, will go through the NAT device, because the internal IP is not reachable.

broadcast_rpc_address is the address that nodes "advertise" to client drivers for a given node, including when one node is asked about another.

E.g., node A has broadcast_address=10.0.0.100 and broadcast_rpc_address=52.2.3.100, node B has broadcast_address=10.0.0.101 and broadcast_rpc_address=52.2.3.101

What then happens is that node A will connect to node B on 10.0.0.101, but if a client driver asks A "hey, what other nodes are in your cluster?", then it will respond 52.2.3.101 for B.
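
As a rough sketch, node A's settings from the example above might look something like this in cassandra.yaml. The listen_address line is an assumption (the example only gives the two broadcast values), and everything else in the file is left out:

```yaml
# cassandra.yaml on node A (sketch, values from the example above)
listen_address: 10.0.0.100         # assumption: node binds to its internal IP
broadcast_address: 10.0.0.100      # other nodes (e.g. B) connect to A here
broadcast_rpc_address: 52.2.3.100  # what client drivers are told to use for A
```

Node B would be the same with 10.0.0.101 and 52.2.3.101.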

This design (introduced in Cassandra 2.0.10 I believe) makes it possible for clients outside the network to connect to any node in the cluster (not just the seed nodes).

One limitation is that you can't easily have clients both inside and outside the network. For that to work, the public IP has to be reachable from both inside and outside the network (for example, by changing firewall settings).

I hope this clarifies things a bit.

Addition

If you're keen, you can find what a node knows about other nodes with the following cqlsh command:

select * from system.peers;

The peer column is the broadcast_address of the node, and the rpc_address column is the broadcast_rpc_address of the node.