I am experimenting with DataStax OpsCenter 5.2 and Cassandra 2.1.7. One problem I ran into is that the OpsCenter daemon (i.e., the server) seems to connect to the Cassandra nodes using their broadcast_rpc_address, which is blocked by the security group (because broadcast_rpc_address is a public IP on AWS).
Details
The cluster has three nodes (10.0.0.0/24 is the subnet of a VPC on AWS, 52.x.x.x is a public IP)
Node0
cassandra.yaml: broadcast_address=10.0.0.100, rpc_address=10.0.0.100, broadcast_rpc_address=52.2.3.100
address.yaml: stomp_interface=10.0.0.99, local_interface=10.0.0.100, agent_rpc_broadcast_address=10.0.0.100
Node1
cassandra.yaml: broadcast_address=10.0.0.101, rpc_address=10.0.0.101, broadcast_rpc_address=52.2.3.101
address.yaml: stomp_interface=10.0.0.99, local_interface=10.0.0.101, agent_rpc_broadcast_address=10.0.0.101
Node2
cassandra.yaml: broadcast_address=10.0.0.102, rpc_address=10.0.0.102, broadcast_rpc_address=52.2.3.102
address.yaml: stomp_interface=10.0.0.99, local_interface=10.0.0.102, agent_rpc_broadcast_address=10.0.0.102
OpsCenter Node
Deployed in the same subnet
ip=10.0.0.99
Symptoms
After entering "10.0.0.100, 10.0.0.101, 10.0.0.102" in the "Add Cluster" window of the OpsCenter web console, I got the following in opscenterd.log:
2015-09-04 11:05:38+0000 [] INFO: New Cassandra host 52.2.3.100 discovered
2015-09-04 11:05:38+0000 [] INFO: New Cassandra host 52.2.3.101 discovered
...
2015-09-04 11:05:43+0000 [] WARN: [control connection] Error connecting to 52.2.3.100: errors=Timed out creating connection, last_host=None
2015-09-04 11:05:43+0000 [] ERROR: Control connection failed to connect, shutting down Cluster: ('Unable to connect to any servers', {'52.2.3.100': OperationTimedOut('errors=Timed out creating connection, last_host=None',)})
Notice that OpsCenter tries to connect to the nodes via their broadcast_rpc_address, which is blocked by the security group. This happens even though I have set agent_rpc_broadcast_address to the subnet IPs.
Question 1
Is this the correct behavior of OpsCenter? Why is agent_rpc_broadcast_address not used?
Question 2
If I change broadcast_rpc_address to the subnet IPs, OpsCenter connects fine. But this prevents my clients from connecting, because the seed nodes then report the non-seed nodes' subnet IPs to the client, and those IPs are not reachable from the client.
I can also open up the security group to the OpsCenter server, but this is risky and requires going through the gateway.
So how should I solve the problem in this case?
Thoughts
The core of this problem is how to "intelligently" decide which IP to connect to, depending on whether a client is inside or outside the subnet. None of the documentation I have seen makes it clear how this works.
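To make the idea concrete, here is a minimal sketch of the kind of decision I have in mind, using the addresses from the cluster description above. The names `translate`, `VPC_SUBNET`, and `PUBLIC_TO_PRIVATE` are hypothetical and only for illustration; they are not OpsCenter or Cassandra settings, and I do not know whether OpsCenter offers anything equivalent.

```python
import ipaddress

# Subnet and address pairs taken from the cluster description in this post.
VPC_SUBNET = ipaddress.ip_network("10.0.0.0/24")

# broadcast_rpc_address (public) -> rpc_address (private) for each node.
PUBLIC_TO_PRIVATE = {
    "52.2.3.100": "10.0.0.100",
    "52.2.3.101": "10.0.0.101",
    "52.2.3.102": "10.0.0.102",
}


def translate(node_address: str, client_ip: str) -> str:
    """Hypothetical helper: pick the address a given client should dial.

    Clients inside the VPC subnet get the private address; everyone
    else keeps the address the cluster advertised.
    """
    inside_subnet = ipaddress.ip_address(client_ip) in VPC_SUBNET
    if inside_subnet:
        # Fall back to the advertised address if no mapping is known.
        return PUBLIC_TO_PRIVATE.get(node_address, node_address)
    return node_address
```

With this logic, the OpsCenter node (10.0.0.99) would dial 10.0.0.100 when a node advertises 52.2.3.100, while an external client would keep using the public IP.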
Thanks for any help.
Addition 1
I would be grateful if you could also clarify how the rpc (Thrift) and native (binary) protocols are used by clients and by OpsCenter.
I have the impression that rpc is deprecated in favor of the native protocol, but will this affect inter-node and client-node connections?