I’m referring to the Couchbase Server in the application stack section of this document, which outlines the desired architecture of a Couchbase cluster.
I notice that each of the 5 Couchbase nodes in the diagram has a corresponding web server. I am also aware that Couchbase SDKs are designed to establish a connection to a single node and retain that connection for all requests, with the exception of failover events.
Firstly, I want to confirm that each of the 5 web servers in the diagram will establish a single connection to one of the 5 Couchbase nodes. I assume that a 1:1 relationship will result; each web server will connect to a corresponding Couchbase node, such that no 2 web servers will establish connections to the same Couchbase node.
If this is the case, then in the event of a Couchbase node failure, assuming that the node is unrecoverable, should I remove the corresponding web server? This may seem counterintuitive, but here is the situation as I understand it:
- Couchbase node #1 dies
- Web server #1 (connected to Couchbase node #1) establishes a connection to the next available node, Couchbase node #2 (most SDKs handle this, as far as I am aware)
- Couchbase node #2 now has 2 established connections: one from web server #2 (its corresponding server) and one from web server #1 (whose corresponding Couchbase node is dead)
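
To make the scenario above concrete, here is a minimal Go sketch of the connection layout I am assuming. It only models my understanding of the 1:1 mapping and the "move to the next node" step; it is not how any Couchbase SDK actually selects a node, and the function name `connectionsAfterFailure` is a hypothetical one I made up for illustration.

```go
package main

import "fmt"

// connectionsAfterFailure models the assumed layout: web server i starts
// connected to Couchbase node i (1:1). If node `failed` dies, its web server
// reconnects to the next node, wrapping around at the end.
// Assumption: nodes > 1, and "next available node" means index + 1.
func connectionsAfterFailure(nodes, failed int) map[int][]int {
	conns := make(map[int][]int) // node number -> web servers connected to it
	for ws := 1; ws <= nodes; ws++ {
		target := ws
		if target == failed {
			target = target%nodes + 1 // hypothetical failover rule
		}
		conns[target] = append(conns[target], ws)
	}
	return conns
}

func main() {
	conns := connectionsAfterFailure(5, 1)
	for node := 1; node <= 5; node++ {
		fmt.Printf("node %d: web servers %v\n", node, conns[node])
	}
}
```

Under these assumptions, node #2 ends up holding the connections from web servers #1 and #2, while every other surviving node still has exactly one.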
My concern is that I have noticed ephemeral port exhaustion issues with Couchbase Server when more than 1 connection is established to a single node. This generally results in client timeouts such as:
Get http://0.0.0.0:8091/pools: dial tcp 0.0.0.0:8091: operation timed out
Again, based on this, should I also remove the corresponding web server when a Couchbase node dies, to avoid multiple connections to the same Couchbase node, and potential ephemeral port exhaustion?