Problem: MongoDB writes fail with the error -
Timed out after 30000 ms while waiting for a server that matches PrimaryServerSelector. Client view of cluster state is {type=REPLICA_SET, servers=[{address=intdb01:27017, type=REPLICA_SET_SECONDARY, roundTripTime=0.7 ms, state=CONNECTED}]
This happens when the primary has switched from intdb01 to intdb02. Looks like the client driver is still looking for intdb01 to be primary node.
Our Setup
We have 3 mongoDb nodes in a replicaSet called rs0. When we connect to it using Java, we give all 3 servers in the connection string as follows: mongodb://intdb01:27017,intdb02:27017,intdb03:27017/?replicaSet=rs0
db.version --> 3.0.4
Java Driver Version: mongodb-driver-3.0.4.jar, mongodb-driver-core-3.0.4.jar
Client Connection code:
if( mongoClient == null) {
MongoClientURI mcu = new MongoClientURI(mongoConnect);
mongoClient = new MongoClient(mcu);
}
mongoConnect contains the connection string shown above.
Mongo Replicaset Status Info
> rs.status()
{
"set" : "rs0",
"date" : ISODate("2016-07-19T21:14:03.001Z"),
"myState" : 1,
"members" : [{
"_id" : 0,
"name" : "intdb01",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 439786,
"optime" : Timestamp(1468521291, 5),
"optimeDate" : ISODate("2016-07-14T18:34:51Z"),
"lastHeartbeat" : ISODate("2016-07-19T21:14:01.877Z"),
"lastHeartbeatRecv" : ISODate("2016-07-19T21:14:01.611Z"),
"pingMs" : 0,
"configVersion" : 4
},
{
"_id" : 1,
"name" : "intdb02:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 2948844,
"optime" : Timestamp(1468521291, 5),
"optimeDate" : ISODate("2016-07-14T18:34:51Z"),
"electionTime" : Timestamp(1468523057, 1),
"electionDate" : ISODate("2016-07-14T19:04:17Z"),
"configVersion" : 4,
"self" : true
},
{
"_id" : 2,
"name" : "intdb03:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 439779,
"optime" : Timestamp(1468521291, 5),
"optimeDate" : ISODate("2016-07-14T18:34:51Z"),
"lastHeartbeat" : ISODate("2016-07-19T21:14:01.294Z"),
"lastHeartbeatRecv" : ISODate("2016-07-19T21:14:01.294Z"),
"pingMs" : 0,
"configVersion" : 4
}
],
"ok" : 1
}
I have checked and rechecked the code/config with the documentation but cannot find what is amiss. We can simulate this by stepping down the primary. The Secondary takes over but the app writes start failing. :-(
Any pointers/tips much appreciated!
Adding Stack trace-
Caused by: com.mongodb.MongoTimeoutException: Timed out after 30000 ms while waiting for a server that matches PrimaryServerSelector. Client view of cluster state is {type=REPLICA_SET, servers=[{address=intdb01:27017, type=REPLICA_SET_SECONDARY, roundTripTime=0.7 ms, state=CONNECTED}]
at com.mongodb.connection.BaseCluster.createTimeoutException(BaseCluster.java:370)
at com.mongodb.connection.BaseCluster.selectServer(BaseCluster.java:101)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:75)
at com.mongodb.binding.ClusterBinding$ClusterBindingConnectionSource.<init>(ClusterBinding.java:71)
at com.mongodb.binding.ClusterBinding.getWriteConnectionSource(ClusterBinding.java:68)
at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:175)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:141)
at com.mongodb.operation.MixedBulkWriteOperation.execute(MixedBulkWriteOperation.java:72)
at com.mongodb.Mongo.execute(Mongo.java:747)
at com.mongodb.Mongo$2.execute(Mongo.java:730)
at com.mongodb.MongoCollectionImpl.executeSingleWriteRequest(MongoCollectionImpl.java:482)
at com.mongodb.MongoCollectionImpl.update(MongoCollectionImpl.java:474)
at com.mongodb.MongoCollectionImpl.updateOne(MongoCollectionImpl.java:325)
at com.grid.core.persistence.mongodb.dao.GridEventDao.save(GridEventDao.java:69)
... 6 more
2016-07-19 17:05:01,882 [pool-2-thread-2] INFO org.mongodb.driver.cluster - No server chosen by PrimaryServerSelector from cluster description ClusterDescription{type=REPLICA_SET, connectionMode=SINGLE, all=[ServerDescription{address=intdb01:27017, type=REPLICA_SET_SECONDARY, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 0, 4]}, minWireVersion=0, maxWireVersion=3, electionId=null, maxDocumentSize=16777216, roundTripTimeNanos=724681, setName='rs0', canonicalAddress=intdb01:27017, hosts=[intdb01:27017, intdb02:27017], passives=[intdb03:27017], arbiters=[], primary='intdb02:27017', tagSet=TagSet{[]}}]}. Waiting for 30000 ms before timing out
I am trying to figure out if it is a coding error or configuration. The last info Message in the stack trace, says connectionMode is SINGLE. Have a suspicion that it is causing this issue but I cannot find any info about it in the developer documentation site.