I want to understand the query flow and how load balancing works with LBHttpSolrServer. We have set up SolrCloud with one collection; the collection has 4 shards, and each shard has two nodes, i.e. one master and one replica.
I have configured LBHttpSolrServer as below.
SolrServer lbHttpSolrServer = new LBHttpSolrServer("http://shard1_master:8080/solr/", "http://shard2_master:8080/solr/", "http://shard3_master:8080/solr/", "http://shard4_master:8080/solr/", "http://shard1_replica:8080/solr/", "http://shard2_replica:8080/solr/", "http://shard3_replica:8080/solr/", "http://shard4_replica:8080/solr/");
From my understanding, Solr and SolrJ work as follows:
- LBHttpSolrServer keeps pinging the above list of servers and maintains a list of live servers.
- Every time a query arrives, it picks one server from the list (in round-robin fashion).
- It sends the query to the selected server.
- When the query arrives at a Solr node, that node internally distributes the query to the remaining shards, collects, merges, and ranks the results, and sends the response back to the user.
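To make the client-side part of this mental model concrete, here is a minimal sketch of the round-robin selection as I imagine it. The class `RoundRobinSketch` and its methods `pickNext` and `markDead` are hypothetical names I made up for illustration; this is not the actual SolrJ implementation, and the real health checking may well work differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of my understanding; NOT the SolrJ code.
class RoundRobinSketch {
    // Servers currently believed to be alive.
    private final List<String> liveServers;
    // Monotonic counter used to rotate through the live list.
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSketch(String... urls) {
        this.liveServers = new ArrayList<>(Arrays.asList(urls));
    }

    // Assumed to be called when a ping or a request to this server fails.
    synchronized void markDead(String url) {
        liveServers.remove(url);
    }

    // Pick the next live server in round-robin order.
    synchronized String pickNext() {
        if (liveServers.isEmpty()) {
            throw new IllegalStateException("no live servers");
        }
        int index = Math.floorMod(counter.getAndIncrement(), liveServers.size());
        return liveServers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSketch lb = new RoundRobinSketch(
                "http://shard1_master:8080/solr/",
                "http://shard1_replica:8080/solr/");
        // Each query would be sent to whichever server pickNext() returns.
        for (int i = 0; i < 4; i++) {
            System.out.println(lb.pickNext());
        }
    }
}
```

In this sketch the client only chooses which node receives the query; whatever happens after that (the distribution across shards) would be done by the Solr node itself.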
My confusion is about point number 4 (the last point above): is my understanding correct? If not, please correct me. Also, do I need to pass all 8 nodes to LBHttpSolrServer, or will just 4 be sufficient?