Our architecture is SolrCloud 4.4 with one collection, several shards, and replicas.
Lately, when indexing some documents, we have received the following exception:
org.apache.solr.common.SolrException: No active slice servicing hash code 7b50d0a2 in DocCollection(collection1)={
  "shards":{
    "shard1":{
      "range":"80000000-d554ffff",
      "state":"active",
      "replicas":{
        "core_node1":{
          "state":"active",
          "core":"collection1",
          "node_name":"XX.XXX.XXX.131:8983_solr",
          "base_url":"http://XX.XXX.XXX.131:8983/solr",
          "leader":"true"},
        "core_node7":{
          "state":"active",
          "core":"collection1",
          "node_name":"XX.XXX.XXX.131:9983_solr",
          "base_url":"http://XX.XXX.XXX.131:9983/solr"}}},
    "shard2":{
      "range":"d5550000-2aa9ffff",
      "state":"active",
      "replicas":{
        "core_node5":{
          "state":"active",
          "core":"collection1",
          "node_name":"XX.XXX.XXX.133:8983_solr",
          "base_url":"http://XX.XXX.XXX.133:8983/solr"},
        "core_node8":{
          "state":"active",
          "core":"collection1",
          "node_name":"XX.XXX.XXX.132:8983_solr",
          "base_url":"http://XX.XXX.XXX.132:8983/solr",
          "leader":"true"}}},
    "shard3":{
      "range":null,
      "state":"active",
      "replicas":{
        "core_node6":{
          "state":"active",
          "core":"collection1",
          "node_name":"XX.XXX.XXX.133:9983_solr",
          "base_url":"http://XX.XXX.XXX.133:9983/solr"},
        "core_node9":{
          "state":"active",
          "core":"collection1",
          "node_name":"XX.XXX.XXX.132:9983_solr",
          "base_url":"http://XX.XXX.XXX.132:9983/solr",
          "leader":"true"}}}},
  "router":"compositeId"}
From reading about Solr and ZooKeeper, my understanding is that ZooKeeper was trying to index a document on a shard that was in a faulty state, and therefore the request failed. Is that correct? When I look at the cluster status in the web browser, all the shards are online and in a valid state.
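
To narrow this down, here is a small check of the hash code from the exception against the shard ranges in the cluster state pasted above. It is only a sketch, and it assumes that the ranges are inclusive and that the hex values are compared as signed 32-bit integers; the function names (to_signed32, covering_shards) are made up for this post and are not part of any Solr API. The "range":null entry for shard3 is copied as-is from the cluster state and is treated as covering nothing.

```python
def to_signed32(hex_str):
    """Interpret an 8-digit hex string as a signed 32-bit integer (assumption)."""
    value = int(hex_str, 16)
    return value - (1 << 32) if value >= (1 << 31) else value


def covering_shards(hash_hex, shard_ranges):
    """Return the names of the shards whose inclusive range contains the hash."""
    h = to_signed32(hash_hex)
    hits = []
    for name, rng in shard_ranges.items():
        if rng is None:
            # A shard without a range cannot match any hash.
            continue
        lo_hex, hi_hex = rng.split("-")
        if to_signed32(lo_hex) <= h <= to_signed32(hi_hex):
            hits.append(name)
    return hits


# Ranges copied from the cluster state in the exception message.
ranges = {
    "shard1": "80000000-d554ffff",
    "shard2": "d5550000-2aa9ffff",
    "shard3": None,
}

print(covering_shards("7b50d0a2", ranges))  # prints []
```

Under those assumptions, neither shard1 nor shard2 covers hash 7b50d0a2, and shard3 has no range at all, even though every shard reports "state":"active".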