I am running Solr 7.6 with nine replicas and one shard.
When we run our full indexing job, a few of our nodes go into recovery mode and stay stuck in the recovery state indefinitely.
We have a total of 90k parent docs, and each parent doc has 300 children.
parent doc size: 15kB
child doc size: 500B
total time of full indexing: 36-39 mins
batch size: max 1000 parent docs (each including 300 children) = 1000 * 300 child docs
number of threads used for full indexing: 10
average total docs indexed/second: 2400 parent docs * 300 children
commit settings:
autoSoftCommit maxTime: 30s
autoCommit maxTime: 1min
numRecordsToKeep: 100
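To make the numbers above concrete, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes kB means 1000 bytes and that the 15kB parent size does not include its children, and the function names are made up for illustration.

```python
# Figures copied from the list above; kB is assumed to mean 1000 bytes.
PARENT_DOCS = 90_000
CHILDREN_PER_PARENT = 300
PARENT_BYTES = 15_000
CHILD_BYTES = 500
BATCH_PARENTS = 1_000
THREADS = 10


def block_bytes() -> int:
    """Approximate size of one parent together with its 300 children."""
    return PARENT_BYTES + CHILDREN_PER_PARENT * CHILD_BYTES


def batch_bytes() -> int:
    """Approximate payload of one full batch of 1000 parent blocks."""
    return BATCH_PARENTS * block_bytes()


def parents_per_second(run_minutes: float) -> float:
    """Parent docs per second implied by the total run time."""
    return PARENT_DOCS / (run_minutes * 60)


if __name__ == "__main__":
    print(f"one parent block: {block_bytes() / 1e3:.0f} kB")
    print(f"one batch:        {batch_bytes() / 1e6:.0f} MB")
    print(f"all threads:      {THREADS * batch_bytes() / 1e9:.2f} GB in flight at most")
    for minutes in (36, 39):
        rate = parents_per_second(minutes)
        docs = rate * (1 + CHILDREN_PER_PARENT)
        print(f"{minutes} min run: {rate:.1f} parents/s, {docs:.0f} docs/s")
```

Under these assumptions a single batch is roughly 165 MB of raw document data, and the 36-39 minute run time works out to roughly 38-42 parent docs per second (about 2,300-2,500 per minute).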
Each of the ten threads fetches data from Cassandra and creates the documents for indexing. Once a thread has 1000 parent docs (with 300 children each) ready in its buffer list, it pushes the data to Solr using the update API.
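The per-thread flow described above looks roughly like the following. This is a simplified Python sketch rather than the actual job: the URL, the collection name, and all function names are placeholders, and the Cassandra fetch is left out. It posts JSON to the update handler and nests children under the `_childDocuments_` key.

```python
import json
import urllib.request
from typing import Callable, Iterable, Iterator, List

# Placeholder endpoint: host and collection name are assumptions.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def make_block(parent: dict, children: List[dict]) -> dict:
    """Return a copy of the parent with its children nested under _childDocuments_."""
    block = dict(parent)
    block["_childDocuments_"] = children
    return block


def batched(blocks: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Buffer parent blocks and yield the buffer each time it reaches batch_size."""
    buffer: List[dict] = []
    for block in blocks:
        buffer.append(block)
        if len(buffer) >= batch_size:
            yield buffer
            buffer = []
    if buffer:
        # Flush whatever is left at the end of the run.
        yield buffer


def post_to_solr(batch: List[dict]) -> None:
    """POST one batch as a JSON array; no explicit commit, autoCommit handles it."""
    request = urllib.request.Request(
        SOLR_UPDATE_URL,
        data=json.dumps(batch).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        response.read()


def index_all(
    blocks: Iterable[dict],
    batch_size: int = 1000,
    send: Callable[[List[dict]], None] = post_to_solr,
) -> int:
    """Send every block in batches and return the number of batches sent."""
    sent = 0
    for batch in batched(blocks, batch_size):
        send(batch)
        sent += 1
    return sent
```

Keeping `batch_size` as a parameter makes it easy to try smaller batches (for example 100 parent docs) without touching the rest of the loop, and the injectable `send` function allows a dry run without a live Solr node.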
With the above settings, 2-3 nodes go into the recovery state whenever I run the full indexing job.
I have a few questions:
- How many records per second can I index on a single-shard Solr cluster with my document sizes?
- Do I need to reduce the number of threads, or the batch size?