This week I had an issue with a Solr index: http://lucene.472066.n3.nabble.com/corrupted-index-in-slave-td4054769.html
Today that error started to happen constantly, for almost every request, and I created a JIRA issue because I thought it was a bug: https://issues.apache.org/jira/browse/SOLR-4707
As you can read there, in the end it was caused by a failure in the Solr master-slave replication. Now I am wondering whether we should migrate to SolrCloud, since master-slave replication does not seem to fit our requirements:
- index size: ~20 million documents, ~9GB
- ~1200 updates/min
- ~10000 queries/min (distributed over 2 slaves), using MoreLikeThis, RealTimeGet, TermVectorComponent, SearchHandler
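For a rough sense of scale, the per-minute rates above can be converted to per-second figures. This is only a back-of-the-envelope sketch: it assumes the queries are split evenly between the 2 slaves, which may not match the real load distribution, and the helper name `per_second` is just illustrative.

```python
def per_second(per_minute: float) -> float:
    """Convert a per-minute rate to a per-second rate."""
    return per_minute / 60.0


UPDATES_PER_MIN = 1200    # approximate figure from the list above
QUERIES_PER_MIN = 10000   # approximate figure from the list above
SLAVES = 2                # assumption: queries are spread evenly over both slaves

updates_per_sec = per_second(UPDATES_PER_MIN)
queries_per_sec_per_slave = per_second(QUERIES_PER_MIN) / SLAVES

print(f"updates/s on the master: {updates_per_sec:.1f}")
print(f"queries/s per slave:     {queries_per_sec_per_slave:.1f}")
```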
I would be grateful if anyone could help me answer these questions:
- Would it be advisable to migrate to SolrCloud? Would it have an impact on replication performance?
- In that case, which would perform better: maintaining a full copy of the index on every server, or using shard servers?
- How many shards and replicas would you advise to ensure high availability?
Kind Regards,
Victor