
I am working on improving our reindexing process. We have custom logic that figures out which documents have been modified and need to be reindexed, so in the end I can generate a delete query along the lines of: delete all documents where fieldId is in a given list.

That way, instead of deleting and re-adding 50k documents every time, we only reindex a tiny percentage of them.

Now I am thinking about an edge case where our list of fieldIds is extremely large, say 30,000-40,000 ids. In that case, is there an upper limit on request length that I should worry about? Or would such a request hurt performance and make the situation worse instead of better? I have read some articles found through Google that advise making it a POST request instead.
I am using the latest SolrNet build, which is built against Solr 4.0.

1
Yes, use a POST request. - Mauricio Scheffer
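
To illustrate the POST suggestion, here is a minimal sketch in Python (not SolrNet) of splitting a large id list into several delete-by-query bodies that are sent as POST requests. The batch size of 500, the helper names, and the update URL are assumptions for illustration. The batching is there because a stock solrconfig.xml sets maxBooleanClauses to 1024, so a single query with 30,000-40,000 OR clauses would likely be rejected unless that setting is raised.

```python
import json
from urllib import request


def chunked(ids, size):
    """Yield consecutive slices of `ids` with at most `size` items each."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def quote_term(value):
    """Wrap a value in double quotes, escaping backslashes and quotes."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_delete_bodies(field, ids, batch_size=500):
    """Return one JSON delete-by-query body per batch of ids.

    batch_size=500 is an assumed value chosen to stay below the
    default maxBooleanClauses limit of 1024.
    """
    bodies = []
    for batch in chunked(list(ids), batch_size):
        terms = " OR ".join(quote_term(i) for i in batch)
        bodies.append(json.dumps({"delete": {"query": f"{field}:({terms})"}}))
    return bodies


def post_update(solr_update_url, body):
    """POST one body to Solr; the URL is a placeholder you must supply."""
    req = request.Request(
        solr_update_url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return resp.status
```

Each body returned by `build_delete_bodies("fieldId", ids)` would be passed to `post_update` in turn, followed by a single commit once all batches have been sent.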

1 Answer


I would revisit that logic, because deleting the documents and then reindexing them is not the best solution. First, it is an expensive operation. Second, your index will be empty or incomplete until the documents have been reindexed, so a query that runs in the middle of the operation could return zero or partial results.

I would advise simply indexing again with the same document id (the uniqueKey defined in Solr's schema.xml). Solr is smart enough to overwrite a document that is indexed with the same id, so you don't have to deal with deleting old documents. You might also run 'Optimize' on the index from time to time to physically get rid of 'deleted' documents.
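
As a rough sketch of the overwrite approach, the Python snippet below builds a JSON update body containing only the modified documents. The `build_update_body` helper and the `id` field name are assumptions for illustration; use whatever field your schema.xml declares as the uniqueKey.

```python
import json


def build_update_body(docs, unique_key="id"):
    """Return a JSON array of documents for Solr's JSON update handler.

    Every document must carry the uniqueKey field; without it Solr
    could not match the new version to the existing document.
    """
    docs = list(docs)
    for doc in docs:
        if unique_key not in doc:
            raise ValueError(f"document is missing the '{unique_key}' field: {doc!r}")
    return json.dumps(docs)


# Only the changed documents are sent; documents with the same id
# replace the versions already in the index.
changed = [
    {"id": "42", "title": "Updated title"},
    {"id": "43", "title": "Another updated title"},
]
body = build_update_body(changed)
```

The resulting body would be sent as a POST request with Content-Type application/json to the core's update handler, followed by a commit. No delete request is needed for documents that are only being modified.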