I have a question about the size of our Solr data backups. We take a Solr backup once a day. We observed that the backup was about 1 GB smaller than the previous day's, even though no deletions or updates had been made on Solr that day. We also checked the number of documents for both days, and the smaller backup actually contained more documents. Is this because of some optimization that Solr performs internally?
1 Answer
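Deleted documents (and remember that an update is internally a delete plus an add) are only marked as deleted; they are not physically removed until optimize
is called on the index or a segment merge is triggered automatically, for example when the number of segments reaches the mergeFactor. A merge rewrites the affected index files to disk, and any deleted content is expunged in the process.
After the index files have been rewritten, the old files are removed, and the new index files do not contain the old, deleted documents. So adding documents can trigger a merge that drops documents deleted on earlier days, which would leave you with more documents in a smaller index.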
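
To make the effect concrete, here is a minimal sketch in Python. It is a toy model, not Solr or Lucene code: the `Segment` class, the `merge` function, and all sizes and counts are hypothetical and exist only to illustrate how a merge can shrink the index while the live document count grows.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """Toy stand-in for an index segment (hypothetical, not a Solr class)."""
    docs: dict = field(default_factory=dict)    # doc id -> size in bytes
    deleted: set = field(default_factory=set)   # ids marked as deleted

    def size_on_disk(self):
        # Deleted documents still occupy space until the segment is rewritten.
        return sum(self.docs.values())

    def live_count(self):
        return len(self.docs) - len(self.deleted)


def merge(segments):
    """Rewrite several segments into one, dropping documents marked deleted."""
    merged = Segment()
    for seg in segments:
        for doc_id, size in seg.docs.items():
            if doc_id not in seg.deleted:
                merged.docs[doc_id] = size
    return merged


# Day 1: 100 documents of 10 bytes each; 40 were marked deleted on earlier days.
old = Segment(
    docs={f"a{i}": 10 for i in range(100)},
    deleted={f"a{i}" for i in range(40)},
)
day1_docs, day1_size = old.live_count(), old.size_on_disk()

# Day 2: 20 new documents are added, and the additions trigger a merge.
new = Segment(docs={f"b{i}": 10 for i in range(20)})
merged = merge([old, new])
day2_docs, day2_size = merged.live_count(), merged.size_on_disk()

print(day1_docs, day1_size)  # 60 live documents, 1000 bytes
print(day2_docs, day2_size)  # 80 live documents, 800 bytes
```

In this toy run the second day has more live documents but a smaller footprint, which matches what you saw in your backups, assuming a merge happened between the two backup runs.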