We have a Solr core that has about 250 TrieIntFields (declared as dynamicField). There are about 14M docs in our Solr index, and many documents have a value in many of these fields. We need to be able to sort on all 250 of these fields over a period of time.
The issue we are facing is that the underlying Lucene fieldCache fills up very quickly. We have a 4 GB box and the index size is 18 GB. After sorting on 40 or 45 of these dynamic fields, memory consumption is about 90% and we start getting OutOfMemory errors.
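A rough back-of-envelope estimate of why this happens, assuming the fieldCache holds one 4-byte int per document for every field that has been sorted on (that per-entry size is an assumption on our part, not something we measured; the function name is just for illustration):

```python
# Back-of-envelope fieldCache estimate.
# ASSUMPTION: each sorted int field costs one 4-byte entry per document,
# regardless of how many distinct values the field holds.
NUM_DOCS = 14_000_000   # documents in the index (from the question)
BYTES_PER_INT = 4       # assumed size of one cached value


def field_cache_bytes(num_fields, num_docs=NUM_DOCS, bytes_per_value=BYTES_PER_INT):
    """Estimated bytes cached after sorting on `num_fields` distinct fields."""
    return num_fields * num_docs * bytes_per_value


def to_gib(num_bytes):
    """Convert a byte count to GiB."""
    return num_bytes / 1024 ** 3


if __name__ == "__main__":
    for fields in (1, 45, 250):
        size = field_cache_bytes(fields)
        print(f"{fields:>3} fields: {size:,} bytes ({to_gib(size):.2f} GiB)")
```

Under that assumption, 45 fields already account for roughly 2.5 GB, which is in line with what we observe on the 4 GB box, and all 250 fields would need about 14 GB.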
For now, we have a cron job that runs every minute and restarts Tomcat if total memory consumption is above 80%.
From what I have read, I understand that restricting the number of distinct values in sortable Solr fields will reduce the fieldCache space. The values in these sortable fields can be any integer from 0 to 33000 and are quite widely distributed. We have a few scaling solutions in mind, but what is the best way to handle this whole issue?
UPDATE: We thought that if we boosted instead of sorting, the query would not touch the fieldCache. So instead of issuing a query like
select?q=name:alba&sort=relevance_11 desc
we tried
select?q={!boost b=relevance_11}name:alba
but unfortunately boosting also populates the field cache :(