I'm designing a multi-tenant SaaS application where tenants will be able to store data and search it. I plan to use Lucene (actually, Lucene.Net) as the search engine. As cross-tenant searches are not required, I am considering having one index (and therefore one directory) per tenant.
I don't expect index writes to be very frequent, so they will be queued to a single process that opens the index, adds the document and closes the index as each update arrives.
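To make the write path concrete, here is a minimal sketch of the single-writer queue I have in mind. It is written in Java purely for illustration and does not call any Lucene API: `TenantWriteQueue` and the `indexer` callback are hypothetical names, and the callback stands in for the "open the index, add the doc, close the index" step.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.BiConsumer;

/** Serializes all index updates through one worker thread, in arrival order. */
final class TenantWriteQueue implements AutoCloseable {
    // A single-threaded executor is both the queue and the only writer.
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Hypothetical callback (tenantId, document): open that tenant's index,
    // add the document, close the index.
    private final BiConsumer<String, String> indexer;

    TenantWriteQueue(BiConsumer<String, String> indexer) {
        this.indexer = indexer;
    }

    /** Queues one update; the returned Future completes once it has been applied. */
    Future<?> enqueue(String tenantId, String document) {
        return worker.submit(() -> indexer.accept(tenantId, document));
    }

    /** Stops accepting updates and waits (up to a minute) for queued ones to finish. */
    @Override
    public void close() {
        worker.shutdown();
        try {
            worker.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because only one thread ever touches an index for writing, two updates for the same tenant can never overlap, at the cost of all tenants sharing one writer.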
I would like something more efficient for reads, though. The number of tenants may scale from hundreds to tens of thousands, so keeping all directories open in RAM on each search node is not sensible. I am thinking about keeping a shortlist of the most recently used (or maybe most frequently used) directories open, and periodically closing those that no longer qualify.
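A rough sketch of the "recently used" variant, to show what I mean. Again this is plain Java for illustration with no Lucene calls: `TenantReaderCache`, `maxOpen` and the `opener` function are hypothetical names, and `R` stands in for whatever per-tenant reader or searcher object would be kept open. This version evicts on insert rather than on a timer.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** LRU cache of open per-tenant resources; closes whatever it evicts. */
final class TenantReaderCache<R extends AutoCloseable> {
    private final int maxOpen;
    // Hypothetical: opens the index reader/searcher for a tenant id.
    private final Function<String, R> opener;
    private final LinkedHashMap<String, R> open;

    TenantReaderCache(int maxOpen, Function<String, R> opener) {
        this.maxOpen = maxOpen;
        this.opener = opener;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.open = new LinkedHashMap<String, R>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, R> eldest) {
                if (size() <= TenantReaderCache.this.maxOpen) {
                    return false;
                }
                closeQuietly(eldest.getValue()); // release the evicted tenant's resource
                return true;
            }
        };
    }

    /** Returns the tenant's open resource, opening it (and possibly evicting another) on a miss. */
    synchronized R get(String tenantId) {
        R resource = open.get(tenantId); // a hit also marks the entry as recently used
        if (resource == null) {
            resource = opener.apply(tenantId);
            open.put(tenantId, resource);
        }
        return resource;
    }

    synchronized boolean isOpen(String tenantId) {
        return open.containsKey(tenantId); // does not affect the usage order
    }

    synchronized int size() {
        return open.size();
    }

    private static void closeQuietly(AutoCloseable c) {
        try {
            c.close();
        } catch (Exception e) {
            // A real version would log this instead of swallowing it.
        }
    }
}
```

One thing this sketch ignores is that an evicted resource may still be in use by a search that is running, so a real version would need some way to delay the close until in-flight searches have finished.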
I'm really new to Lucene in general, so I would appreciate some feedback on this strategy.
Thanks