Brief overview of the setup:
5 x SolrCloud (Solr 4.6.1) node instances (separate machines).
The setup is intended to store the last 48 hours of webapp logs (which are pretty intense: ~3 MB/sec).
The "logs" collection has 5 shards (one per node instance).
One log line is one document in the "logs" collection.
If I keep storing log documents in this single "logs" collection, the cores on the shards get really big, and the CPU graphs show the instances spending more and more time waiting for disk I/O.
So my idea is to create a new collection every 15 minutes, named like "logs-201402051400", with its shards spread across the 5 instances. Document writers will start writing to the new collection as soon as it is created. After a while I will end up with a list of collections like this:
...
logs-201402051400
logs-201402051415
logs-201402051430
logs-201402051445
logs-201402051500
...
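To make the naming scheme concrete, here is a minimal sketch of how a writer could derive the current collection name and build a Collections API CREATE request for it. The base URL, the helper names, and replicationFactor=1 are my assumptions for illustration; the sketch only builds the URL and does not send it.

```python
from datetime import datetime
from urllib.parse import urlencode

WINDOW_MINUTES = 15  # one collection per 15-minute window


def collection_name(ts, prefix="logs"):
    """Floor ts to the start of its 15-minute window and format the name."""
    floored = ts.replace(minute=ts.minute - ts.minute % WINDOW_MINUTES,
                         second=0, microsecond=0)
    return f"{prefix}-{floored:%Y%m%d%H%M}"


def create_collection_url(name, base_url="http://localhost:8983/solr",
                          num_shards=5):
    """Build a Collections API CREATE URL (base_url is a placeholder)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,   # one shard per node instance
        "replicationFactor": 1,    # assumption: no extra replicas
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


name = collection_name(datetime(2014, 2, 5, 14, 7, 30))
print(name)
print(create_collection_url(name))
```

A log line stamped 14:07:30 on 2014-02-05 would go to "logs-201402051400" under this scheme.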
Since at any given time there will be up to 192 collections (~1000 cores) in the SolrCloud, it seems that search performance would degrade drastically.
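For reference, this is the arithmetic behind those numbers, as a quick sketch. It assumes one core per shard with no additional replicas, which is my reading of the setup described above.

```python
RETENTION_HOURS = 48        # how long logs are kept
WINDOW_MINUTES = 15         # one collection per window
SHARDS_PER_COLLECTION = 5   # one shard per node instance
NODES = 5

# Assumption: replicationFactor = 1, so each shard is exactly one core.
collections = RETENTION_HOURS * 60 // WINDOW_MINUTES
cores = collections * SHARDS_PER_COLLECTION
cores_per_node = cores // NODES

print(collections, cores, cores_per_node)
```

So the "~1000 cores" figure is 960 cores in total under these assumptions, or 192 cores on each machine.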
So I would like to merge the collections that are no longer being written to into one large collection (still sharded across the 5 instances). I have found information on how to merge cores, but how can I merge collections?