2
votes

When I created the collection with the following request, I set two shards for collection10.

/solr/admin/collections?action=CREATE&name=collection10&numShards=2&replicationFactor=2

My requirement is to add a 3rd shard dynamically after 10000 documents have been indexed in the first two shards.

Is it possible to add shards dynamically once the collection has been started and documents are being indexed into the existing shards? If so, how?

Also, is it possible to add replicas dynamically once the collection has been started? For example, I set replicationFactor=2, and later I need to start a new replica for the already running collection. Is that possible? If so, how?

3 Answers

9
votes

One solution to the problem is to use the "implicit router" when creating your Collection.

Solr does support adding new shards to your index (or deleting existing ones) whenever you want, via the "implicit router" configuration of the CREATE collection API.

Let's say you have to index all the "Audit Trail" data of your application into Solr. New data gets added every day. You would most probably want to shard by year.

You could do something like the below during the initial setup of your collection:

admin/collections?
action=CREATE&
name=AuditTrailIndex&
router.name=implicit&
shards=2010,2011,2012,2013,2014&
router.field=year

The above command: a) creates 5 shards, one for the current year and one for each of the previous 4 years (2010, 2011, 2012, 2013, 2014), and b) routes each document to the correct shard based on the value of its "year" field (specified as router.field).
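To make the request above easier to reuse, here is a minimal Python sketch that only builds the CREATE URL and does not send it. The base URL `http://localhost:8983/solr` and the helper name `create_collection_url` are assumptions for illustration, not something Solr ships.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def create_collection_url(name, shards, router_field, base=SOLR_BASE):
    """Build a Collections API CREATE URL that uses the implicit router.

    Hypothetical helper: it only assembles the query string shown above.
    """
    params = {
        "action": "CREATE",
        "name": name,
        "router.name": "implicit",
        "shards": ",".join(shards),
        "router.field": router_field,
    }
    # Keep commas readable in the shard list instead of encoding them as %2C.
    return f"{base}/admin/collections?{urlencode(params, safe=',')}"


url = create_collection_url(
    "AuditTrailIndex", ["2010", "2011", "2012", "2013", "2014"], "year"
)
print(url)
```

You can paste the printed URL into a browser or pass it to any HTTP client.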

In December 2014, you might add a new shard in preparation for 2015 using the CREATESHARD API (part of the Collections API), like this:

/admin/collections?
action=CREATESHARD&
shard=2015&
collection=AuditTrailIndex

The above command creates a new shard on the same collection.

Once it's 2015, all new data will automatically be indexed into the "2015" shard, assuming the "year" field of your documents is correctly set to 2015.
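As a mental model of the routing described here, the following Python sketch picks a shard by matching the document's "year" value against the shard names. It is a toy illustration, not Solr's actual code. The function name `pick_shard` is made up, and rejecting documents that have no matching shard is an assumption of this sketch.

```python
def pick_shard(doc, shard_names, router_field="year"):
    """Toy model: return the shard whose name equals the router field value."""
    if router_field not in doc:
        raise ValueError(f"document is missing the '{router_field}' field")
    # Shard names are strings, so compare against the string form of the value.
    target = str(doc[router_field])
    if target not in shard_names:
        raise ValueError(f"no shard named '{target}'")
    return target


shards = ["2010", "2011", "2012", "2013", "2014", "2015"]
print(pick_shard({"id": "evt-1", "year": 2015}, shards))
```

This is why the shard for a new year has to exist before documents for that year arrive.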

In 2015, if you decide you no longer need the 2010 shard (based on your data retention requirements), you can remove it with the DELETESHARD API:

/admin/collections?
action=DELETESHARD&
shard=2010&
collection=AuditTrailIndex
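The yearly add-and-drop routine described above could be scripted roughly as follows. This is a sketch under stated assumptions: the base URL, the helper names `shard_url` and `rollover_urls`, and the idea of a fixed retention window are all illustrative. It only builds the CREATESHARD and DELETESHARD URLs and sends nothing.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr node; adjust for your cluster.
SOLR_BASE = "http://localhost:8983/solr"


def shard_url(action, collection, shard, base=SOLR_BASE):
    """Build a CREATESHARD or DELETESHARD URL (hypothetical helper)."""
    if action not in ("CREATESHARD", "DELETESHARD"):
        raise ValueError(f"unsupported action: {action}")
    params = {"action": action, "shard": shard, "collection": collection}
    return f"{base}/admin/collections?{urlencode(params)}"


def rollover_urls(collection, new_year, retention_years):
    """Return (create_url, delete_url) for a yearly rollover.

    Keeps the most recent `retention_years` shards, including the new one,
    and targets the shard that has just fallen out of that window.
    """
    expired_year = new_year - retention_years
    return (
        shard_url("CREATESHARD", collection, str(new_year)),
        shard_url("DELETESHARD", collection, str(expired_year)),
    )


create_url, delete_url = rollover_urls("AuditTrailIndex", 2015, 5)
print(create_url)
print(delete_url)
```

With a 5-year window, the 2015 rollover creates the "2015" shard and targets the "2010" shard for deletion, matching the example in this answer.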

P.S. This solution only works if you used the "implicit router" when creating your collection. It does NOT work with the default "compositeId router", i.e. collections created with the numShards parameter.

This feature is truly a game changer: it allows shards to be added dynamically as the demands of your business grow.

Is this feature available in Elasticsearch? If not, I am sure it will be in time.

0
votes

Currently you cannot add new shards once the collection has been created.

https://issues.apache.org/jira/browse/SOLR-3755