3
votes

I'm new to Apache Lucene/Solr.

I'm trying to move from Elasticsearch to Apache Solr.

So, I have a question about the following index data location configurations.

in Elasticsearch

# Can optionally include more than one location, causing data to be striped across
# the locations (a la RAID 0) on a file level, favoring locations with most free
# space on creation. For example:
#
#path.data: /path/to/data1,/path/to/data2

in Apache Solr

<dataDir>/var/data/solr/</dataDir>

I want to configure multiple index data directories in Apache Solr, like in Elasticsearch.

Is it possible?

How can I reach this goal?

Is it possible to have multiple index data directories in Apache Solr?


2
What made you move from Elasticsearch to Solr? Just curious. - Emad
Restarting an Elasticsearch instance is very slow (several hours!) in our environment. We couldn't solve this problem, so we are looking for an alternative solution like Solr... - Sungshik Jou

2 Answers

0
votes

How can I reach this goal?

That depends on the reason why you need multiple index directories. By default, Solr does not support multiple index locations via <dataDir>.
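For comparison with the Elasticsearch snippet above, a minimal solrconfig.xml sketch (the path and core name here are just examples): <dataDir> takes a single directory per core, not a comma-separated list, so any striping across disks has to happen below Solr.

```xml
<!-- solrconfig.xml: one data directory per core; Solr will not
     split a single core's index across multiple paths -->
<dataDir>/var/data/solr/mycore</dataDir>
```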

So the question is: why do you need that?

For high availability (in case one storage/index path is not available)? Or for performance, i.e. to spread the disk I/O over many drives?

In that case, there are other Solr features/products you should look at instead, like SolrCloud and distributed search.
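If the goal is to spread index data (and its I/O) across machines rather than local directories, SolrCloud sharding is the usual route. A sketch, assuming Solr is already running in cloud mode; the collection name is made up:

```shell
# Split the index into 2 shards; each shard can live on a
# different node, and therefore on different disks
bin/solr create -c mycollection -shards 2 -replicationFactor 2
```

Each shard is a separate core with its own data directory, which indirectly gives you index data spread over multiple locations.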

0
votes

There does not seem to be a way to configure this with Solr at this time (Sept 2020).

I agree it would be advantageous from a performance perspective, when a host has multiple volumes available, to spread the data as you would with Cassandra, Elasticsearch, etc.

An alternative is to run more than one Solr instance on a host, but that has many other drawbacks.

Alternatively, you would have to use OS-level tools like LVM on Linux to create a volume that spans the existing drives or file systems.
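The LVM route could look roughly like this. This is a sketch only: the device names /dev/sdb and /dev/sdc are placeholders, it needs root, and it destroys any data on those devices.

```shell
# Turn two physical drives into LVM physical volumes
pvcreate /dev/sdb /dev/sdc

# Group them into one volume group
vgcreate solr_vg /dev/sdb /dev/sdc

# Create a logical volume striped across both drives (RAID 0 style),
# analogous to Elasticsearch's multiple path.data locations
lvcreate -i 2 -l 100%FREE -n solr_lv solr_vg

# Put a file system on it and mount it where Solr's <dataDir> points
mkfs.xfs /dev/solr_vg/solr_lv
mount /dev/solr_vg/solr_lv /var/data/solr
```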

Because my file systems were pre-existing, I had to use dd to create sparse files and LVM to create a logical block device spanning those files. This was not the most efficient thing to do, but it worked.
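The sparse-file workaround described above might look like this; the paths and sizes are made up, loop devices are one way to expose the files to LVM, and again this needs root:

```shell
# Create a sparse backing file on each existing file system
dd if=/dev/zero of=/mnt/fs1/solr.img bs=1 count=0 seek=100G
dd if=/dev/zero of=/mnt/fs2/solr.img bs=1 count=0 seek=100G

# Expose the files as block devices
losetup /dev/loop0 /mnt/fs1/solr.img
losetup /dev/loop1 /mnt/fs2/solr.img

# Wrap them into one logical volume as before
pvcreate /dev/loop0 /dev/loop1
vgcreate solr_vg /dev/loop0 /dev/loop1
lvcreate -l 100%FREE -n solr_lv solr_vg
mkfs.xfs /dev/solr_vg/solr_lv
mount /dev/solr_vg/solr_lv /var/data/solr
```

Note the extra indirection through loop devices adds overhead, which matches the answer's caveat that this was not the most efficient setup.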