Spring Data Elasticsearch - migrating documents to a new index

Question

I'm developing a Spring application for search purposes. I use the Spring Data Elasticsearch library for creating indices and managing documents. For querying (searching) I use the regular Elasticsearch client, not the one from Spring Data.

I noticed that Spring Data only creates the index if it is missing in Elasticsearch. Whenever a new field is added to the class annotated with @Document, the mapping is not updated. As a result, searching on the just-added field causes a bad request.

The application is already running in production, and there are multiple instances of it. I would like to change the mapping of the index and keep the existing data.

The solution I found on the internet and in the documentation is to create a new index, copy the data (possibly changing it on the fly) with the reindex functionality, and switch the alias to the new index.
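For illustration, the reindex-and-switch-alias approach described above boils down to two Elasticsearch REST calls: POST /_reindex to copy the documents, then POST /_aliases to move the alias. The sketch below only builds the JSON bodies for those calls with plain Java strings. The class name AliasMigration and the index and alias names (products, products_v1, products_v2) are hypothetical, and actually sending the requests is left to whichever client the application already uses.

```java
/** Builds JSON bodies for the Elasticsearch _reindex and _aliases endpoints. */
final class AliasMigration {

    /** Body for POST /_reindex: copies all documents from oldIndex into newIndex. */
    static String reindexBody(String oldIndex, String newIndex) {
        // Index names are assumed to be valid Elasticsearch names (no quotes),
        // so no JSON escaping is done in this sketch.
        return "{\"source\":{\"index\":\"" + oldIndex + "\"},"
             + "\"dest\":{\"index\":\"" + newIndex + "\"}}";
    }

    /** Body for POST /_aliases: removes the alias from oldIndex and adds it to newIndex. */
    static String aliasSwapBody(String alias, String oldIndex, String newIndex) {
        // Both actions travel in one request, so readers of the alias
        // never see a moment where it points at no index.
        return "{\"actions\":["
             + "{\"remove\":{\"index\":\"" + oldIndex + "\",\"alias\":\"" + alias + "\"}},"
             + "{\"add\":{\"index\":\"" + newIndex + "\",\"alias\":\"" + alias + "\"}}]}";
    }

    public static void main(String[] args) {
        // Hypothetical names: alias "products" moves from products_v1 to products_v2.
        System.out.println(reindexBody("products_v1", "products_v2"));
        System.out.println(aliasSwapBody("products", "products_v1", "products_v2"));
    }
}
```

The new index has to be created with the new mapping before the reindex call, and the alias should only be switched once the reindex has finished successfully, so a failed reindex leaves the old index in service.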

I implemented a solution with this approach. The migration procedure runs on application startup (if required, as decided by an environment parameter). However, this approach seems cheap and shoddy to me. Changing documents with a Painless script is error-prone. It is difficult to test the migration. I need to manually keep track of which environment I am running the migration on, and have the proper index name set. During deployment I need to keep an eye on the process to check that everything worked correctly. Some manual changes might be required as well. What if the reindex procedure fails partway through?

There are a lot of questions that are bothering me. I have been wondering why there isn't a library similar to Flyway for this. Also, I understand that it is not possible to change the existing mapping of an index, but it is possible to add a new field, and this is not supported in Spring Data Elasticsearch.

Could you please give me some advice on how you tackle such situations?

P.J.Meisch · Accepted Answer · 2021-04-06T18:33:35

This is not an answer on how to do these migrations in general, but a clarification of what Spring Data Elasticsearch can do and what it does.

Spring Data Elasticsearch creates an index with the corresponding mapping if you are using a Spring Data Elasticsearch repository for your entity and if the index does not exist on application startup. It does not update the mapping of an index by itself.

You can nevertheless update an index mapping from program code: there's IndexOperations.putMapping(java.lang.Class<?>) for that. So if you add a new property to your entity and then call this method with the changed entity class on application start, the index mapping will be updated. This can only add new fields to the mapping, not change existing ones; this is a restriction of Elasticsearch.
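To make concrete what such a mapping update amounts to, here is a minimal sketch of the Elasticsearch REST request behind it: a PUT to the index's _mapping endpoint with only the new field in the body. It uses just the JDK's java.net.http classes (Java 11 or later) and builds the request without sending it. The class name AddFieldMapping, the base URL, the index name products and the field brand are hypothetical. In a Spring Data Elasticsearch application one would presumably not write this by hand but obtain an IndexOperations instance (for example via ElasticsearchOperations.indexOps(...)) and call putMapping with the entity class.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Builds a "put mapping" request that adds one new field to an existing index. */
final class AddFieldMapping {

    /** JSON body declaring a single new field; fields already mapped are not listed. */
    static String mappingBody(String field, String type) {
        return "{\"properties\":{\"" + field + "\":{\"type\":\"" + type + "\"}}}";
    }

    /** PUT {baseUrl}/{index}/_mapping carrying the new field definition. */
    static HttpRequest putMappingRequest(String baseUrl, String index,
                                         String field, String type) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_mapping"))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(mappingBody(field, type)))
                .build();
    }

    public static void main(String[] args) {
        // Hypothetical values: a local node, index "products", new keyword field "brand".
        HttpRequest request =
                putMappingRequest("http://localhost:9200", "products", "brand", "keyword");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(mappingBody("brand", "keyword"));
    }
}
```

Elasticsearch rejects a request like this if it tries to change the type of a field that is already mapped, which is the restriction mentioned above.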

If your application is running in multiple instances, it is up to you to synchronize them when updating the mapping, or to handle the resulting errors correctly.

If you add fields, make sure to update the mapping before adding data; otherwise the new field type will be autodetected by Elasticsearch and you will have to reindex manually.
