Reindexing Data with Elasticsearch

Sooner or later, you’ll need to reindex the data in your Elasticsearch instances. When we do Elasticsearch consulting for clients, we always check whether they have an efficient way to reindex previously indexed data. The reasons for reindexing vary: data type changes, analysis changes, or the introduction of new fields that need to be populated. Whatever the reason, you can either reindex from your source of truth or treat your Elasticsearch instance as the source of truth. Before Elasticsearch 2.3 we had to rely on external tools, such as Logstash or stream2es, for this operation. We even wrote about how to approach reindexing data with Logstash. Today, however, we would like to look at new functionality that will be added in Elasticsearch 2.3: the reindex API.

The prerequisites are minimal: you only need Elasticsearch 2.3 (not yet officially released as of this writing) and the ability to run a command against it. That’s it. Nothing more is needed, and Elasticsearch will do the rest for us.
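
To give a rough idea of what "running a command" means here, the following is a minimal Python sketch of a reindex request. It assumes the `_reindex` endpoint with a `source` and a `dest` index in the request body. The host `http://localhost:9200` and the index names `old_index` and `new_index` are placeholders chosen for illustration, and the helper names `build_reindex_request` and `run_reindex` are hypothetical, not part of any Elasticsearch client.

```python
import json
import urllib.request


def build_reindex_request(source_index, dest_index, host="http://localhost:9200"):
    """Build (but do not send) a POST request for the _reindex endpoint.

    The host and index names are assumptions for this sketch; adjust them
    to match your own cluster.
    """
    body = {
        "source": {"index": source_index},  # index to copy documents from
        "dest": {"index": dest_index},      # index to copy documents into
    }
    return urllib.request.Request(
        url=f"{host}/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_reindex(request):
    """Send the prepared request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a node running at the assumed address, something like `run_reindex(build_reindex_request("old_index", "new_index"))` would send the request. Keeping request construction separate from sending makes it easy to inspect the body before anything touches the cluster.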