After creating an index, you might need to make an extensive change such as adding a new field to every document or combining multiple indices to form a new one. Rather than deleting your index, making the change offline, and then indexing your data all over again, you can use the `reindex` operation.
With the `reindex` operation, you can copy all or a subset of documents that you select through a query to another index. Reindex is a `POST` operation. In its most basic form, you specify a source index and a destination index.
Reindexing can be an expensive operation depending on the size of your source index. We recommend you disable replicas in your destination index by setting `number_of_replicas` to `0` and re-enable them once the reindex process is complete.
{: .note }
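As a minimal sketch of that recommendation, assuming the destination index already exists and is named `destination` as in the examples on this page, replicas can be switched off through the index settings API before the reindex starts:

```json
PUT destination/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}
```

Once the reindex has finished, the same request can restore them. The value `1` here is only an assumed replica count, so substitute whatever your index normally uses:

```json
PUT destination/_settings
{
  "index": {
    "number_of_replicas": 1
  }
}
```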
---
#### Table of contents
1. TOC
{:toc}
---
## Reindex all documents
You can copy all documents from one index to another.
First, create a destination index with your desired field mappings and settings, or copy the ones from your source index:
```json
PUT destination
{
"mappings":{
"Add in your desired mappings"
},
"settings":{
"Add in your desired settings"
}
}
```
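To copy the mappings and settings of the source index, one option is to look them up first. This sketch assumes the source index is named `source`, as in the rest of this page:

```json
GET source/_mapping
GET source/_settings
```

A filled-in version of the template might then look like the following. The field names (`timestamp`, `text`) and the shard and replica counts are illustrative assumptions, not requirements:

```json
PUT destination
{
  "mappings": {
    "properties": {
      "timestamp": { "type": "date" },
      "text": { "type": "text" }
    }
  },
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  }
}
```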
This `reindex` command copies all the documents from a source index to a destination index:
```json
POST _reindex
{
"source":{
"index":"source"
},
"dest":{
"index":"destination"
}
}
```
If the destination index does not already exist, the `reindex` operation creates it with default configurations.
## Reindex from a remote cluster
You can copy documents from an index in a remote cluster. Use the `remote` option to specify the remote hostname and the required login credentials.
This command reaches out to a remote cluster, logs in with the username and password, and copies all the documents from the source index in that remote cluster to the destination index in your local cluster:
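A minimal sketch of such a request follows. The host, username, and password are placeholders to replace with your own values, and the remote cluster is assumed to accept basic authentication:

```json
POST _reindex
{
  "source": {
    "remote": {
      "host": "https://<REST_endpoint_of_remote_cluster>:9200",
      "username": "YOUR_USERNAME",
      "password": "YOUR_PASSWORD"
    },
    "index": "source"
  },
  "dest": {
    "index": "destination"
  }
}
```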
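## Reindex a subset of documents
You can copy a specific set of documents that match a search query.
This command copies only a subset of documents matched by a query operation to the destination index: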
```json
POST _reindex
{
"source":{
"index":"source",
"query": {
"match": {
"field_name": "text"
}
}
},
"dest":{
"index":"destination"
}
}
```
For a list of all query operations, see [Full-text queries](../full-text/).
## Combine one or more indices
You can combine documents from one or more indices by adding the source indices as a list.
This command copies all documents from two source indices to one destination index:
```json
POST _reindex
{
"source":{
"index":[
"source_1",
"source_2"
]
},
"dest":{
"index":"destination"
}
}
```
Make sure the number of shards is the same for your source and destination indices.
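One way to compare the shard counts is to request only the `index.number_of_shards` setting for each index. This is a sketch that assumes the index names used in the example above and that the destination index has already been created:

```json
GET source_1,source_2,destination/_settings/index.number_of_shards
```

The response lists the setting per index, so any mismatch is visible before you start the reindex.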
## Reindex only unique documents
You can copy only documents missing from a destination index by setting the `op_type` option to `create`.
In this case, if a document with the same ID already exists, the operation ignores the one from the source index.
To ignore all version conflicts of documents, set the `conflicts` option to `proceed`.
```json
POST _reindex
{
"conflicts":"proceed",
"source":{
"index":"source"
},
"dest":{
"index":"destination",
"op_type":"create"
}
}
```
## Reindex sorted documents
You can copy certain documents after sorting specific fields in the document.
This command copies the last 10 documents based on the `timestamp` field:
```json
POST _reindex
{
"size":10,
"source":{
"index":"source",
"sort":{
"timestamp":"desc"
}
},
"dest":{
"index":"destination"
}
}
```
## Transform documents during reindexing
You can transform your data during the reindexing process using the `script` option.
We recommend Painless for scripting in OpenSearch.
This command runs the source index through a Painless script that increments a `number` field inside an `account` object before copying it to the destination index:
```json
POST _reindex
{
"source":{
"index":"source"
},
"dest":{
"index":"destination"
},
"script":{
"lang":"painless",
"source":"ctx._source.account.number++"
}
}
```
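Scripts can also take parameters, which avoids hardcoding values in the script source. The following sketch is a variation on the command above. The `increment` parameter name and its value are illustrative assumptions, and each document is assumed to carry a numeric `account.number` field:

```json
POST _reindex
{
  "source": {
    "index": "source"
  },
  "dest": {
    "index": "destination"
  },
  "script": {
    "lang": "painless",
    "source": "ctx._source.account.number += params.increment",
    "params": {
      "increment": 5
    }
  }
}
```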
You can also specify an ingest pipeline to transform your data during the reindexing process.
First, create a pipeline that defines `processors`. A number of different processors are available for use in an ingest pipeline.
Here's a sample ingest pipeline that defines a `split` processor that splits a `text` field on whitespace and stores the resulting list in a new `word` field. The `script` processor runs a Painless script that finds the length of the `word` list and stores it in a new `word_count` field. The `remove` processor removes the `test` field.
```json
PUT _ingest/pipeline/pipeline-test
{
"description": "Splits the text field into a list. Computes the length of the 'word' field and stores it in a new 'word_count' field. Removes the 'test' field.",
"processors": [
{
"split": {
"field": "text",
"separator": "\\s+",
"target_field": "word"
}
},
{
"script": {
"lang": "painless",
"source": "ctx.word_count = ctx.word.length"
}
},
{
"remove": {
"field": "test"
}
}
]
}
```
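Before running a full reindex, it can help to try the pipeline on a sample document with the simulate API. In this sketch the document contents are made up for illustration; only the `text` and `test` field names come from the pipeline above:

```json
POST _ingest/pipeline/pipeline-test/_simulate
{
  "docs": [
    {
      "_source": {
        "text": "reindex copies documents",
        "test": "temporary value"
      }
    }
  ]
}
```

If the pipeline behaves as described, the simulated document should come back with a three-element `word` list, a `word_count` of `3`, and no `test` field.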
After creating a pipeline, you can use the `reindex` operation: