OpenSearch/docs/reference/upgrade/reindex_upgrade.asciidoc

[[reindex-upgrade]]
== Reindex before upgrading

{es} can read indices created in the previous major version. If you
have indices created in 5.x or before, you must reindex or delete them
before upgrading to {version}. {es} nodes will fail to start if
incompatible indices are present. Snapshots of 5.x or earlier indices cannot be
restored to a 7.x cluster even if they were created by a 6.x cluster.
Any index created in 6.x is compatible with 7.x and does not require a reindex.


This restriction also applies to the internal indices that are used by
{kib} and the {xpack} features. Therefore, before you can use {kib} and
{xpack} features in {version}, you must ensure the internal indices have a
compatible index structure.

You have two options for reindexing old indices:

* <<reindex-upgrade-inplace, Reindex in place>> on your 6.x cluster before upgrading.
* Create a new {version} cluster and <<reindex-upgrade-remote, Reindex from remote>>.
This enables you to reindex indices that reside on clusters running any version of {es}.

.Upgrading time-based indices
*******************************************

If you use time-based indices, you likely won't need to carry
pre-6.x indices forward to {version}. Data in time-based indices
generally becomes less useful as time passes and are
deleted as they age past your retention period.

Unless you have an unusually long retention period, you can just
wait to upgrade to 6.x until all of your pre-6.x indices have
been deleted.

*******************************************


[[reindex-upgrade-inplace]]
=== Reindex in place

You can use the Upgrade Assistant in {kib} 6.8 to automatically reindex 5.x
indices you need to carry forward to {version}.

To manually reindex your old indices in place:

. Create an index with 7.x compatible mappings.
. Set the `refresh_interval` to `-1` and the `number_of_replicas` to `0` for
  efficient reindexing.
. Use the <<docs-reindex,`reindex` API>> to copy documents from the
5.x index into the new index. You can use a script to perform any necessary
modifications to the document data and metadata during reindexing.
. Reset the `refresh_interval` and `number_of_replicas` to the values
  used in the old index.
. Wait for the index status to change to `green`.
. In a single <<indices-aliases,update aliases>> request:
.. Delete the old index.
.. Add an alias with the old index name to the new index.
.. Add any aliases that existed on the old index to the new index.

ifdef::include-xpack[]
[TIP]
====
If you use {ml-features} and your {ml} indices were created before
{prev-major-version}, you must temporarily halt the tasks associated with your
{ml} jobs and {dfeeds} and prevent new jobs from opening during the reindex. Use
the <<ml-set-upgrade-mode,set upgrade mode API>> or
{ml-docs}/stopping-ml.html[stop all {dfeeds} and close all {ml} jobs].

If you use {es} {security-features}, before you reindex `.security*` internal
indices it is a good idea to create a temporary superuser account in the `file`
realm.

. On a single node, add a temporary superuser account to the `file` realm. For
example, run the <<users-command,elasticsearch-users useradd>> command:
+
--
[source,sh]
----------------------------------------------------------
bin/elasticsearch-users useradd <user_name> \
-p <password> -r superuser
----------------------------------------------------------
--

. Use these credentials when you reindex the `.security*` index. That is to say,
use them to log into {kib} and run the Upgrade Assistant or to call the
reindex API. You can use your regular administration credentials to
reindex the other internal indices.

. Delete the temporary superuser account from the file realm. For
example, run the {ref}/users-command.html[elasticsearch-users userdel] command:
+
--
[source,sh]
----------------------------------------------------------
bin/elasticsearch-users userdel <user_name>
----------------------------------------------------------
--

For more information, see <<file-realm>>.
====
endif::include-xpack[]

[[reindex-upgrade-remote]]
=== Reindex from a remote cluster

You can use <<reindex-from-remote,reindex from remote>> to migrate indices from
your old cluster to a new {version} cluster. This enables you to move to
{version} from a pre-6.8 cluster without interrupting service.

[WARNING]
=============================================

{es} provides backwards compatibility support that enables
indices from the previous major version to be upgraded to the
current major version. Skipping a major version means that you must
resolve any backward compatibility issues yourself.

ifdef::include-xpack[]
If you use {ml-features} and you're migrating indices from a 6.5 or earlier
cluster, the job and {dfeed} configuration information are not stored in an
index. You must recreate your {ml} jobs in the new cluster. If you are migrating
from a 6.6 or later cluster, it is a good idea to temporarily halt the tasks
associated with your {ml} jobs and {dfeeds} to prevent inconsistencies between
different {ml} indices that are reindexed at slightly different times. Use the
<<ml-set-upgrade-mode,set upgrade mode API>> or
{ml-docs}/stopping-ml.html[stop all {dfeeds} and close all {ml} jobs].
endif::include-xpack[]

=============================================

To migrate your indices:

. Set up a new {version} cluster and add the existing cluster to the
`reindex.remote.whitelist` in `elasticsearch.yml`.
+
--
[source,yaml]
--------------------------------------------------
reindex.remote.whitelist: oldhost:9200
--------------------------------------------------

[NOTE]
=============================================
The new cluster doesn't have to start fully-scaled out. As you migrate
indices and shift the load to the new cluster, you can add nodes to the new
cluster and remove nodes from the old one.

=============================================
--

. For each index that you need to migrate to the new cluster:

.. Create an index the appropriate mappings and settings. Set the
  `refresh_interval` to `-1` and set `number_of_replicas` to `0` for
  faster reindexing.

.. Use the <<docs-reindex,`reindex` API>> to pull documents from the
  remote index into the new {version} index:
+
--
[source,console]
--------------------------------------------------
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://oldhost:9200",
      "username": "user",
      "password": "pass"
    },
    "index": "source",
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "dest"
  }
}
--------------------------------------------------
// TEST[setup:host]
// TEST[s/^/PUT source\n/]
// TEST[s/oldhost:9200",/\${host}"/]
// TEST[s/"username": "user",//]
// TEST[s/"password": "pass"//]

If you run the reindex job in the background by setting `wait_for_completion`
to `false`, the reindex request returns a `task_id` you can use to
monitor progress of the reindex job with the <<tasks,task API>>:
`GET _tasks/TASK_ID`.
--

.. When the reindex job completes, set the `refresh_interval` and
  `number_of_replicas` to the desired values (the default settings are
  `30s` and `1`).

.. Once reindexing is complete and the status of the new index is `green`,
  you can delete the old index.