[[breaking-changes-5.0]]
== Breaking changes in 5.0

This section discusses the changes that you need to be aware of when migrating
your application to Elasticsearch 5.0.

[IMPORTANT]
.Known networking bug in 5.0.0-alpha5
======================================================

There is a bug in the new Netty4 implementation in this release that affects any REST request with
a body that is sent in two requests, the first with an `Expect: 100-continue` header. The bug
manifests as an exception similar to the following:

[source,txt]
----
[WARN ][http.netty4] [wtOV9Vb] caught exception while handling client http traffic, closing connection [id: 0x1320b717, L:/0:0:0:0:0:0:0:1:9200 - R:/0:0:0:0:0:0:0:1:54732]
java.lang.UnsupportedOperationException: unsupported message type: DefaultFullHttpResponse (expected: ByteBuf, FileRegion)
----

This is due to incorrect handling of the `Expect` HTTP header, and it can be
worked around in one of three ways:

* Use a client that does not add `Expect` headers (the official clients do not add them).

* Pass a blank `Expect` header, e.g.
+
[source,sh]
----
curl -H 'Expect:' ...
----

* Use Netty3 for the HTTP layer by passing the following setting at startup:
+
[source,sh]
----
./bin/elasticsearch -Ehttp.type=netty3
----

======================================================
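
The blank `Expect` header workaround described above can be wrapped in a small
shell function so that it is applied to every request. This is only a sketch:
the function name `es_curl`, the address `localhost:9200`, and the index and
type names in the example call are assumptions, not part of Elasticsearch or
curl.

[source,sh]
----
# Hypothetical helper: es_curl is not part of Elasticsearch or curl.
# It forwards all of its arguments to curl and always adds a blank Expect
# header, which stops curl from adding "Expect: 100-continue" on its own.
es_curl() {
  curl -H 'Expect:' "$@"
}

# Example call (assumes a node listening on localhost:9200 and a
# hypothetical index "my_index" with type "my_type"):
#   es_curl -XPOST 'http://localhost:9200/my_index/my_type' -d '{"field": "value"}'
----

Because the header is added inside the wrapper, existing scripts only need to
replace `curl` with `es_curl` for the affected release.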

[float]
[[migration-plugin]]
=== Migration Plugin

The https://github.com/elastic/elasticsearch-migration/blob/2.x/README.asciidoc[`elasticsearch-migration` plugin]
(compatible with Elasticsearch 2.3.0 and above) will help you to find issues
that need to be addressed when upgrading to Elasticsearch 5.0.

[float]
=== Indices created before 5.0

Elasticsearch 5.0 can read indices created in version 2.0 or above.  An
Elasticsearch 5.0 node will not start in the presence of indices created in a
version of Elasticsearch before 2.0.
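
One way to find affected indices before upgrading is to inspect the
`index.version.created` value in each index's settings. The sketch below
assumes that this value is a numeric ID that encodes the version as
`major * 1000000 + minor * 10000 + revision * 100 + build` (for example,
`2030099` for 2.3.0). The helper name `created_before_2_0` is hypothetical.

[source,sh]
----
# Hypothetical helper: succeeds (exit status 0) when the given
# index.version.created ID belongs to a major version before 2.
# Assumes the ID is major * 1000000 + minor * 10000 + revision * 100 + build.
created_before_2_0() {
  [ "$(( $1 / 1000000 ))" -lt 2 ]
}

# Example calls (the IDs are sample values under the assumption above):
#   created_before_2_0 1070599 && echo "reindex required"
#   created_before_2_0 2030099 || echo "readable by 5.0"
----

The check looks only at the major version, which matches the rule above that
any index created in 2.0 or later can be read.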

[IMPORTANT]
.Reindex indices from Elasticsearch 1.x or before
=========================================

Indices created in Elasticsearch 1.x or before will need to be reindexed with
Elasticsearch 2.x in order to be readable by Elasticsearch 5.x. It is not
sufficient to use the <<indices-upgrade,`upgrade`>> API.  The easiest
way to reindex old indices is to upgrade to Elasticsearch 2.3 or later and to use the
`reindex` API, or the reindex UI provided by the <<migration-plugin,Migration Plugin>>.

=========================================
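
As a rough sketch of the `reindex` route mentioned above, the following helper
builds a minimal request body for the `_reindex` endpoint on an Elasticsearch
2.3 or later node. The helper name `reindex_body`, the address
`localhost:9200`, and the index names are placeholders. The sketch also assumes
that the destination index has already been created in 2.x with the mappings
and settings you want.

[source,sh]
----
# Hypothetical helper: prints a minimal _reindex request body that copies
# every document from one index to another. Assumes the index names contain
# no characters that need JSON escaping.
reindex_body() {
  printf '{"source":{"index":"%s"},"dest":{"index":"%s"}}\n' "$1" "$2"
}

# Example call against a 2.3+ node (old_index and new_index are placeholders):
#   curl -XPOST 'http://localhost:9200/_reindex' -d "$(reindex_body old_index new_index)"
----

Keeping the body in a function makes it straightforward to loop over several
old indices, one `_reindex` call per index.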

The first time Elasticsearch 5.0 starts, it will automatically rename index
folders to use the index UUID instead of the index name. If you are using
<<indices-shadow-replicas,shadow replicas>> with shared data folders, first
start a single node with access to all data folders, and let it rename all
index folders before starting other nodes in the cluster.

[float]
=== Also see:

* <<breaking_50_search_changes>>
* <<breaking_50_mapping_changes>>
* <<breaking_50_percolator>>
* <<breaking_50_suggester>>
* <<breaking_50_index_apis>>
* <<breaking_50_document_api_changes>>
* <<breaking_50_settings_changes>>
* <<breaking_50_allocation>>
* <<breaking_50_http_changes>>
* <<breaking_50_rest_api_changes>>
* <<breaking_50_cat_api>>
* <<breaking_50_java_api_changes>>
* <<breaking_50_packaging>>
* <<breaking_50_plugins>>
* <<breaking_50_fs>>
* <<breaking_50_aggregations_changes>>
* <<breaking_50_scripting>>


include::migrate_5_0/search.asciidoc[]

include::migrate_5_0/mapping.asciidoc[]

include::migrate_5_0/percolator.asciidoc[]

include::migrate_5_0/suggest.asciidoc[]

include::migrate_5_0/index-apis.asciidoc[]

include::migrate_5_0/docs.asciidoc[]

include::migrate_5_0/settings.asciidoc[]

include::migrate_5_0/allocation.asciidoc[]

include::migrate_5_0/http.asciidoc[]

include::migrate_5_0/rest.asciidoc[]

include::migrate_5_0/cat.asciidoc[]

include::migrate_5_0/java.asciidoc[]

include::migrate_5_0/packaging.asciidoc[]

include::migrate_5_0/plugins.asciidoc[]

include::migrate_5_0/fs.asciidoc[]

include::migrate_5_0/aggregations.asciidoc[]

include::migrate_5_0/scripting.asciidoc[]