
[[breaking-changes-2.0]]
== Breaking changes in 2.0

This section discusses the changes that you need to be aware of when migrating
your application to Elasticsearch 2.0.

=== Indices API

The <<alias-retrieving, get alias api>> will, by default, produce an error response
if a requested index does not exist. This change brings the defaults for this API in
line with the other Indices APIs. The <<multi-index>> options can be used on a request
to change this behavior.

`GetIndexRequest.features()` now returns an array of `Feature` enums instead of an array of `String` values.
The following deprecated methods have been removed:

* `GetIndexRequest.addFeatures(String[])` - Please use `GetIndexRequest.addFeatures(Feature[])` instead
* `GetIndexRequest.features(String[])` - Please use `GetIndexRequest.features(Feature[])` instead
* `GetIndexRequestBuilder.addFeatures(String[])` - Please use `GetIndexRequestBuilder.addFeatures(Feature[])` instead
* `GetIndexRequestBuilder.setFeatures(String[])` - Please use `GetIndexRequestBuilder.setFeatures(Feature[])` instead

=== Partial fields

Partial fields have been removed. They had been deprecated since 1.0.0beta1 in favor of <<search-request-source-filtering,source filtering>>.

=== More Like This Field

The More Like This Field query has been removed in favor of the <<query-dsl-mlt-query, More Like This Query>>
restricted to a specific `field`.

=== Routing

The default hash function that is used for routing has been changed from djb2 to
murmur3. This change should be transparent unless you relied on very specific
properties of djb2. This will help ensure a better balance of the document counts
between shards.
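
To make the balance argument concrete, here is a minimal Python sketch of the
classic djb2 string hash (`hash * 33 + char`, seeded with 5381 and wrapped to a
signed 32-bit integer). It is an illustration only and not Elasticsearch's
routing code: the function names, the use of incremental numeric ids, and the
shard counts are all assumptions made for the example.

```python
from collections import Counter


def djb2(key: str) -> int:
    """Classic djb2 string hash, wrapped to a signed 32-bit integer."""
    h = 5381
    for ch in key:
        h = (h * 33 + ord(ch)) & 0xFFFFFFFF  # keep only the low 32 bits
    # Reinterpret the unsigned value as a signed 32-bit integer.
    return h - (1 << 32) if h >= (1 << 31) else h


def shard_for(key: str, num_shards: int) -> int:
    """Map a routing key to a shard (hypothetical helper)."""
    # Python's % already yields a non-negative result for a positive modulus.
    return djb2(key) % num_shards


def distribution(num_docs: int, num_shards: int) -> list:
    """Count how many incremental numeric ids land on each shard."""
    counts = Counter(shard_for(str(i), num_shards) for i in range(num_docs))
    return [counts.get(shard, 0) for shard in range(num_shards)]


if __name__ == "__main__":
    print(distribution(100_000, 33))
    print(distribution(100_000, 5))
```

Under these assumptions the 33-shard run leaves a number of shards empty,
because the multiplier 33 makes the shard depend largely on the last character
of the id. A better-mixing hash such as murmur3 would be expected to spread the
same ids far more evenly.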

In addition, the following node settings related to routing have been deprecated:

[horizontal]

`cluster.routing.operation.hash.type`::

  This was an undocumented setting that made it possible to configure which hash
  function to use for routing. `murmur3` is now enforced on new indices.

`cluster.routing.operation.use_type`::

  This was an undocumented setting that made it possible to take the `_type` of
  the document into account when computing its shard (default: `false`). `false`
  is now enforced on new indices.

=== Store

The `memory` / `ram` store (`index.store.type`) option was removed in Elasticsearch 2.0.

=== Term Vectors API

Usage of `/_termvector` is deprecated in favor of `/_termvectors`.

=== Script fields

In 1.x, script fields were always returned as a single value. Even if a script
returned a list, it was returned as an array containing a single value that is
itself a list, such as:

[source,json]
---------------
"fields": {
  "my_field": [
    [
      "v1",
      "v2"
    ]
  ]
}
---------------

In Elasticsearch 2.x, scripts that return a list of values are treated as
multivalued fields. The same example now returns the following response,
with the values in a single array:

[source,json]
---------------
"fields": {
  "my_field": [
    "v1",
    "v2"
  ]
}
---------------
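
Client code that reads script fields may have to cope with both shapes while a
cluster is being upgraded. The following Python sketch shows one possible way
to normalize the `fields` entry of a hit. The helper name
`script_field_values` is hypothetical and not part of any Elasticsearch client,
and the sketch assumes that a script never intentionally returns a single
nested list in 2.x.

```python
def script_field_values(fields: dict, name: str) -> list:
    """Return the values of a script field as a flat list.

    Accepts the 1.x shape (a list wrapped in a one-element array) as well
    as the 2.x shape (values placed directly in the array).
    """
    values = fields.get(name, [])
    # 1.x shape: [["v1", "v2"]] -> unwrap the single nested list.
    if len(values) == 1 and isinstance(values[0], list):
        return list(values[0])
    # 2.x shape: ["v1", "v2"] -> already flat.
    return list(values)


if __name__ == "__main__":
    old_style = {"my_field": [["v1", "v2"]]}
    new_style = {"my_field": ["v1", "v2"]}
    print(script_field_values(old_style, "my_field"))
    print(script_field_values(new_style, "my_field"))
```

Both calls yield the same flat list, so the rest of the client does not need to
know which version produced the response.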

=== Main API

Previously, calling `GET /` returned the HTTP status code within the JSON response
in addition to the actual HTTP status code. The `status` field has been removed from the JSON response.

=== Java API

Some query builders have been removed or renamed:

* `commonTerms(...)` renamed to `commonTermsQuery(...)`
* `queryString(...)` renamed to `queryStringQuery(...)`
* `simpleQueryString(...)` renamed to `simpleQueryStringQuery(...)`
* `textPhrase(...)` removed
* `textPhrasePrefix(...)` removed
* `textPhrasePrefixQuery(...)` removed
* `filtered(...)` removed. Use `filteredQuery(...)` instead.
* `inQuery(...)` removed

==== Aggregations

The `date_histogram` aggregation now returns a `Histogram` object in the response, and the `DateHistogram` class has been removed. Similarly,
the `date_range`, `ipv4_range`, and `geo_distance` aggregations all return a `Range` object in the response, and the `IPV4Range`, `DateRange`,
and `GeoDistance` classes have been removed. The motivation for this is to have a single response API for the Range and Histogram aggregations
regardless of the type of data being queried. To support this, some changes were made in the `MultiBucketAggregation` interface, which apply
to all bucket aggregations:

* The `getKey()` method now returns `Object` instead of `String`. The actual object type returned depends on the type of aggregation requested 
(e.g. the `date_histogram` will return a `DateTime` object for this method whereas a `histogram` will return a `Number`).
* A `getKeyAsString()` method has been added to return the String representation of the key.
* All other `getKeyAsX()` methods have been removed.
* The `getBucketByKey(String)` methods have been removed on all aggregations except the `filters` and `terms` aggregations.

=== Terms filter lookup caching

The terms filter lookup mechanism does not support the `cache` option anymore
and relies on the filesystem cache instead. If the lookup index is not too
large, it is recommended to make it replicated to all nodes by setting
`index.auto_expand_replicas: 0-all` in order to remove the network overhead as
well.

=== Parent parameter on update request

The `parent` parameter has been removed from the update request. Before 2.x it just set the routing parameter. The
`routing` setting should be used instead. The `parent` setting was confusing, because it gave the impression that the parent
a child document points to can be changed, which is not the case.

=== Delete by query

The meaning of the `_shards` header in the delete by query response has changed. Before version 2.0, the `total`,
`successful` and `failed` fields in the header were based on the number of primary shards, and failures on replica
shards were not tracked. From version 2.0, the stats in the `_shards` header are based on all shards
of an index. The HTTP status code is left unchanged and is only based on failures that occurred while executing on
primary shards.

=== Mappings

The setting `index.mapping.allow_type_wrapper` has been removed.  Documents should always be sent without the type as the root element.