From db1e83884f16da731b9868bd42cbf22ea6faf600 Mon Sep 17 00:00:00 2001
From: Clinton Gormley
Date: Fri, 14 Aug 2015 20:26:06 +0200
Subject: [PATCH] Docs: Rewrote the migrating-to-2.0 section

---
 docs/reference/migration/migrate_2_0.asciidoc | 1004 +----------------
 .../migration/migrate_2_0/aggs.asciidoc | 69 ++
 .../migration/migrate_2_0/crud.asciidoc | 129 +++
 .../migration/migrate_2_0/index_apis.asciidoc | 42 +
 .../migration/migrate_2_0/java.asciidoc | 76 ++
 .../migration/migrate_2_0/mapping.asciidoc | 390 +++++++
 .../migration/migrate_2_0/packaging.asciidoc | 58 +
 .../migrate_2_0/parent_child.asciidoc | 43 +
 .../migration/migrate_2_0/query_dsl.asciidoc | 186 +++
 .../migration/migrate_2_0/removals.asciidoc | 68 ++
 .../migration/migrate_2_0/scripting.asciidoc | 102 ++
 .../migration/migrate_2_0/search.asciidoc | 121 ++
 .../migration/migrate_2_0/settings.asciidoc | 125 ++
 .../migrate_2_0/snapshot_restore.asciidoc | 37 +
 .../migration/migrate_2_0/stats.asciidoc | 57 +
 .../migration/migrate_2_0/striping.asciidoc | 20 +
 16 files changed, 1545 insertions(+), 982 deletions(-)
 create mode 100644 docs/reference/migration/migrate_2_0/aggs.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/crud.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/index_apis.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/java.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/mapping.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/packaging.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/parent_child.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/query_dsl.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/removals.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/scripting.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/search.asciidoc
 create mode 100644 docs/reference/migration/migrate_2_0/settings.asciidoc
 create mode 100644
docs/reference/migration/migrate_2_0/snapshot_restore.asciidoc create mode 100644 docs/reference/migration/migrate_2_0/stats.asciidoc create mode 100644 docs/reference/migration/migrate_2_0/striping.asciidoc diff --git a/docs/reference/migration/migrate_2_0.asciidoc b/docs/reference/migration/migrate_2_0.asciidoc index 14ac164c69d..bc664c2920b 100644 --- a/docs/reference/migration/migrate_2_0.asciidoc +++ b/docs/reference/migration/migrate_2_0.asciidoc @@ -13,6 +13,13 @@ latest 1.x version of Elasticsearch first, in order to upgrade your indices or to delete the old indices. Elasticsearch will not start in the presence of old indices. +[float] +=== Network binds to localhost only + +Elasticsearch now binds to the loopback interface by default (usually +`127.0.0.1` or `::1`). The `network.host` setting can be specified to change +this behavior. + [float] === Elasticsearch migration plugin @@ -20,999 +27,32 @@ We have provided the https://github.com/elastic/elasticsearch-migration[Elastics to help you detect any issues that you may have when upgrading to Elasticsearch 2.0. Please install and run the plugin *before* upgrading. -=== Mapping +include::migrate_2_0/removals.asciidoc[] +include::migrate_2_0/striping.asciidoc[] -Remove file based default mappings #10870 (issue: #10620) -Validate dynamic mappings updates on the master node. #10634 (issues: #8650, #8688) -Remove the ability to have custom per-field postings and doc values formats. #9741 (issue: #8746) -Remove support for new indexes using path setting in object/nested fields or index_name in any field #9570 (issue: #6677) -Remove index_analyzer setting to simplify analyzer logic #9451 (issue: #9371) -Remove type level default analyzers #9430 (issues: #8874, #9365) -Add doc values support to boolean fields. 
#7961 (issues: #4678, #7851) +include::migrate_2_0/mapping.asciidoc[] +include::migrate_2_0/crud.asciidoc[] -A number of changes have been made to mappings to remove ambiguity and to -ensure that conflicting mappings cannot be created. +include::migrate_2_0/query_dsl.asciidoc[] -==== Conflicting field mappings +include::migrate_2_0/search.asciidoc[] -Fields with the same name, in the same index, in different types, must have -the same mapping, with the exception of the <>, <>, -<>, <>, <>, and <> -parameters, which may have different settings per field. +include::migrate_2_0/aggs.asciidoc[] -[source,js] ---------------- -PUT my_index -{ - "mappings": { - "type_one": { - "properties": { - "name": { <1> - "type": "string" - } - } - }, - "type_two": { - "properties": { - "name": { <1> - "type": "string", - "analyzer": "english" - } - } - } - } -} ---------------- -<1> The two `name` fields have conflicting mappings and will prevent Elasticsearch - from starting. +include::migrate_2_0/parent_child.asciidoc[] -Elasticsearch will not start in the presence of conflicting field mappings. -These indices must be deleted or reindexed using a new mapping. +include::migrate_2_0/scripting.asciidoc[] -The `ignore_conflicts` option of the put mappings API has been removed. -Conflicts can't be ignored anymore. +include::migrate_2_0/index_apis.asciidoc[] -==== Fields cannot be referenced by short name +include::migrate_2_0/snapshot_restore.asciidoc[] -A field can no longer be referenced using its short name. Instead, the full -path to the field is required. For instance: +include::migrate_2_0/packaging.asciidoc[] -[source,js] ---------------- -PUT my_index -{ - "mappings": { - "my_type": { - "properties": { - "title": { "type": "string" }, <1> - "name": { - "properties": { - "title": { "type": "string" }, <2> - "first": { "type": "string" }, - "last": { "type": "string" } - } - } - } - } - } -} ---------------- -<1> This field is referred to as `title`. 
-<2> This field is referred to as `name.title`. +include::migrate_2_0/settings.asciidoc[] -Previously, the two `title` fields in the example above could have been -confused with each other when using the short name `title`. +include::migrate_2_0/stats.asciidoc[] -=== Type name prefix removed - -Previously, two fields with the same name in two different types could -sometimes be disambiguated by prepending the type name. As a side effect, it -would add a filter on the type name to the relevant query. This feature was -ambiguous -- a type name could be confused with a field name -- and didn't -work everywhere e.g. aggregations. - -Instead, fields should be specified with the full path, but without a type -name prefix. If you wish to filter by the `_type` field, either specify the -type in the URL or add an explicit filter. - -The following example query in 1.x: - -[source,js] ----------------------------- -GET my_index/_search -{ - "query": { - "match": { - "my_type.some_field": "quick brown fox" - } - } -} ----------------------------- - -would be rewritten in 2.0 as: - -[source,js] ----------------------------- -GET my_index/my_type/_search <1> -{ - "query": { - "match": { - "some_field": "quick brown fox" <2> - } - } -} ----------------------------- -<1> The type name can be specified in the URL to act as a filter. -<2> The field name should be specified without the type prefix. - -==== Field names may not contain dots - -In 1.x, it was possible to create fields with dots in their name, for -instance: - -[source,js] ----------------------------- -PUT my_index -{ - "mappings": { - "my_type": { - "properties": { - "foo.bar": { <1> - "type": "string" - }, - "foo": { - "properties": { - "bar": { <1> - "type": "string" - } - } - } - } - } - } -} ----------------------------- -<1> These two fields cannot be distinguised as both are referred to as `foo.bar`. - -You can no longer create fields with dots in the name. 
- -==== Type names may not start with a dot - -In 1.x, Elasticsearch would issue a warning if a type name included a dot, -e.g. `my.type`. Now that type names are no longer used to distinguish between -fields in differnt types, this warning has been relaxed: type names may now -contain dots, but they may not *begin* with a dot. The only exception to this -is the special `.percolator` type. - -==== Types may no longer be deleted - -In 1.x it was possible to delete a type mapping, along with all of the -documents of that type, using the delete mapping API. This is no longer -supported, because remnants of the fields in the type could remain in the -index, causing corruption later on. - -==== Type meta-fields - -The <> associated with had configuration options -removed, to make them more reliable: - -* `_id` configuration can no longer be changed. If you need to sort, use the <> field instead. -* `_type` configuration can no longer be changed. -* `_index` configuration can no longer be changed. -* `_routing` configuration is limited to marking routing as required. -* `_field_names` configuration is limited to disabling the field. -* `_size` configuration is limited to enabling the field. -* `_timestamp` configuration is limited to enabling the field, setting format and default value. -* `_boost` has been removed. -* `_analyzer` has been removed. - -Importantly, *meta-fields can no longer be specified as part of the document -body.* Instead, they must be specified in the query string parameters. For -instance, in 1.x, the `routing` could be specified as follows: - -[source,json] ------------------------------ -PUT my_index -{ - "mappings": { - "my_type": { - "_routing": { - "path": "group" <1> - }, - "properties": { - "group": { <1> - "type": "string" - } - } - } - } -} - -PUT my_index/my_type/1 <2> -{ - "group": "foo" -} ------------------------------ -<1> This 1.x mapping tells Elasticsearch to extract the `routing` value from the `group` field in the document body. 
-<2> This indexing request uses a `routing` value of `foo`. - -In 2.0, the routing must be specified explicitly: - -[source,json] ------------------------------ -PUT my_index -{ - "mappings": { - "my_type": { - "_routing": { - "required": true <1> - }, - "properties": { - "group": { - "type": "string" - } - } - } - } -} - -PUT my_index/my_type/1?routing=bar <2> -{ - "group": "foo" -} ------------------------------ -<1> Routing can be marked as required to ensure it is not forgotten during indexing. -<2> This indexing request uses a `routing` value of `bar`. - -==== Other mapping changes - -* The setting `index.mapping.allow_type_wrapper` has been removed. Documents should always be sent without the type as the root element. -* The `binary` field does not support the `compress` and `compress_threshold` options anymore. - - - - -=== Networking - -Elasticsearch now binds to the loopback interface by default (usually 127.0.0.1 -or ::1), the setting `network.host` can be specified to change this behavior. - -=== Rivers removal - -Elasticsearch does not support rivers anymore. While we had first planned to -keep them around to ease migration, keeping support for rivers proved to be -challenging as it conflicted with other important changes that we wanted to -bring to 2.0 like synchronous dynamic mappings updates, so we eventually -decided to remove them entirely. See -https://www.elastic.co/blog/deprecating_rivers for more background about why -we are moving away from rivers. - -=== Indices API - -The <> will, by default produce an error response -if a requested index does not exist. This change brings the defaults for this API in -line with the other Indices APIs. The <> options can be used on a request -to change this behavior - -`GetIndexRequest.features()` now returns an array of Feature Enums instead of an array of String values. 
- -The following deprecated methods have been removed: - -* `GetIndexRequest.addFeatures(String[])` - Please use `GetIndexRequest.addFeatures(Feature[])` instead -* `GetIndexRequest.features(String[])` - Please use `GetIndexRequest.features(Feature[])` instead -* `GetIndexRequestBuilder.addFeatures(String[])` - Please use `GetIndexRequestBuilder.addFeatures(Feature[])` instead -* `GetIndexRequestBuilder.setFeatures(String[])` - Please use `GetIndexRequestBuilder.setFeatures(Feature[])` instead - -=== Partial fields - -Partial fields were deprecated since 1.0.0beta1 in favor of <>. - -=== More Like This - -The More Like This API and the More Like This Field query have been removed in -favor of the <>. - -The parameter `percent_terms_to_match` has been removed in favor of -`minimum_should_match`. - -=== Routing - -The default hash function that is used for routing has been changed from djb2 to -murmur3. This change should be transparent unless you relied on very specific -properties of djb2. This will help ensure a better balance of the document counts -between shards. - -In addition, the following node settings related to routing have been deprecated: - -[horizontal] - -`cluster.routing.operation.hash.type`:: - - This was an undocumented setting that allowed to configure which hash function - to use for routing. `murmur3` is now enforced on new indices. - -`cluster.routing.operation.use_type`:: - - This was an undocumented setting that allowed to take the `_type` of the - document into account when computing its shard (default: `false`). `false` is - now enforced on new indices. - -=== Async replication - -The `replication` parameter has been removed from all CRUD operations (index, -update, delete, bulk). These operations are now synchronous -only, and a request will only return once the changes have been replicated to -all active shards in the shard group. - -=== Store - -The `memory` / `ram` store (`index.store.type`) option was removed in Elasticsearch 2.0. 
- -=== Term Vectors API - -Usage of `/_termvector` is deprecated, and replaced in favor of `/_termvectors`. - -=== Script fields - -Script fields in 1.x were only returned as a single value. So even if the return -value of a script used to be list, it would be returned as an array containing -a single value that is a list too, such as: - -[source,js] ---------------- -"fields": { - "my_field": [ - [ - "v1", - "v2" - ] - ] -} ---------------- - -In elasticsearch 2.x, scripts that return a list of values are considered as -multivalued fields. So the same example would return the following response, -with values in a single array. - -[source,js] ---------------- -"fields": { - "my_field": [ - "v1", - "v2" - ] -} ---------------- - -=== Main API - -Previously, calling `GET /` was giving back the http status code within the json response -in addition to the actual HTTP status code. We removed `status` field in json response. - -=== Java API - -`org.elasticsearch.index.queries.FilterBuilders` has been removed as part of the merge of -queries and filters. These filters are now available in `QueryBuilders` with the same name. -All methods that used to accept a `FilterBuilder` now accept a `QueryBuilder` instead. - -In addition some query builders have been removed or renamed: - -* `commonTerms(...)` renamed with `commonTermsQuery(...)` -* `queryString(...)` renamed with `queryStringQuery(...)` -* `simpleQueryString(...)` renamed with `simpleQueryStringQuery(...)` -* `textPhrase(...)` removed -* `textPhrasePrefix(...)` removed -* `textPhrasePrefixQuery(...)` removed -* `filtered(...)` removed. Use `filteredQuery(...)` instead. -* `inQuery(...)` removed. - -=== Aggregations - -The `date_histogram` aggregation now returns a `Histogram` object in the response, and the `DateHistogram` class has been removed. 
Similarly -the `date_range`, `ipv4_range`, and `geo_distance` aggregations all return a `Range` object in the response, and the `IPV4Range`, `DateRange`, -and `GeoDistance` classes have been removed. The motivation for this is to have a single response API for the Range and Histogram aggregations -regardless of the type of data being queried. To support this some changes were made in the `MultiBucketAggregation` interface which applies -to all bucket aggregations: - -* The `getKey()` method now returns `Object` instead of `String`. The actual object type returned depends on the type of aggregation requested -(e.g. the `date_histogram` will return a `DateTime` object for this method whereas a `histogram` will return a `Number`). -* A `getKeyAsString()` method has been added to return the String representation of the key. -* All other `getKeyAsX()` methods have been removed. -* The `getBucketAsKey(String)` methods have been removed on all aggregations except the `filters` and `terms` aggregations. - -The `histogram` and the `date_histogram` aggregation now support a simplified `offset` option that replaces the previous `pre_offset` and -`post_offset` rounding options. Instead of having to specify two separate offset shifts of the underlying buckets, the `offset` option -moves the bucket boundaries in positive or negative direction depending on its argument. - -The `date_histogram` options for `pre_zone` and `post_zone` are replaced by the `time_zone` option. The behavior of `time_zone` is -equivalent to the former `pre_zone` option. Setting `time_zone` to a value like "+01:00" now will lead to the bucket calculations -being applied in the specified time zone but In addition to this, also the `pre_zone_adjust_large_interval` is removed because we -now always return dates and bucket keys in UTC. - -Both the `histogram` and `date_histogram` aggregations now have a default `min_doc_count` of `0` instead of `1` previously. 
- -`include`/`exclude` filtering on the `terms` aggregation now uses the same syntax as regexp queries instead of the Java syntax. While simple -regexps should still work, more complex ones might need some rewriting. Also, the `flags` parameter is not supported anymore. - -=== Terms filter lookup caching - -The terms filter lookup mechanism does not support the `cache` option anymore -and relies on the filesystem cache instead. If the lookup index is not too -large, it is recommended to make it replicated to all nodes by setting -`index.auto_expand_replicas: 0-all` in order to remove the network overhead as -well. - -=== Delete by query - -The meaning of the `_shards` headers in the delete by query response has changed. Before version 2.0 the `total`, -`successful` and `failed` fields in the header are based on the number of primary shards. The failures on replica -shards aren't being kept track of. From version 2.0 the stats in the `_shards` header are based on all shards -of an index. The http status code is left unchanged and is only based on failures that occurred while executing on -primary shards. - -=== Delete api with missing routing when required - -Delete api requires a routing value when deleting a document belonging to a type that has routing set to required in its -mapping, whereas previous elasticsearch versions would trigger a broadcast delete on all shards belonging to the index. -A `RoutingMissingException` is now thrown instead. - - -==== Default date format now is `strictDateOptionalTime` - -Instead of `dateOptionalTime` the new default date format now is `strictDateOptionalTime`, -which is more strict in parsing dates. This means, that dates now need to have a four digit year, -a two-digit month, day, hour, minute and second. This means, you may need to preprend a part of the date -with a zero to make it conform or switch back to the old `dateOptionalTime` format. 
- -==== Date format does not support unix timestamps by default - -In earlier versions of elasticsearch, every timestamp was always tried to be parsed as -as unix timestamp first. This means, even when specifying a date format like -`dateOptionalTime`, one could supply unix timestamps instead of a ISO8601 formatted -date. - -This is not supported anymore. If you want to store unix timestamps, you need to specify -the appropriate formats in the mapping, namely `epoch_second` or `epoch_millis`. - -In addition the `numeric_resolution` mapping parameter is ignored. Use the -`epoch_second` and `epoch_millis` date formats instead. - -==== Source field limitations -The `_source` field could previously be disabled dynamically. Since this field -is a critical piece of many features like the Update API, it is no longer -possible to disable. - -The options for `compress` and `compress_threshold` have also been removed. -The source field is already compressed. To minimize the storage cost, -set `index.codec: best_compression` in index settings. - -==== Boolean fields - -Boolean fields used to have a string fielddata with `F` meaning `false` and `T` -meaning `true`. They have been refactored to use numeric fielddata, with `0` -for `false` and `1` for `true`. As a consequence, the format of the responses of -the following APIs changed when applied to boolean fields: `0`/`1` is returned -instead of `F`/`T`: - - - <> - - <> - - <> - -In addition, terms aggregations use a custom formatter for boolean (like for -dates and ip addresses, which are also backed by numbers) in order to return -the user-friendly representation of boolean fields: `false`/`true`: - -[source,js] ---------------- -"buckets": [ - { - "key": 0, - "key_as_string": "false", - "doc_count": 42 - }, - { - "key": 1, - "key_as_string": "true", - "doc_count": 12 - } -] ---------------- - -==== Murmur3 Fields -Fields of type `murmur3` can no longer change `doc_values` or `index` setting. 
-They are always stored with doc values, and not indexed. - -==== Config based mappings -The ability to specify mappings in configuration files has been removed. To specify -default mappings that apply to multiple indexes, use index templates. - -The following settings are no longer valid: - -* `index.mapper.default_mapping_location` -* `index.mapper.default_percolator_mapping_location` - -=== Codecs - -It is no longer possible to specify per-field postings and doc values formats -in the mappings. This setting will be ignored on indices created before -elasticsearch 2.0 and will cause mapping parsing to fail on indices created on -or after 2.0. For old indices, this means that new segments will be written -with the default postings and doc values formats of the current codec. - -It is still possible to change the whole codec by using the `index.codec` -setting. Please however note that using a non-default codec is discouraged as -it could prevent future versions of Elasticsearch from being able to read the -index. - -=== Scripting settings - -Removed support for `script.disable_dynamic` node setting, replaced by -fine-grained script settings described in the <>. -The following setting previously used to enable dynamic scripts: - -[source,yaml] ---------------- -script.disable_dynamic: false ---------------- - -can be replaced with the following two settings in `elasticsearch.yml` that -achieve the same result: - -[source,yaml] ---------------- -script.inline: on -script.indexed: on ---------------- - -=== Script parameters - -Deprecated script parameters `id`, `file`, `scriptField`, `script_id`, `script_file`, -`script`, `lang` and `params`. The <> should be used in their place. - -The deprecated script parameters have been removed from the Java API so applications using the Java API will -need to be updated. - -=== Groovy scripts sandbox - -The groovy sandbox and related settings have been removed. 
Groovy is now a non -sandboxed scripting language, without any option to turn the sandbox on. - -=== Plugins making use of scripts - -Plugins that make use of scripts must register their own script context through -`ScriptModule`. Script contexts can be used as part of fine-grained settings to -enable/disable scripts selectively. - -=== Thrift and memcached transport - -The thrift and memcached transport plugins are no longer supported. Instead, use -either the HTTP transport (enabled by default) or the node or transport Java client. - -=== `search_type=count` deprecation - -The `count` search type has been deprecated. All benefits from this search type can -now be achieved by using the `query_then_fetch` search type (which is the -default) and setting `size` to `0`. - -=== The count api internally uses the search api - -The count api is now a shortcut to the search api with `size` set to 0. As a -result, a total failure will result in an exception being returned rather -than a normal response with `count` set to `0` and shard failures. - -=== JSONP support - -JSONP callback support has now been removed. CORS should be used to access Elasticsearch -over AJAX instead: - -[source,yaml] ---------------- -http.cors.enabled: true -http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/ ---------------- - -=== CORS allowed origins - -The CORS allowed origins setting, `http.cors.allow-origin`, no longer has a default value. Previously, the default value -was `*`, which would allow CORS requests from any origin and is considered insecure. The `http.cors.allow-origin` setting -should be specified with only the origins that should be allowed, like so: - -[source,yaml] ---------------- -http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/ ---------------- - -=== Cluster state REST api - -The cluster state api doesn't return the `routing_nodes` section anymore when -`routing_table` is requested. 
The newly introduced `routing_nodes` flag can -be used separately to control whether `routing_nodes` should be returned. - -=== Query DSL - -Change to ranking behaviour: single-term queries on numeric fields now score in the same way as string fields (use of IDF, norms if enabled). -Previously, term queries on numeric fields were deliberately prevented from using the usual Lucene scoring logic and this behaviour was undocumented and, to some, unexpected. -If the introduction of scoring to numeric fields is undesirable for your query clauses the fix is simple: wrap them in a `constant_score` or use a `filter` expression instead. - -The `filtered` query is deprecated. Instead you should use a `bool` query with -a `must` clause for the query and a `filter` clause for the filter. For instance -the below query: - -[source,js] ---------------- -{ - "filtered": { - "query": { - // query - }, - "filter": { - // filter - } - } -} ---------------- -can be replaced with -[source,js] ---------------- -{ - "bool": { - "must": { - // query - }, - "filter": { - // filter - } - } -} ---------------- -and will produce the same scores. - -The `fuzzy_like_this` and `fuzzy_like_this_field` queries have been removed. - -The `limit` filter is deprecated and becomes a no-op. You can achieve similar -behaviour using the <> parameter. - -`or` and `and` on the one hand and `bool` on the other hand used to have -different performance characteristics depending on the wrapped filters. This is -fixed now, as a consequence the `or` and `and` filters are now deprecated in -favour or `bool`. - -The `execution` option of the `terms` filter is now deprecated and ignored if -provided. - -The `_cache` and `_cache_key` parameters of filters are deprecated in the REST -layer and removed in the Java API. In case they are specified they will be -ignored. 
Instead filters are always used as their own cache key and elasticsearch -makes decisions by itself about whether it should cache filters based on how -often they are used. - -Java plugins that register custom queries can do so by using the -`IndicesQueriesModule#addQuery(Class)` method. Other -ways to register custom queries are not supported anymore. - -==== Query/filter merge - -Elasticsearch no longer makes a difference between queries and filters in the -DSL; it detects when scores are not needed and automatically optimizes the -query to not compute scores and optionally caches the result. - -As a consequence the `query` filter serves no purpose anymore and is deprecated. - -=== Timezone for date field - -Specifying the `time_zone` parameter on queries or aggregations of `date` type fields -must now be either an ISO 8601 UTC offset, or a timezone id. For example, the value -`+1:00` must now be `+01:00`. - -=== Snapshot and Restore - -Locations of the shared file system repositories and the URL repositories with `file:` URLs has to be now registered -using `path.repo` setting. The `path.repo` setting can contain one or more repository locations: - -[source,yaml] ---------------- -path.repo: ["/mnt/daily", "/mnt/weekly"] ---------------- - -If the repository location is specified as an absolute path it has to start with one of the locations -specified in `path.repo`. If the location is specified as a relative path, it will be resolved against the first -location specified in the `path.repo` setting. - -URL repositories with `http:`, `https:`, and `ftp:` URLs has to be whitelisted by specifying allowed URLs in the -`repositories.url.allowed_urls` setting. This setting supports wildcards in the place of host, path, query, and -fragment. 
For example: - -[source,yaml] ------------------------------------ -repositories.url.allowed_urls: ["http://www.example.org/root/*", "https://*.mydomain.com/*?*#*"] ------------------------------------ - -The obsolete parameters `expand_wildcards_open` and `expand_wildcards_close` are no longer -supported by the snapshot and restore operations. These parameters have been replaced by -a single `expand_wildcards` parameter. See <> for more. - -=== `_shutdown` API - -The `_shutdown` API has been removed without a replacement. Nodes should be managed via operating -systems and the provided start/stop scripts. - -=== Analyze API - -* The Analyze API return 0 as first Token's position instead of 1. -* The `text()` method on `AnalyzeRequest` now returns `String[]` instead of `String`. - -=== Multiple data.path striping - -Previously, if the `data.path` setting listed multiple data paths, then a -shard would be ``striped'' across all paths by writing a whole file to each -path in turn (in accordance with the `index.store.distributor` setting). The -result was that the files from a single segment in a shard could be spread -across multiple disks, and the failure of any one disk could corrupt multiple -shards. - -This striping is no longer supported. Instead, different shards may be -allocated to different paths, but all of the files in a single shard will be -written to the same path. - -If striping is detected while starting Elasticsearch 2.0.0 or later, all of -the files belonging to the same shard will be migrated to the same path. If -there is not enough disk space to complete this migration, the upgrade will be -cancelled and can only be resumed once enough disk space is made available. - -The `index.store.distributor` setting has also been removed. - -=== Hunspell dictionary configuration - -The parameter `indices.analysis.hunspell.dictionary.location` has been removed, -and `/hunspell` is always used. 
- -=== Java API Transport API construction - -The `TransportClient` construction code has changed, it now uses the builder -pattern. Instead of using: - -[source,java] --------------------------------------------------- -Settings settings = Settings.settingsBuilder() - .put("cluster.name", "myClusterName").build(); -Client client = new TransportClient(settings); --------------------------------------------------- - -Use: - -[source,java] --------------------------------------------------- -Settings settings = Settings.settingsBuilder() - .put("cluster.name", "myClusterName").build(); -Client client = TransportClient.builder().settings(settings).build(); --------------------------------------------------- - -=== Logging - -Log messages are now truncated at 10,000 characters. This can be changed in the -`logging.yml` configuration file. - -[float] -=== Removed `top_children` query - -The `top_children` query has been removed in favour of the `has_child` query. The `top_children` query wasn't always faster -than the `has_child` query and the `top_children` query was often inaccurate. The total hits and any aggregations in the -same search request will likely be off if `top_children` was used. - -=== Removed file based index templates -Index templates can no longer be configured on disk. Use the `_template` API instead. - -[float] -=== Removed `id_cache` from stats apis - -Removed `id_cache` metric from nodes stats, indices stats and cluster stats apis. This metric has also been removed -from the shards cat, indices cat and nodes cat apis. Parent/child memory is now reported under fielddata, because it -has internally be using fielddata for a while now. - -To just see how much parent/child related field data is taking, the `fielddata_fields` option can be used on the stats -apis. 
Indices stats example: - -[source,js] --------------------------------------------------- -curl -XGET "http://localhost:9200/_stats/fielddata?pretty&human&fielddata_fields=_parent" --------------------------------------------------- - -Parent/child is using field data for the `_parent` field since version `1.1.0`, but the memory stats for the `_parent` -field were still shown under `id_cache` metric in the stats apis for backwards compatible reasons between 1.x versions. - -Before version `1.1.0` the parent/child had its own in-memory data structures for id values in the `_parent` field. - -[float] -=== Removed `id_cache` from clear cache api - -Removed `id_cache` option from the clear cache apis. The `fielddata` option should be used to clear `_parent` field -from fielddata. - -[float] -=== Highlighting - -The default value for the `require_field_match` option is `true` rather than -`false`, meaning that the highlighters will take the fields that were queried -into account by default. That means for instance that highlighting any field -when querying the `_all` field will produce no highlighted snippets by default, -given that the match was on the `_all` field only. Querying the same fields -that need to be highlighted is the cleaner solution to get highlighted snippets -back. Otherwise `require_field_match` option can be set to `false` to ignore -field names completely when highlighting. - -The postings highlighter doesn't support the `require_field_match` option -anymore, it will only highlight fields that were queried. - -The `match` query with type set to `match_phrase_prefix` is not supported by the -postings highlighter. No highlighted snippets will be returned. - -[float] -=== Parent/child - -Parent/child has been rewritten completely to reduce memory usage and to execute -`has_child` and `has_parent` queries faster and more efficient. The `_parent` field -uses doc values by default. 
The refactored and improved implementation is only active -for indices created on or after version 2.0. - -In order to benefit for all performance and memory improvements we recommend to reindex all -indices that have the `_parent` field created before was upgraded to 2.0. - -The following breaks in backwards compatability have been made on indices with the `_parent` field -created on or after clusters with version 2.0: -* The `type` option on the `_parent` field can only point to a parent type that doesn't exist yet, - so this means that an existing type/mapping can no longer become a parent type. -* The `has_child` and `has_parent` queries can no longer be use in alias filters. - -=== Meta fields returned under the top-level json object - -When selecting meta fields such as `_routing` or `_timestamp`, the field values -are now directly put as a top-level property of the json objet, instead of being -put under `fields` like regular stored fields. - -[source,sh] ---------------- -curl -XGET 'localhost:9200/test/_search?fields=_timestamp,foo' ---------------- - -[source,js] ---------------- -{ - [...] - "hits": { - "total": 1, - "max_score": 1, - "hits": [ - { - "_index": "test", - "_type": "test", - "_id": "1", - "_score": 1, - "_timestamp": 10000000, - "fields": { - "foo" : [ "bar" ] - } - } - ] - } -} ---------------- - -=== Settings for resource watcher have been renamed - -The setting names for configuring the resource watcher have been renamed -to prevent clashes with the watcher plugin - -* `watcher.enabled` is now `resource.reload.enabled` -* `watcher.interval` is now `resource.reload.interval` -* `watcher.interval.low` is now `resource.reload.interval.low` -* `watcher.interval.medium` is now `resource.reload.interval.medium` -* `watcher.interval.high` is now `resource.reload.interval.high` - -=== Percolator stats - -Changed the `percolate.getTime` stat (total time spent on percolating) to `percolate.time` state. 
- -=== Plugin Manager for official plugins - -Some of the elasticsearch official plugins have been moved to elasticsearch repository and will be released at the -same time as elasticsearch itself, using the same version number. - -In that case, the plugin manager can now use a simpler form to identify an official plugin. Instead of: - -[source,sh] ---------------- -bin/plugin install elasticsearch/plugin_name/version ---------------- - -You can use: - -[source,sh] ---------------- -bin/plugin install plugin_name ---------------- - -The plugin manager will recognize this form and will be able to download the right version for your elasticsearch -version. - -For older versions of elasticsearch, you still have to use the older form. - -For the record, official plugins which can use this new simplified form are: - -* elasticsearch-analysis-icu -* elasticsearch-analysis-kuromoji -* elasticsearch-analysis-phonetic -* elasticsearch-analysis-smartcn -* elasticsearch-analysis-stempel -* elasticsearch-cloud-aws -* elasticsearch-cloud-azure -* elasticsearch-cloud-gce -* elasticsearch-delete-by-query -* elasticsearch-lang-javascript -* elasticsearch-lang-python - -=== `/bin/elasticsearch` version needs `-V` parameter - -Due to switching to elasticsearchs internal command line parsing -infrastructure for the pluginmanager and the elasticsearch start up -script, the `-v` parameter now stands for `--verbose`, where as `-V` or -`--version` can be used to show the Elasticsearch version and exit. 
- -=== `/bin/elasticsearch` dynamic parameters must come after static ones - -If you are setting configuration options like cluster name or node name via -the commandline, you have to ensure, that the static options like pid file -path or daemonizing always come first, like this - -``` -/bin/elasticsearch -d -p /tmp/foo.pid --http.cors.enabled=true --http.cors.allow-origin='*' - -``` - -For a list of those static parameters, run `/bin/elasticsearch -h` - -=== Aliases - -Fields used in alias filters no longer have to exist in the mapping upon alias creation time. Alias filters are now -parsed at request time and then the fields in filters are resolved from the mapping, whereas before alias filters were -parsed at alias creation time and the parsed form was kept around in memory. - - -=== _analyze API - -The `prefer_local` has been removed from the _analyze api. The _analyze api is a light operation and the caller shouldn't -be concerned about whether it executes on the node that receives the request or another node. - -=== Shadow replicas - -The `node.enable_custom_paths` setting has been removed and replaced by the -`path.shared_data` setting to allow shadow replicas with custom paths to work -with the security manager. 
For example, if your previous configuration had: - -``` -node.enable_custom_paths: true -``` - -And you created an index using shadow replicas with `index.data_path` set to -`/opt/data/my_index` with the following: - -[source,js] --------------------------------------------------- -curl -XPUT 'localhost:9200/my_index' -d ' -{ - "index" : { - "number_of_shards" : 1, - "number_of_replicas" : 4, - "data_path": "/opt/data/my_index", - "shadow_replicas": true - } -}' --------------------------------------------------- - -For 2.0, you will need to set `path.shared_data` to a parent directory of the -index's data_path, so: - -``` -path.shared_data: /opt/data -``` +include::migrate_2_0/java.asciidoc[] \ No newline at end of file diff --git a/docs/reference/migration/migrate_2_0/aggs.asciidoc b/docs/reference/migration/migrate_2_0/aggs.asciidoc new file mode 100644 index 00000000000..8134812f912 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/aggs.asciidoc @@ -0,0 +1,69 @@ +=== Aggregation changes + +==== Min doc count defaults to zero + +Both the `histogram` and `date_histogram` aggregations now have a default +`min_doc_count` of `0` instead of `1`. + +==== Timezone for date field + +Specifying the `time_zone` parameter in queries or aggregations on fields of +type `date` must now be either an ISO 8601 UTC offset, or a timezone id. For +example, the value `+1:00` must now be written as `+01:00`. + +==== Time zones and offsets + +The `histogram` and the `date_histogram` aggregation now support a simplified +`offset` option that replaces the previous `pre_offset` and `post_offset` +rounding options. Instead of having to specify two separate offset shifts of +the underlying buckets, the `offset` option moves the bucket boundaries in +positive or negative direction depending on its argument. + +The `date_histogram` options for `pre_zone` and `post_zone` are replaced by +the `time_zone` option. The behavior of `time_zone` is equivalent to the +former `pre_zone` option. 
Setting `time_zone` to a value like "+01:00" now +will lead to the bucket calculations being applied in the specified time zone. +The `key` is returned as the timestamp in UTC, but the `key_as_string` is +returned in the time zone specified. + +In addition to this, the `pre_zone_adjust_large_interval` is removed because +we now always return dates and bucket keys in UTC. + +==== Including/excluding terms + +`include`/`exclude` filtering on the `terms` aggregation now uses the same +syntax as <> instead of the Java regular +expression syntax. While simple regexps should still work, more complex ones +might need some rewriting. Also, the `flags` parameter is no longer supported. + +==== Boolean fields + +Aggregations on `boolean` fields will now return `0` and `1` as keys, and +`"true"` and `"false"` as string keys. See <> for more +information. + + +==== Java aggregation classes + +The `date_histogram` aggregation now returns a `Histogram` object in the +response, and the `DateHistogram` class has been removed. Similarly the +`date_range`, `ipv4_range`, and `geo_distance` aggregations all return a +`Range` object in the response, and the `IPV4Range`, `DateRange`, and +`GeoDistance` classes have been removed. + +The motivation for this is to have a single response API for the Range and +Histogram aggregations regardless of the type of data being queried. To +support this some changes were made in the `MultiBucketAggregation` interface +which applies to all bucket aggregations: + +* The `getKey()` method now returns `Object` instead of `String`. The actual + object type returned depends on the type of aggregation requested (e.g. the + `date_histogram` will return a `DateTime` object for this method whereas a + `histogram` will return a `Number`). +* A `getKeyAsString()` method has been added to return the String + representation of the key. +* All other `getKeyAsX()` methods have been removed. 
+* The `getBucketAsKey(String)` methods have been removed on all aggregations + except the `filters` and `terms` aggregations. + + diff --git a/docs/reference/migration/migrate_2_0/crud.asciidoc b/docs/reference/migration/migrate_2_0/crud.asciidoc new file mode 100644 index 00000000000..060cfc7277d --- /dev/null +++ b/docs/reference/migration/migrate_2_0/crud.asciidoc @@ -0,0 +1,129 @@ +=== CRUD and routing changes + +==== Explicit custom routing + +Custom `routing` values can no longer be extracted from the document body, but +must be specified explicitly as part of the query string, or in the metadata +line in the <> API. See <> for an +example. + +==== Routing hash function + +The default hash function that is used for routing has been changed from +`djb2` to `murmur3`. This change should be transparent unless you relied on +very specific properties of `djb2`. This will help ensure a better balance of +the document counts between shards. + +In addition, the following routing-related node settings have been deprecated: + +`cluster.routing.operation.hash.type`:: + + This was an undocumented setting that allowed you to configure which hash + function to use for routing. `murmur3` is now enforced on new indices. + +`cluster.routing.operation.use_type`:: + + This was an undocumented setting that allowed the `_type` of the document to + be taken into account when computing its shard (default: `false`). `false` is + now enforced on new indices. + +==== Delete API with custom routing + +The delete API used to be broadcast to all shards in the index which meant +that, when using custom routing, the `routing` parameter was optional. Now, +the delete request is forwarded only to the shard holding the document. If you +are using custom routing then you should specify the `routing` value when +deleting a document, just as is already required for the `index`, `create`, +and `update` APIs. 
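As an illustration of the delete behaviour described above, a delete request for a document that was indexed with a custom routing value might look like the following sketch. The index name, type name, document ID, and the routing value `bar` are placeholders chosen for this example, not values taken from the patch.

```js
DELETE my_index/my_type/1?routing=bar
```

The `routing` value should be the same one that was used when the document was indexed, so that the request reaches the shard that holds it.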
+ +To make sure that you never forget a routing value, make routing required with +the following mapping: + +[source,js] +--------------------------- +PUT my_index +{ + "mappings": { + "my_type": { + "_routing": { + "required": true + } + } + } +} +--------------------------- + +==== All stored meta-fields returned by default + +Previously, meta-fields like `_routing`, `_timestamp`, etc. would only be +included in a GET response if specifically requested with the `fields` +parameter. Now, all meta-fields which have stored values will be returned by +default. Additionally, they are now returned at the top level (along with +`_index`, `_type`, and `_id`) instead of in the `fields` element. + +For instance, the following request: + +[source,sh] +--------------- +GET /my_index/my_type/1 +--------------- + +might return: + +[source,js] +--------------- +{ + "_index": "my_index", + "_type": "my_type", + "_id": "1", + "_timestamp": 10000000, <1> + "_source": { + "foo" : [ "bar" ] + } +} +--------------- +<1> The `_timestamp` is returned by default, and at the top level. + + +==== Async replication + +The `replication` parameter has been removed from all CRUD operations +(`index`, `create`, `update`, `delete`, `bulk`) as it interfered with the +<> feature. These operations are now +synchronous only and a request will only return once the changes have been +replicated to all active shards in the shard group. + +Instead, use more client processes to send more requests in parallel. + +==== Documents must be specified without a type wrapper + +Previously, the document body could be wrapped in another object with the name +of the `type`: + +[source,js] +-------------------------- +PUT my_index/my_type/1 +{ + "my_type": { <1> + "text": "quick brown fox" + } +} +-------------------------- +<1> This `my_type` wrapper is not part of the document itself, but represents the document type. 
+ +This feature was deprecated before but could be reenabled with the +`mapping.allow_type_wrapper` index setting. This setting is no longer +supported. The above document should be indexed as follows: + +[source,js] +-------------------------- +PUT my_index/my_type/1 +{ + "text": "quick brown fox" +} +-------------------------- + +==== Term Vectors API + +Usage of `/_termvector` is deprecated in favor of `/_termvectors`. + diff --git a/docs/reference/migration/migrate_2_0/index_apis.asciidoc b/docs/reference/migration/migrate_2_0/index_apis.asciidoc new file mode 100644 index 00000000000..ffa2e9edea8 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/index_apis.asciidoc @@ -0,0 +1,42 @@ +=== Index API changes + +==== Index aliases + + +Fields used in alias filters no longer have to exist in the mapping at alias +creation time. Previously, alias filters were parsed at alias creation time +and the parsed form was cached in memory. Now, alias filters are parsed at +request time and the fields in filters are resolved from the current mapping. + +This also means that index aliases now support `has_parent` and `has_child` +queries. + +The <> will now throw an exception if no +matching aliases are found. This change brings the defaults for this API in +line with the other Indices APIs. The <> options can be used on a +request to change this behavior. + +==== File based index templates + +Index templates can no longer be configured on disk. Use the +<> API instead. + +==== Analyze API changes + + +The Analyze API now returns the `position` of the first token as `0` +instead of `1`. + +The `prefer_local` parameter has been removed. The `_analyze` API is a light +operation and the caller shouldn't be concerned about whether it executes on +the node that receives the request or another node. + +The `text()` method on `AnalyzeRequest` now returns `String[]` instead of +`String`. 
+ +==== Removed `id_cache` from clear cache api + +The <> API no longer supports the `id_cache` +option. Instead, use the `fielddata` option to clear the cache for the +`_parent` field. + diff --git a/docs/reference/migration/migrate_2_0/java.asciidoc b/docs/reference/migration/migrate_2_0/java.asciidoc new file mode 100644 index 00000000000..5f2d2f4834e --- /dev/null +++ b/docs/reference/migration/migrate_2_0/java.asciidoc @@ -0,0 +1,76 @@ +=== Java API changes + +==== Transport API construction + +The `TransportClient` construction code has changed, it now uses the builder +pattern. Instead of: + +[source,java] +-------------------------------------------------- +Settings settings = Settings.settingsBuilder() + .put("cluster.name", "myClusterName").build(); +Client client = new TransportClient(settings); +-------------------------------------------------- + +Use the following: + +[source,java] +-------------------------------------------------- +Settings settings = Settings.settingsBuilder() + .put("cluster.name", "myClusterName").build(); +Client client = TransportClient.builder().settings(settings).build(); +-------------------------------------------------- + +==== Automatically thread client listeners + +Previously, the user had to set request listener threads to `true` when on the +client side in order not to block IO threads on heavy operations. This proved +to be very trappy for users, and ended up creating problems that are very hard +to debug. + +In 2.0, Elasticsearch automatically threads listeners that are used from the +client when the client is a node client or a transport client. Threading can +no longer be manually set. + + +==== Query/filter refactoring + +`org.elasticsearch.index.queries.FilterBuilders` has been removed as part of the merge of +queries and filters. These filters are now available in `QueryBuilders` with the same name. +All methods that used to accept a `FilterBuilder` now accept a `QueryBuilder` instead. 
+ +In addition, some query builders have been removed or renamed: + +* `commonTerms(...)` renamed to `commonTermsQuery(...)` +* `queryString(...)` renamed to `queryStringQuery(...)` +* `simpleQueryString(...)` renamed to `simpleQueryStringQuery(...)` +* `textPhrase(...)` removed +* `textPhrasePrefix(...)` removed +* `textPhrasePrefixQuery(...)` removed +* `filtered(...)` removed. Use `filteredQuery(...)` instead. +* `inQuery(...)` removed. + +==== GetIndexRequest + +`GetIndexRequest.features()` now returns an array of Feature Enums instead of an array of String values. + +The following deprecated methods have been removed: + +* `GetIndexRequest.addFeatures(String[])` - Use + `GetIndexRequest.addFeatures(Feature[])` instead. + +* `GetIndexRequest.features(String[])` - Use + `GetIndexRequest.features(Feature[])` instead. + +* `GetIndexRequestBuilder.addFeatures(String[])` - Use + `GetIndexRequestBuilder.addFeatures(Feature[])` instead. + +* `GetIndexRequestBuilder.setFeatures(String[])` - Use + `GetIndexRequestBuilder.setFeatures(Feature[])` instead. + + +==== BytesQueryBuilder removed + +The redundant BytesQueryBuilder has been removed in favour of the +WrapperQueryBuilder internally. + diff --git a/docs/reference/migration/migrate_2_0/mapping.asciidoc b/docs/reference/migration/migrate_2_0/mapping.asciidoc new file mode 100644 index 00000000000..a50fc9c6a62 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/mapping.asciidoc @@ -0,0 +1,390 @@ +=== Mapping changes + +A number of changes have been made to mappings to remove ambiguity and to +ensure that conflicting mappings cannot be created. + +One major change is that dynamically added fields must have their mapping +confirmed by the master node before indexing continues. This is to avoid a +problem where different shards in the same index dynamically add different +mappings for the same field. These conflicting mappings can silently return +incorrect results and can lead to index corruption. 
+ +This change can make indexing slower when frequently adding many new fields. +We are looking at ways of optimising this process but we chose safety over +performance for this extreme use case. + +==== Conflicting field mappings + +Fields with the same name, in the same index, in different types, must have +the same mapping, with the exception of the <>, <>, +<>, <>, <>, and <> +parameters, which may have different settings per field. + +[source,js] +--------------- +PUT my_index +{ + "mappings": { + "type_one": { + "properties": { + "name": { <1> + "type": "string" + } + } + }, + "type_two": { + "properties": { + "name": { <1> + "type": "string", + "analyzer": "english" + } + } + } + } +} +--------------- +<1> The two `name` fields have conflicting mappings and will prevent Elasticsearch + from starting. + +Elasticsearch will not start in the presence of conflicting field mappings. +These indices must be deleted or reindexed using a new mapping. + +The `ignore_conflicts` option of the put mappings API has been removed. +Conflicts can't be ignored anymore. + +==== Fields cannot be referenced by short name + +A field can no longer be referenced using its short name. Instead, the full +path to the field is required. For instance: + +[source,js] +--------------- +PUT my_index +{ + "mappings": { + "my_type": { + "properties": { + "title": { "type": "string" }, <1> + "name": { + "properties": { + "title": { "type": "string" }, <2> + "first": { "type": "string" }, + "last": { "type": "string" } + } + } + } + } + } +} +--------------- +<1> This field is referred to as `title`. +<2> This field is referred to as `name.title`. + +Previously, the two `title` fields in the example above could have been +confused with each other when using the short name `title`. + +==== Type name prefix removed + +Previously, two fields with the same name in two different types could +sometimes be disambiguated by prepending the type name. 
As a side effect, it +would add a filter on the type name to the relevant query. This feature was +ambiguous -- a type name could be confused with a field name -- and didn't +work everywhere, e.g. in aggregations. + +Instead, fields should be specified with the full path, but without a type +name prefix. If you wish to filter by the `_type` field, either specify the +type in the URL or add an explicit filter. + +The following example query in 1.x: + +[source,js] +---------------------------- +GET my_index/_search +{ + "query": { + "match": { + "my_type.some_field": "quick brown fox" + } + } +} +---------------------------- + +would be rewritten in 2.0 as: + +[source,js] +---------------------------- +GET my_index/my_type/_search <1> +{ + "query": { + "match": { + "some_field": "quick brown fox" <2> + } + } +} +---------------------------- +<1> The type name can be specified in the URL to act as a filter. +<2> The field name should be specified without the type prefix. + +==== Field names may not contain dots + +In 1.x, it was possible to create fields with dots in their name, for +instance: + +[source,js] +---------------------------- +PUT my_index +{ + "mappings": { + "my_type": { + "properties": { + "foo.bar": { <1> + "type": "string" + }, + "foo": { + "properties": { + "bar": { <1> + "type": "string" + } + } + } + } + } + } +} +---------------------------- +<1> These two fields cannot be distinguished as both are referred to as `foo.bar`. + +You can no longer create fields with dots in the name. + +==== Type names may not start with a dot + +In 1.x, Elasticsearch would issue a warning if a type name included a dot, +e.g. `my.type`. Now that type names are no longer used to distinguish between +fields in different types, this warning has been relaxed: type names may now +contain dots, but they may not *begin* with a dot. The only exception to this +is the special `.percolator` type. 
+ +==== Types may no longer be deleted + +In 1.x it was possible to delete a type mapping, along with all of the +documents of that type, using the delete mapping API. This is no longer +supported, because remnants of the fields in the type could remain in the +index, causing corruption later on. + +Instead, if you need to delete a type mapping, you should reindex to a new +index which does not contain the mapping. If you just need to delete the +documents that belong to that type, then use the delete-by-query plugin +instead. + +[[migration-meta-fields]] +==== Type meta-fields + +The <> associated with had configuration options +removed, to make them more reliable: + +* `_id` configuration can no longer be changed. If you need to sort, use the <> field instead. +* `_type` configuration can no longer be changed. +* `_index` configuration can no longer be changed. +* `_routing` configuration is limited to marking routing as required. +* `_field_names` configuration is limited to disabling the field. +* `_size` configuration is limited to enabling the field. +* `_timestamp` configuration is limited to enabling the field, setting format and default value. +* `_boost` has been removed. +* `_analyzer` has been removed. + +Importantly, *meta-fields can no longer be specified as part of the document +body.* Instead, they must be specified in the query string parameters. For +instance, in 1.x, the `routing` could be specified as follows: + +[source,json] +----------------------------- +PUT my_index +{ + "mappings": { + "my_type": { + "_routing": { + "path": "group" <1> + }, + "properties": { + "group": { <1> + "type": "string" + } + } + } + } +} + +PUT my_index/my_type/1 <2> +{ + "group": "foo" +} +----------------------------- +<1> This 1.x mapping tells Elasticsearch to extract the `routing` value from the `group` field in the document body. +<2> This indexing request uses a `routing` value of `foo`. 
+ +In 2.0, the routing must be specified explicitly: + +[source,json] +----------------------------- +PUT my_index +{ + "mappings": { + "my_type": { + "_routing": { + "required": true <1> + }, + "properties": { + "group": { + "type": "string" + } + } + } + } +} + +PUT my_index/my_type/1?routing=bar <2> +{ + "group": "foo" +} +----------------------------- +<1> Routing can be marked as required to ensure it is not forgotten during indexing. +<2> This indexing request uses a `routing` value of `bar`. + +==== Analyzer mappings + +Previously, `index_analyzer` and `search_analyzer` could be set separately, +while the `analyzer` setting would set both. The `index_analyzer` setting has +been removed in favour of just using the `analyzer` setting. + +If just the `analyzer` is set, it will be used at index time and at search time. To use a different analyzer at search time, specify both the `analyzer` and a `search_analyzer`. + +The `index_analyzer`, `search_analyzer`, and `analyzer` type-level settings +have also been removed, as it is no longer possible to select fields based on +the type name. + +The `_analyzer` meta-field, which allowed setting an analyzer per document, has +also been removed. It will be ignored on older indices. + +==== Date fields and Unix timestamps + +Previously, `date` fields would first try to parse values as a Unix timestamp +-- milliseconds-since-the-epoch -- before trying to use their defined date +`format`. This meant that formats like `yyyyMMdd` could never work, as values +would be interpreted as timestamps. + +In 2.0, we have added two formats: `epoch_millis` and `epoch_second`. Only +date fields that use these formats will be able to parse timestamps. + +These formats cannot be used in dynamic templates, because they are +indistinguishable from long values. 
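To illustrate the date format change described above, here is a minimal sketch of a mapping that accepts both `yyyyMMdd` strings and millisecond timestamps on one field. The index, type, and field names (`my_index`, `my_type`, `created`) are placeholders for this example, and combining formats with `||` is assumed to work as it does in the default dynamic date format.

```js
PUT my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "created": {
          "type": "date",
          "format": "yyyyMMdd||epoch_millis"
        }
      }
    }
  }
}
```

A field mapped with only `yyyyMMdd` would no longer interpret a numeric value as a timestamp, so `epoch_millis` has to be listed explicitly when timestamp input is expected.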
+ +==== Default date format + +The default date format has changed from `date_optional_time` to +`strict_date_optional_time`, which expects a 4 digit year, and a 2 digit month +and day, (and optionally, 2 digit hour, minute, and second). + +A dynamically added date field, by default, includes the `epoch_millis` +format to support timestamp parsing. For instance: + +[source,js] +------------------------- +PUT my_index/my_type/1 +{ + "date_one": "2015-01-01" <1> +} +------------------------- +<1> Has `format`: `"strict_date_optional_time||epoch_millis"`. + +[[migration-bool-fields]] +==== Boolean fields + +Boolean fields used to have a string fielddata with `F` meaning `false` and `T` +meaning `true`. They have been refactored to use numeric fielddata, with `0` +for `false` and `1` for `true`. As a consequence, the format of the responses of +the following APIs changed when applied to boolean fields: `0`/`1` is returned +instead of `F`/`T`: + +* <> +* <> +* <> + +In addition, terms aggregations use a custom formatter for boolean (like for +dates and ip addresses, which are also backed by numbers) in order to return +the user-friendly representation of boolean fields: `false`/`true`: + +[source,js] +--------------- +"buckets": [ + { + "key": 0, + "key_as_string": "false", + "doc_count": 42 + }, + { + "key": 1, + "key_as_string": "true", + "doc_count": 12 + } +] +--------------- + +==== `index_name` and `path` removed + +The `index_name` setting was used to change the name of the Lucene field, +and the `path` setting was used on `object` fields to determine whether the +Lucene field should use the full path (including parent object fields), or +just the final `name`. + +These setting have been removed as their purpose is better served with the +<> parameter. + +==== Murmur3 Fields + +Fields of type `murmur3` can no longer change `doc_values` or `index` setting. 
+They are always mapped as follows: + +[source,js] +--------------------- +{ + "type": "murmur3", + "index": "no", + "doc_values": true +} +--------------------- + +==== Mappings in config files not supported + +The ability to specify mappings in configuration files has been removed. To +specify default mappings that apply to multiple indexes, use +<> instead. + +Along with this change, the following settings have been removed: + +* `index.mapper.default_mapping_location` +* `index.mapper.default_percolator_mapping_location` + +==== Posting and doc-values codecs + +It is no longer possible to specify per-field postings and doc values formats +in the mappings. This setting will be ignored on indices created before 2.0 +and will cause mapping parsing to fail on indices created on or after 2.0. For +old indices, this means that new segments will be written with the default +postings and doc values formats of the current codec. + +It is still possible to change the whole codec by using the `index.codec` +setting. Note, however, that using a non-default codec is discouraged as +it could prevent future versions of Elasticsearch from being able to read the +index. + +==== Compress and compress threshold + +The `compress` and `compress_threshold` options have been removed from the +`_source` field and fields of type `binary`. These fields are compressed by +default. If you would like to increase compression levels, use the new +<> setting instead. + + + + + diff --git a/docs/reference/migration/migrate_2_0/packaging.asciidoc b/docs/reference/migration/migrate_2_0/packaging.asciidoc new file mode 100644 index 00000000000..2d2e4365fbb --- /dev/null +++ b/docs/reference/migration/migrate_2_0/packaging.asciidoc @@ -0,0 +1,58 @@ +=== Plugin and packaging changes + +==== Symbolic links and paths + +Elasticsearch 2.0 runs with the Java security manager enabled and is much more +restrictive about which paths it is allowed to access. Various paths can be +configured, e.g. 
`path.data`, `path.scripts`, `path.repo`. A configured path +may itself be a symbolic link, but no symlinks under that path will be +followed (with the exception of `path.scripts`, which does follow symlinks). + +==== Running `/bin/elasticsearch` + +The command line parameter parsing has been rewritten to deal properly with +spaces in parameters. All config settings can still be specified on the +command line when starting Elasticsearch, but they must appear after the +built-in "static parameters", such as `-d` (to daemonize) and `-p` (the PID path). + +For instance: + +[source,sh] +----------- +/bin/elasticsearch -d -p /tmp/foo.pid --http.cors.enabled=true --http.cors.allow-origin='*' +----------- + +For a list of static parameters, run `/bin/elasticsearch -h` + +==== `-f` removed + +The `-f` parameter, which used to indicate that Elasticsearch should be run in +the foreground, was deprecated in 1.0 and removed in 2.0. + +==== `V` for version + +The `-v` parameter now means `--verbose` for both `bin/plugin` and +`bin/elasticsearch` (although it has no effect on the latter). To output the +version, use `-V` or `--version` instead. + +==== Plugin manager should run as root + +The permissions of the `config`, `bin`, and `plugins` directories in the RPM +and deb packages have been made more restrictive. The plugin manager should +be run as root otherwise it will not be able to install plugins. + +==== Support for official plugins + +Almost all of the official Elasticsearch plugins have been moved to the main +`elasticsearch` repository. They will be released at the same time as +Elasticsearch and have the same version number as Elasticsearch. + +Official plugins can be installed as follows: + +[source,sh] +--------------- +sudo bin/plugin install analysis-icu +--------------- + +Community-provided plugins can be installed as before. 
+ diff --git a/docs/reference/migration/migrate_2_0/parent_child.asciidoc b/docs/reference/migration/migrate_2_0/parent_child.asciidoc new file mode 100644 index 00000000000..fe198610e51 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/parent_child.asciidoc @@ -0,0 +1,43 @@ +=== Parent/Child changes + +Parent/child has been rewritten completely to reduce memory usage and to +execute `has_child` and `has_parent` queries faster and more efficiently. The +`_parent` field uses doc values by default. The refactored and improved +implementation is only active for indices created on or after version 2.0. + +In order to benefit from all the performance and memory improvements, we +recommend reindexing all existing indices that use the `_parent` field. + +==== Parent type cannot pre-exist + +A mapping type is declared as a child of another mapping type by specifying +the `_parent` meta field: + +[source,js] +-------------------------- +DELETE * + +PUT my_index +{ + "mappings": { + "my_parent": {}, + "my_child": { + "_parent": { + "type": "my_parent" <1> + } + } + } +} +-------------------------- +<1> The `my_parent` type is the parent of the `my_child` type. + +The mapping for the parent type can be added at the same time as the mapping +for the child type, but cannot be added before the child type. + +==== `top_children` query removed + +The `top_children` query has been removed in favour of the `has_child` query. +It wasn't always faster than the `has_child` query and the results were usually +inaccurate. The total hits and any aggregations in the same search request +would be incorrect if `top_children` was used.
+ diff --git a/docs/reference/migration/migrate_2_0/query_dsl.asciidoc b/docs/reference/migration/migrate_2_0/query_dsl.asciidoc new file mode 100644 index 00000000000..31283e9ce33 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/query_dsl.asciidoc @@ -0,0 +1,186 @@ +=== Query DSL changes + +==== Queries and filters merged + +Queries and filters have been merged -- all filter clauses are now query +clauses. Instead, query clauses can now be used in _query context_ or in +_filter context_: + +Query context:: + +A query used in query context will calculate relevance scores and will not be +cacheable. Query context is used whenever filter context does not apply. + +Filter context:: ++ +-- + +A query used in filter context will not calculate relevance scores, and will +be cacheable. Filter context is introduced by: + +* the `constant_score` query +* the `must_not` and (newly added) `filter` parameter in the `bool` query +* the `filter` and `filters` parameters in the `function_score` query +* any API called `filter`, such as the `post_filter` search parameter, or in + aggregations or index aliases +-- + +As a result of this change, the `execution` option of the `terms` filter is now +deprecated and ignored if provided. + +==== `or` and `and` now implemented via `bool` + +The `or` and `and` filters previously had a different execution pattern to the +`bool` filter. It used to be important to use `and`/`or` with certain filter +clauses, and `bool` with others. + +This distinction has been removed: the `bool` query is now smart enough to +handle both cases optimally. As a result of this change, the `or` and `and` +filters are now sugar syntax which are executed internally as a `bool` query. +These filters may be removed in the future. + +==== `filtered` query and `query` filter deprecated + +The `query` filter is deprecated as it is no longer needed -- all queries can +be used in query or filter context.
+ +The `filtered` query is deprecated in favour of the `bool` query. Instead of +the following: + +[source,js] +------------------------- +GET _search +{ + "query": { + "filtered": { + "query": { + "match": { + "text": "quick brown fox" + } + }, + "filter": { + "term": { + "status": "published" + } + } + } + } +} +------------------------- + +move the query and filter to the `must` and `filter` parameters in the `bool` +query: + +[source,js] +------------------------- +GET _search +{ + "query": { + "bool": { + "must": { + "match": { + "text": "quick brown fox" + } + }, + "filter": { + "term": { + "status": "published" + } + } + } + } +} +------------------------- + +==== Filter auto-caching + +It used to be possible to control which filters were cached with the `_cache` +option and to provide a custom `_cache_key`. These options are deprecated +and, if present, will be ignored. + +Query clauses used in filter context are now auto-cached when it makes sense +to do so. The algorithm takes into account the frequency of use, the cost of +query execution, and the cost of building the filter. + +The `terms` filter lookup mechanism no longer caches the values of the +document containing the terms. It relies on the filesystem cache instead. If +the lookup index is not too large, it is recommended to replicate it to all +nodes by setting `index.auto_expand_replicas: 0-all` in order to remove the +network overhead as well. + +==== Numeric queries use IDF for scoring + +Previously, term queries on numeric fields were deliberately prevented from +using the usual Lucene scoring logic and this behaviour was undocumented and, +to some, unexpected. + +Single `term` queries on numeric fields now score in the same way as string +fields, using IDF and norms (if enabled). + +To query numeric fields without scoring, the query clause should be used in +filter context, e.g. 
in the `filter` parameter of the `bool` query, or wrapped +in a `constant_score` query: + +[source,js] +---------------------------- +GET _search +{ + "query": { + "bool": { + "must": [ + { + "match": { <1> + "numeric_tag": 5 + } + } + ], + "filter": [ + { + "match": { <2> + "count": 5 + } + } + ] + } + } +} +---------------------------- +<1> This clause would include IDF in the relevance score calculation. +<2> This clause would have no effect on the relevance score. + +==== Fuzziness and fuzzy-like-this + +Fuzzy matching used to calculate the score for each fuzzy alternative, meaning +that rare misspellings would have a higher score than the more common correct +spellings. Now, fuzzy matching blends the scores of all the fuzzy alternatives +to use the IDF of the most frequently occurring alternative. + +Fuzziness can no longer be specified using a percentage, but should instead +use the number of allowed edits: + +* `0`, `1`, `2`, or +* `AUTO` (which chooses `0`, `1`, or `2` based on the length of the term) + +The `fuzzy_like_this` and `fuzzy_like_this_field` queries used a very +expensive approach to fuzzy matching and have been removed. + +==== More Like This + +The More Like This (`mlt`) API and the `more_like_this_field` (`mlt_field`) +query have been removed in favor of the +<> query. + +The parameter `percent_terms_to_match` has been removed in favor of +`minimum_should_match`. + +==== `limit` filter deprecated + +The `limit` filter is deprecated and becomes a no-op. You can achieve similar +behaviour using the <> parameter. + +==== Java plugins registering custom queries + +Java plugins that register custom queries can do so by using the +`IndicesQueriesModule#addQuery(Class)` method. Other +ways to register custom queries are not supported anymore.
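Returning to the `filtered` query deprecation covered earlier in this file: the rewrite into a `bool` query is mechanical, so client code that builds request bodies could apply it automatically. The following is a minimal sketch in Python, not part of Elasticsearch or any client library. The helper name `filtered_to_bool` is hypothetical, and it assumes the query is a plain dictionary with `filtered` at the top level; nested `filtered` clauses are not handled.

```python
def filtered_to_bool(query):
    """Rewrite {"filtered": {...}} into the equivalent {"bool": {...}}.

    Hypothetical helper for illustration only; it handles a top-level
    `filtered` clause and leaves every other query untouched.
    """
    if "filtered" not in query:
        return query

    filtered = query["filtered"]
    bool_query = {}
    # The scoring part of the old query moves to `must` (query context).
    if "query" in filtered:
        bool_query["must"] = filtered["query"]
    # The non-scoring part moves to `filter` (filter context).
    if "filter" in filtered:
        bool_query["filter"] = filtered["filter"]
    return {"bool": bool_query}


old_query = {
    "filtered": {
        "query": {"match": {"text": "quick brown fox"}},
        "filter": {"term": {"status": "published"}},
    }
}
new_query = filtered_to_bool(old_query)
```

Applied to the `filtered` example from this file, the helper produces the same `bool` structure shown there, with the `match` clause under `must` and the `term` clause under `filter`.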
+ diff --git a/docs/reference/migration/migrate_2_0/removals.asciidoc b/docs/reference/migration/migrate_2_0/removals.asciidoc new file mode 100644 index 00000000000..60f7422876c --- /dev/null +++ b/docs/reference/migration/migrate_2_0/removals.asciidoc @@ -0,0 +1,68 @@ +=== Removed features + +==== Rivers have been removed + +Elasticsearch does not support rivers anymore. While we had first planned to +keep them around to ease migration, keeping support for rivers proved to be +challenging as it conflicted with other important changes that we wanted to +bring to 2.0 like synchronous dynamic mappings updates, so we eventually +decided to remove them entirely. See +link:/blog/deprecating_rivers[Deprecating Rivers] for more background about +why we took this decision. + +==== Facets have been removed + +Facets, deprecated since 1.0, have now been removed. Instead, use the much +more powerful and flexible <> framework. +This also means that Kibana 3 will not work with Elasticsearch 2.0. + +==== Delete-by-query is now a plugin + +The old delete-by-query functionality was fast but unsafe. It could lead to +document differences between the primary and replica shards, and could even +produce out of memory exceptions and cause the cluster to crash. + +This feature has been reimplemented using the <> and +the <> API, which may be slower for queries which match +large numbers of documents, but is safe. + +Currently, a long running delete-by-query job cannot be cancelled, which is +one of the reasons that this functionality is only available as a plugin. You +can install the plugin with: + +[source,sh] +------------------ +./bin/plugin install delete-by-query +------------------ + + +==== `_shutdown` API + +The `_shutdown` API has been removed without a replacement. Nodes should be +managed via the operating system and the provided start/stop scripts. 
+ +==== `_size` is now a plugin + +The `_size` meta-data field, which indexes the size in bytes of the original +JSON document, has been moved out of core and is available as a plugin. It +can be installed as: + +[source,sh] +------------------ +./bin/plugin install mapper-size +------------------ + +==== Thrift and memcached transport + +The thrift and memcached transport plugins are no longer supported. Instead, use +either the HTTP transport (enabled by default) or the node or transport Java client. + +==== Bulk UDP + +The bulk UDP API has been removed. Instead, use the standard +<> API, or use UDP to send documents to Logstash first. + +==== MergeScheduler pluggability + +The merge scheduler is no longer pluggable. + diff --git a/docs/reference/migration/migrate_2_0/scripting.asciidoc b/docs/reference/migration/migrate_2_0/scripting.asciidoc new file mode 100644 index 00000000000..4964ee05703 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/scripting.asciidoc @@ -0,0 +1,102 @@ +=== Scripting changes + +==== Scripting syntax + +The syntax for scripts has been made consistent across all APIs. The accepted +format is as follows: + +Inline/Dynamic scripts:: ++ +-- + +[source,js] +--------------- +"script": { + "inline": "doc['foo'].value + val", <1> + "lang": "groovy", <2> + "params": { "val": 3 } <3> +} +--------------- +<1> The inline script to execute. +<2> The optional language of the script. +<3> Any named parameters. +-- + +Indexed scripts:: ++ +-- +[source,js] +--------------- +"script": { + "id": "my_script_id", <1> + "lang": "groovy", <2> + "params": { "val": 3 } <3> +} +--------------- +<1> The ID of the indexed script. +<2> The optional language of the script. +<3> Any named parameters. +-- + +File scripts:: ++ +-- +[source,js] +--------------- +"script": { + "file": "my_file", <1> + "lang": "groovy", <2> + "params": { "val": 3 } <3> +} +--------------- +<1> The filename of the script, without the `.lang` suffix. 
+<2> The optional language of the script. +<3> Any named parameters. +-- + +For example, an update request might look like this: + +[source,js] +--------------- +POST my_index/my_type/1/_update +{ + "script": { + "inline": "ctx._source.count += val", + "params": { "val": 3 } + }, + "upsert": { + "count": 0 + } +} +--------------- + +A short syntax exists for running inline scripts in the default scripting +language without any parameters: + +[source,js] +---------------- +GET _search +{ + "script_fields": { + "concat_fields": { + "script": "doc['one'].value + ' ' + doc['two'].value" + } + } +} +---------------- + +==== Scripting settings + +The `script.disable_dynamic` node setting has been replaced by fine-grained +script settings described in <>. + +==== Groovy scripts sandbox + +The Groovy sandbox and related settings have been removed. Groovy is now a +non-sandboxed scripting language, without any option to turn the sandbox on. + +==== Plugins making use of scripts + +Plugins that make use of scripts must register their own script context +through `ScriptModule`. Script contexts can be used as part of fine-grained +settings to enable/disable scripts selectively. diff --git a/docs/reference/migration/migrate_2_0/search.asciidoc b/docs/reference/migration/migrate_2_0/search.asciidoc new file mode 100644 index 00000000000..b9b5987f2e4 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/search.asciidoc @@ -0,0 +1,121 @@ +=== Search changes + +==== Partial fields + +Partial fields have been removed in favor of <>. + +==== `search_type=count` deprecated + +The `count` search type has been deprecated. All benefits from this search +type can now be achieved by using the (default) `query_then_fetch` search type +and setting `size` to `0`. + +==== The count api internally uses the search api + +The count api is now a shortcut to the search api with `size` set to 0. 
As a +result, a total failure will result in an exception being returned rather +than a normal response with `count` set to `0` and shard failures. + +==== All stored meta-fields returned by default + +Previously, meta-fields like `_routing`, `_timestamp`, etc. would only be +included in the search results if specifically requested with the `fields` +parameter. Now, all meta-fields which have stored values will be returned by +default. Additionally, they are now returned at the top level (along with +`_index`, `_type`, and `_id`) instead of in the `fields` element. + +For instance, the following request: + +[source,sh] +--------------- +GET /my_index/_search?fields=foo +--------------- + +might return: + +[source,js] +--------------- +{ + [...] + "hits": { + "total": 1, + "max_score": 1, + "hits": [ + { + "_index": "my_index", + "_type": "my_type", + "_id": "1", + "_score": 1, + "_timestamp": 10000000, <1> + "fields": { + "foo" : [ "bar" ] + } + } + ] + } +} +--------------- +<1> The `_timestamp` is returned by default, and at the top level. + + +==== Script fields + +Script fields in 1.x were only returned as a single value. Even if the return +value of a script was a list, it would be returned as an array containing an +array: + +[source,js] +--------------- +"fields": { + "my_field": [ + [ + "v1", + "v2" + ] + ] +} +--------------- + +In Elasticsearch 2.0, scripts that return a list of values are treated as +multivalued fields. The same example would return the following response, with +values in a single array. + +[source,js] +--------------- +"fields": { + "my_field": [ + "v1", + "v2" + ] +} +--------------- + +==== Timezone for date field + +Specifying the `time_zone` parameter in queries or aggregations on fields of +type `date` must now be either an ISO 8601 UTC offset, or a timezone id. For +example, the value `+1:00` must now be written as `+01:00`.
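Applications that store `time_zone` values in the old, unpadded form may want to normalize them before sending requests to 2.0. The following is a minimal sketch in Python, not part of Elasticsearch or its clients. The helper name `normalize_utc_offset` is hypothetical, and it assumes the only problem to fix is a missing leading zero in the hour; anything that does not look like an offset (for example a timezone id) is returned unchanged.

```python
import re

# Matches offsets such as "+1:00", "-10:30" or "+01:00".
_OFFSET = re.compile(r"^([+-])(\d{1,2}):(\d{2})$")


def normalize_utc_offset(value):
    """Pad the hour of a UTC offset to two digits, e.g. "+1:00" -> "+01:00".

    Hypothetical helper for illustration only.
    """
    match = _OFFSET.match(value)
    if match is None:
        # Not an offset (e.g. a timezone id); leave it alone.
        return value
    sign, hours, minutes = match.groups()
    return "{}{:02d}:{}".format(sign, int(hours), minutes)
```

Values that are already in the two-digit form pass through unchanged, so the helper can be applied to every stored value without checking first.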
+ +==== Only highlight queried fields + +The default value for the `require_field_match` option has changed from +`false` to `true`, meaning that the highlighters will, by default, only take +the fields that were queried into account. + +This means that, when querying the `_all` field, trying to highlight on any +field other than `_all` will produce no highlighted snippets. Querying the +same fields that need to be highlighted is the cleaner solution to get +highlighted snippets back. Otherwise, the `require_field_match` option can be set +to `false` to ignore field names completely when highlighting. + +The postings highlighter doesn't support the `require_field_match` option +anymore; it will only highlight fields that were queried. + +==== Postings highlighter doesn't support `match_phrase_prefix` + +The `match` query with type set to `phrase_prefix` (or the +`match_phrase_prefix` query) is not supported by the postings highlighter. No +highlighted snippets will be returned. + + + diff --git a/docs/reference/migration/migrate_2_0/settings.asciidoc b/docs/reference/migration/migrate_2_0/settings.asciidoc new file mode 100644 index 00000000000..b11fb0c0a9f --- /dev/null +++ b/docs/reference/migration/migrate_2_0/settings.asciidoc @@ -0,0 +1,125 @@ +=== Setting changes + +[[migration-script-settings]] +==== Scripting settings + +The `script.disable_dynamic` node setting has been replaced by fine-grained +script settings described in the <>. +The following setting previously used to enable dynamic or inline scripts: + +[source,yaml] +--------------- +script.disable_dynamic: false +--------------- + +It should be replaced with the following two settings in `elasticsearch.yml` that +achieve the same result: + +[source,yaml] +--------------- +script.inline: on +script.indexed: on +--------------- + +==== Units required for time and byte-sized settings + +Any settings which accept time or byte values must now be specified with +units.
For instance, it is too easy to set the `refresh_interval` to 1 +*millisecond* instead of 1 second: + +[source,js] +--------------- +PUT _settings +{ + "index.refresh_interval": 1 +} +--------------- + +In 2.0, the above request will throw an exception. Instead, the refresh +interval should be set to `"1s"` for one second. + +==== Shadow replica settings + +The `node.enable_custom_paths` setting has been removed and replaced by the +`path.shared_data` setting to allow shadow replicas with custom paths to work +with the security manager. For example, if your previous configuration had: + +[source,yaml] +------ +node.enable_custom_paths: true +------ + +And you created an index using shadow replicas with `index.data_path` set to +`/opt/data/my_index` with the following: + +[source,js] +-------------------------------------------------- +PUT /my_index +{ + "index": { + "number_of_shards": 1, + "number_of_replicas": 4, + "data_path": "/opt/data/my_index", + "shadow_replicas": true + } +} +-------------------------------------------------- + +For 2.0, you will need to set `path.shared_data` to a parent directory of the +index's data_path, so: + +[source,yaml] +----------- +path.shared_data: /opt/data +----------- + +==== Resource watcher settings renamed + +The setting names for configuring the resource watcher have been renamed +to prevent clashes with the watcher plugin: + +* `watcher.enabled` is now `resource.reload.enabled` +* `watcher.interval` is now `resource.reload.interval` +* `watcher.interval.low` is now `resource.reload.interval.low` +* `watcher.interval.medium` is now `resource.reload.interval.medium` +* `watcher.interval.high` is now `resource.reload.interval.high` + +==== Hunspell dictionary configuration + +The parameter `indices.analysis.hunspell.dictionary.location` has been +removed, and `/hunspell` is always used. + +==== CORS allowed origins + +The CORS allowed origins setting, `http.cors.allow-origin`, no longer has a default value.
Previously, the default value +was `*`, which would allow CORS requests from any origin and is considered insecure. The `http.cors.allow-origin` setting +should be specified with only the origins that should be allowed, like so: + +[source,yaml] +--------------- +http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/ +--------------- + +==== JSONP support + +JSONP callback support has now been removed. CORS should be used to access Elasticsearch +over AJAX instead: + +[source,yaml] +--------------- +http.cors.enabled: true +http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/ +--------------- + +==== In memory indices + +The `memory` / `ram` store (`index.store.type`) option was removed in +Elasticsearch. In-memory indices are no longer supported. + +==== Log messages truncated + +Log messages are now truncated at 10,000 characters. This can be changed in +the `logging.yml` configuration file with the `file.layout.conversionPattern` +setting. + +Remove mapping.date.round_ceil setting for date math parsing #8889 (issues: #8556, #8598) diff --git a/docs/reference/migration/migrate_2_0/snapshot_restore.asciidoc b/docs/reference/migration/migrate_2_0/snapshot_restore.asciidoc new file mode 100644 index 00000000000..608cd8a0797 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/snapshot_restore.asciidoc @@ -0,0 +1,37 @@ +=== Snapshot and Restore changes + +==== File-system repositories must be whitelisted + +Locations of the shared file system repositories and the URL repositories with +`file:` URLs now have to be registered before starting Elasticsearch using the +`path.repo` setting. The `path.repo` setting can contain one or more +repository locations: + +[source,yaml] +--------------- +path.repo: ["/mnt/daily", "/mnt/weekly"] +--------------- + +If the repository location is specified as an absolute path it has to start +with one of the locations specified in `path.repo`. 
If the location is +specified as a relative path, it will be resolved against the first location +specified in the `path.repo` setting. + +==== URL repositories must be whitelisted + +URL repositories with `http:`, `https:`, and `ftp:` URLs have to be +whitelisted before starting Elasticsearch with the +`repositories.url.allowed_urls` setting. This setting supports wildcards in +the place of host, path, query, and fragment. For example: + +[source,yaml] +----------------------------------- +repositories.url.allowed_urls: ["http://www.example.org/root/*", "https://*.mydomain.com/*?*#*"] +----------------------------------- + +==== Wildcard expansion + +The obsolete parameters `expand_wildcards_open` and `expand_wildcards_close` +are no longer supported by the snapshot and restore operations. These +parameters have been replaced by a single `expand_wildcards` parameter. See +<> for more. diff --git a/docs/reference/migration/migrate_2_0/stats.asciidoc b/docs/reference/migration/migrate_2_0/stats.asciidoc new file mode 100644 index 00000000000..46f3c68a1ba --- /dev/null +++ b/docs/reference/migration/migrate_2_0/stats.asciidoc @@ -0,0 +1,57 @@ +=== Stats, info, and `cat` changes + +==== Sigar removed + +We no longer ship the Sigar library for operating system dependent statistics, +as it no longer seems to be maintained. Instead, we rely on the statistics +provided by the JVM. This has resulted in a number of changes to the node +info, and node stats responses: + +* `network.*` has been removed from nodes info and nodes stats. +* `fs.*.dev` and `fs.*.disk*` have been removed from nodes stats. +* `os.*` has been removed from nodes stats, except for `os.timestamp`, + `os.load_average`, `os.mem.*`, and `os.swap.*`. +* `os.mem.total` and `os.swap.total` have been removed from nodes info. +* `process.mem.resident` and `process.mem.share` have been removed from node stats. 
+ +==== Removed `id_cache` from stats apis + +Removed `id_cache` metric from nodes stats, indices stats and cluster stats +apis. This metric has also been removed from the shards cat, indices cat and +nodes cat apis. Parent/child memory is now reported under fielddata, because +it has internally been using fielddata for a while now. + +To see just how much memory parent/child related field data is taking, the +`fielddata_fields` option can be used on the stats apis. Indices stats +example: + +[source,js] +-------------------------------------------------- +GET /_stats/fielddata?fielddata_fields=_parent +-------------------------------------------------- + +==== Percolator stats + +The total time spent running percolator queries is now called `percolate.time` +instead of `percolate.get_time`. + +==== Cluster state REST API + +The cluster state API doesn't return the `routing_nodes` section anymore when +`routing_table` is requested. The newly introduced `routing_nodes` flag can be +used separately to control whether `routing_nodes` should be returned. + +==== Index status API + +The deprecated index status API has been removed. + +==== `cat` APIs verbose by default + +The `cat` APIs now default to being verbose, which means they output column +headers by default. Verbosity can be turned off with the `v` parameter: + +[source,sh] +----------------- +GET _cat/shards?v=0 +----------------- + diff --git a/docs/reference/migration/migrate_2_0/striping.asciidoc b/docs/reference/migration/migrate_2_0/striping.asciidoc new file mode 100644 index 00000000000..7e0cc3686a5 --- /dev/null +++ b/docs/reference/migration/migrate_2_0/striping.asciidoc @@ -0,0 +1,20 @@ +=== Multiple `path.data` striping + +Previously, if the `path.data` setting listed multiple data paths, then a +shard would be ``striped'' across all paths by writing a whole file to each +path in turn (in accordance with the `index.store.distributor` setting).
The +result was that files from a single segment in a shard could be spread across +multiple disks, and the failure of any one disk could corrupt multiple shards. + +This striping is no longer supported. Instead, different shards may be +allocated to different paths, but all of the files in a single shard will be +written to the same path. + +If striping is detected while starting Elasticsearch 2.0.0 or later, *all of +the files belonging to the same shard will be migrated to the same path*. If +there is not enough disk space to complete this migration, the upgrade will be +cancelled and can only be resumed once enough disk space is made available. + +The `index.store.distributor` setting has also been removed. + +