[[breaking-changes-3.0]] == Breaking changes in 3.0 This section discusses the changes that you need to be aware of when migrating your application to Elasticsearch 3.0. * <> * <> * <> * <> * <> * <> * <> * <> * <> * <> * <> * <> * <> [[breaking_30_search_changes]] === Warmers Thanks to several changes like doc values by default or disk-based norms, warmers have become quite useless. As a consequence, warmers and the warmer API have been removed: it is not possible anymore to register queries that will run before a new IndexSearcher is published. Don't worry if you have warmers defined on your indices, they will simply be ignored when upgrading to 3.0. === Search changes ==== `search_type=count` removed The `count` search type was deprecated since version 2.0.0 and is now removed. In order to get the same benefits, you just need to set the value of the `size` parameter to `0`. For instance, the following request: [source,sh] --------------- GET /my_index/_search?search_type=count { "aggs": { "my_terms": { "terms": { "field": "foo" } } } } --------------- can be replaced with: [source,sh] --------------- GET /my_index/_search { "size": 0, "aggs": { "my_terms": { "terms": { "field": "foo" } } } } --------------- ==== `search_type=scan` removed The `scan` search type was deprecated since version 2.1.0 and is now removed. All benefits from this search type can now be achieved by doing a scroll request that sorts documents in `_doc` order, for instance: [source,sh] --------------- GET /my_index/_search?scroll=2m { "sort": [ "_doc" ] } --------------- Scroll requests sorted by `_doc` have been optimized to more efficiently resume from where the previous request stopped, so this will have the same performance characteristics as the former `scan` search type. [[breaking_30_rest_api_changes]] === REST API changes ==== search exists api removed The search exists api has been removed in favour of using the search api with `size` set to `0` and `terminate_after` set to `1`. ==== `/_optimize` endpoint removed The deprecated `/_optimize` endpoint has been removed. The `/_forcemerge` endpoint should be used in lieu of optimize. The `GET` HTTP verb for `/_forcemerge` is no longer supported, please use the `POST` HTTP verb. ==== Deprecated queries removed The following deprecated queries have been removed: * `filtered`: use `bool` query instead, which supports `filter` clauses too * `and`: use `must` clauses in a `bool` query instead * `or`: use should clauses in a `bool` query instead * `limit`: use `terminate_after` parameter instead * `fquery`: obsolete after filters and queries have been merged * `query`: obsolete after filters and queries have been merged ==== Unified fuzziness parameter * Removed support for the deprecated `min_similarity` parameter in `fuzzy query`, in favour of `similarity`. * Removed support for the deprecated `fuzzy_min_sim` parameter in `query_string` query, in favour of `similarity`. * Removed support for the deprecated `edit_distance` parameter in completion suggester, in favour of `similarity`. ==== indices query Removed support for the deprecated `filter` and `no_match_filter` fields in `indices` query, in favour of `query` and `no_match_query`. ==== nested query Removed support for the deprecated `filter` fields in `nested` query, in favour of `query`. ==== terms query Removed support for the deprecated `minimum_should_match` and `disable_coord` in `terms` query, use `bool` query instead. Removed also support for the deprecated `execution` parameter. ==== function_score query Removed support for the top level `filter` element in `function_score` query, replaced by `query`. ==== highlighters Removed support for multiple highlighter names, the only supported ones are: `plain`, `fvh` and `postings`. ==== top level filter Removed support for the deprecated top level `filter` in the search api, replaced by `post_filter`. ==== `query_binary` and `filter_binary` removed Removed support for the undocumented `query_binary` and `filter_binary` sections of a search request. ==== `span_near`'s' `collect_payloads` deprecated Payloads are now loaded when needed. [[breaking_30_parent_child_changes]] === Parent/Child changes The `children` aggregation, parent child inner hits and `has_child` and `has_parent` queries will not work on indices with `_parent` field mapping created before version `2.0.0`. The data of these indices need to be re-indexed into a new index. The format of the join between parent and child documents have changed with the `2.0.0` release. The old format can't read from version `3.0.0` and onwards. The new format allows for a much more efficient and scalable join between parent and child documents and the join data structures are stored on on disk data structures as opposed as before the join data structures were stored in the jvm heap space. ==== `score_type` has been removed The `score_type` option has been removed from the `has_child` and `has_parent` queries in favour of the `score_mode` option which does the exact same thing. ==== `sum` score mode removed The `sum` score mode has been removed in favour of the `total` mode which does the same and is already available in previous versions. ==== `max_children` option When `max_children` was set to `0` on the `has_child` query then there was no upper limit on how many children documents are allowed to match. This has changed and `0` now really means to zero child documents are allowed. If no upper limit is needed then the `max_children` option shouldn't be defined at all on the `has_child` query. ==== `_parent` field no longer indexed The join between parent and child documents no longer relies on indexed fields and therefor from `3.0.0` onwards the `_parent` indexed field won't be indexed. In order to find documents that referrer to a specific parent id the new `parent_id` query can be used. The get response and hits inside the search response remain to include the parent id under the `_parent` key. [[breaking_30_settings_changes]] === Settings changes ==== Analysis settings The `index.analysis.analyzer.default_index` analyzer is not supported anymore. If you wish to change the analyzer to use for indexing, change the `index.analysis.analyzer.default` analyzer instead. ==== Ping timeout settings Previously, there were three settings for the ping timeout: `discovery.zen.initial_ping_timeout`, `discovery.zen.ping.timeout` and `discovery.zen.ping_timeout`. The former two have been removed and the only setting key for the ping timeout is now `discovery.zen.ping_timeout`. The default value for ping timeouts remains at three seconds. ==== Recovery settings Recovery settings deprecated in 1.x have been removed: * `index.shard.recovery.translog_size` is superseded by `indices.recovery.translog_size` * `index.shard.recovery.translog_ops` is superseded by `indices.recovery.translog_ops` * `index.shard.recovery.file_chunk_size` is superseded by `indices.recovery.file_chunk_size` * `index.shard.recovery.concurrent_streams` is superseded by `indices.recovery.concurrent_streams` * `index.shard.recovery.concurrent_small_file_streams` is superseded by `indices.recovery.concurrent_small_file_streams` * `indices.recovery.max_size_per_sec` is superseded by `indices.recovery.max_bytes_per_sec` If you are using any of these settings please take the time and review their purpose. All of the settings above are considered _expert settings_ and should only be used if absolutely necessary. If you have set any of the above setting as persistent cluster settings please use the settings update API and set their superseded keys accordingly. The following settings have been removed without replacement * `indices.recovery.concurrent_small_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries are throttled via allocation deciders * `indices.recovery.concurrent_file_streams` - recoveries are now single threaded. The number of concurrent outgoing recoveries are throttled via allocation deciders ==== Translog settings The `index.translog.flush_threshold_ops` setting is not supported anymore. In order to control flushes based on the transaction log growth use `index.translog.flush_threshold_size` instead. Changing the translog type with `index.translog.fs.type` is not supported anymore, the `buffered` implementation is now the only available option and uses a fixed `8kb` buffer. ==== Request Cache Settings The deprecated settings `index.cache.query.enable` and `indices.cache.query.size` have been removed and are replaced with `index.requests.cache.enable` and `indices.requests.cache.size` respectively. ==== Allocation settings Allocation settings deprecated in 1.x have been removed: * `cluster.routing.allocation.concurrent_recoveries` is superseded by `cluster.routing.allocation.node_concurrent_recoveries` Please change the setting in your configuration files or in the clusterstate to use the new settings instead. ==== Similarity settings The 'default' similarity has been renamed to 'classic'. ==== Indexing settings `indices.memory.min_shard_index_buffer_size` and `indices.memory.max_shard_index_buffer_size` are removed since Elasticsearch now allows any one shard to any amount of heap as long as the total indexing buffer heap used across all shards is below the node's `indices.memory.index_buffer_size` (default: 10% of the JVM heap) [[breaking_30_mapping_changes]] === Mapping changes ==== Default doc values settings Doc values are now also on by default on numeric and boolean fields that are not indexed. ==== Transform removed The `transform` feature from mappings has been removed. It made issues very hard to debug. ==== Default number mappings When a floating-point number is encountered, it is now dynamically mapped as a float by default instead of a double. The reasoning is that floats should be more than enough for most cases but would decrease storage requirements significantly. ==== `_source`'s `format` option The `_source` mapping does not support the `format` option anymore. This option will still be accepted for indices created before the upgrade to 3.0 for backward compatibility, but it will have no effect. Indices created on or after 3.0 will reject this option. ==== Object notation Core types don't support the object notation anymore, which allowed to provide values as follows: [source,json] ----- { "value": "field_value", "boost": 42 } ---- [[breaking_30_plugins]] === Plugin changes Plugins implementing custom queries need to implement the `fromXContent(QueryParseContext)` method in their `QueryParser` subclass rather than `parse`. This method will take care of parsing the query from `XContent` format into an intermediate query representation that can be streamed between the nodes in binary format, effectively the query object used in the java api. Also, the query parser needs to implement the `getBuilderPrototype` method that returns a prototype of the `NamedWriteable` query, which allows to deserialize an incoming query by calling `readFrom(StreamInput)` against it, which will create a new object, see usages of `Writeable`. The `QueryParser` also needs to declare the generic type of the query that it supports and it's able to parse. The query object can then transform itself into a lucene query through the new `toQuery(QueryShardContext)` method, which returns a lucene query to be executed on the data node. Similarly, plugins implementing custom score functions need to implement the `fromXContent(QueryParseContext)` method in their `ScoreFunctionParser` subclass rather than `parse`. This method will take care of parsing the function from `XContent` format into an intermediate function representation that can be streamed between the nodes in binary format, effectively the function object used in the java api. Also, the query parser needs to implement the `getBuilderPrototype` method that returns a prototype of the `NamedWriteable` function, which allows to deserialize an incoming function by calling `readFrom(StreamInput)` against it, which will create a new object, see usages of `Writeable`. The `ScoreFunctionParser` also needs to declare the generic type of the function that it supports and it's able to parse. The function object can then transform itself into a lucene function through the new `toFunction(QueryShardContext)` method, which returns a lucene function to be executed on the data node. ==== Cloud AWS plugin changes Cloud AWS plugin has been split in two plugins: * {plugins}/discovery-ec2.html[Discovery EC2 plugin] * {plugins}/repository-s3.html[Repository S3 plugin] Proxy settings for both plugins have been renamed: * from `cloud.aws.proxy_host` to `cloud.aws.proxy.host` * from `cloud.aws.ec2.proxy_host` to `cloud.aws.ec2.proxy.host` * from `cloud.aws.s3.proxy_host` to `cloud.aws.s3.proxy.host` * from `cloud.aws.proxy_port` to `cloud.aws.proxy.port` * from `cloud.aws.ec2.proxy_port` to `cloud.aws.ec2.proxy.port` * from `cloud.aws.s3.proxy_port` to `cloud.aws.s3.proxy.port` ==== Cloud Azure plugin changes Cloud Azure plugin has been split in three plugins: * {plugins}/discovery-azure.html[Discovery Azure plugin] * {plugins}/repository-azure.html[Repository Azure plugin] * {plugins}/store-smb.html[Store SMB plugin] If you were using the `cloud-azure` plugin for snapshot and restore, you had in `elasticsearch.yml`: [source,yaml] ----- cloud: azure: storage: account: your_azure_storage_account key: your_azure_storage_key ----- You need to give a unique id to the storage details now as you can define multiple storage accounts: [source,yaml] ----- cloud: azure: storage: my_account: account: your_azure_storage_account key: your_azure_storage_key ----- ==== Cloud GCE plugin changes Cloud GCE plugin has been renamed to {plugins}/discovery-gce.html[Discovery GCE plugin]. [[breaking_30_java_api_changes]] === Java API changes ==== Count api has been removed The deprecated count api has been removed from the Java api, use the search api instead and set size to 0. The following call [source,java] ----- client.prepareCount(indices).setQuery(query).get(); ----- can be replaced with [source,java] ----- client.prepareSearch(indices).setSource(new SearchSourceBuilder().size(0).query(query)).get(); ----- ==== BoostingQueryBuilder Removed setters for mandatory positive/negative query. Both arguments now have to be supplied at construction time already and have to be non-null. ==== SpanContainingQueryBuilder Removed setters for mandatory big/little inner span queries. Both arguments now have to be supplied at construction time already and have to be non-null. Updated static factory methods in QueryBuilders accordingly. ==== SpanOrQueryBuilder Making sure that query contains at least one clause by making initial clause mandatory in constructor. ==== SpanNearQueryBuilder Removed setter for mandatory slop parameter, needs to be set in constructor now. Also making sure that query contains at least one clause by making initial clause mandatory in constructor. Updated the static factory methods in QueryBuilders accordingly. ==== SpanNotQueryBuilder Removed setter for mandatory include/exclude span query clause, needs to be set in constructor now. Updated the static factory methods in QueryBuilders and tests accordingly. ==== SpanWithinQueryBuilder Removed setters for mandatory big/little inner span queries. Both arguments now have to be supplied at construction time already and have to be non-null. Updated static factory methods in QueryBuilders accordingly. ==== QueryFilterBuilder Removed the setter `queryName(String queryName)` since this field is not supported in this type of query. Use `FQueryFilterBuilder.queryName(String queryName)` instead when in need to wrap a named query as a filter. ==== WrapperQueryBuilder Removed `wrapperQueryBuilder(byte[] source, int offset, int length)`. Instead simply use `wrapperQueryBuilder(byte[] source)`. Updated the static factory methods in QueryBuilders accordingly. ==== QueryStringQueryBuilder Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`. Use the `field(String, float)` method instead. ==== Operator Removed the enums called `Operator` from `MatchQueryBuilder`, `QueryStringQueryBuilder`, `SimpleQueryStringBuilder`, and `CommonTermsQueryBuilder` in favour of using the enum defined in `org.elasticsearch.index.query.Operator` in an effort to consolidate the codebase and avoid duplication. ==== queryName and boost support Support for `queryName` and `boost` has been streamlined to all of the queries. That is a breaking change till queries get sent over the network as serialized json rather than in `Streamable` format. In fact whenever additional fields are added to the json representation of the query, older nodes might throw error when they find unknown fields. ==== InnerHitsBuilder InnerHitsBuilder now has a dedicated addParentChildInnerHits and addNestedInnerHits methods to differentiate between inner hits for nested vs. parent / child documents. This change makes the type / path parameter mandatory. ==== MatchQueryBuilder Moving MatchQueryBuilder.Type and MatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.Type. Also reusing new Operator enum. ==== MoreLikeThisQueryBuilder Removed `MoreLikeThisQueryBuilder.Item#id(String id)`, `Item#doc(BytesReference doc)`, `Item#doc(XContentBuilder doc)`. Use provided constructors instead. Removed `MoreLikeThisQueryBuilder#addLike` in favor of texts and/or items being provided at construction time. Using arrays there instead of lists now. Removed `MoreLikeThisQueryBuilder#addUnlike` in favor to using the `unlike` methods which take arrays as arguments now rather than the lists used before. The deprecated `docs(Item... docs)`, `ignoreLike(Item... docs)`, `ignoreLike(String... likeText)`, `addItem(Item... likeItems)` have been removed. ==== GeoDistanceQueryBuilder Removing individual setters for lon() and lat() values, both values should be set together using point(lon, lat). ==== GeoDistanceRangeQueryBuilder Removing setters for to(Object ...) and from(Object ...) in favour of the only two allowed input arguments (String, Number). Removing setter for center point (point(), geohash()) because parameter is mandatory and should already be set in constructor. Also removing setters for lt(), lte(), gt(), gte() since they can all be replaced by equivallent calls to to/from() and inludeLower()/includeUpper(). ==== GeoPolygonQueryBuilder Require shell of polygon already to be specified in constructor instead of adding it pointwise. This enables validation, but makes it necessary to remove the addPoint() methods. ==== MultiMatchQueryBuilder Moving MultiMatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.ZeroTermsQuery. Also reusing new Operator enum. Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`. Use the `field(String, float)` method instead. ==== MissingQueryBuilder The MissingQueryBuilder which was deprecated in 2.2.0 is removed. As a replacement use ExistsQueryBuilder inside a mustNot() clause. So instead of using `new ExistsQueryBuilder(name)` now use `new BoolQueryBuilder().mustNot(new ExistsQueryBuilder(name))`. ==== NotQueryBuilder The NotQueryBuilder which was deprecated in 2.1.0 is removed. As a replacement use BoolQueryBuilder with added mustNot() clause. So instead of using `new NotQueryBuilder(filter)` now use `new BoolQueryBuilder().mustNot(filter)`. ==== TermsQueryBuilder Remove the setter for `termsLookup()`, making it only possible to either use a TermsLookup object or individual values at construction time. Also moving individual settings for the TermsLookup (lookupIndex, lookupType, lookupId, lookupPath) to the separate TermsLookup class, using constructor only and moving checks for validation there. Removed `TermsLookupQueryBuilder` in favour of `TermsQueryBuilder`. ==== FunctionScoreQueryBuilder `add` methods have been removed, all filters and functions must be provided as constructor arguments by creating an array of `FunctionScoreQueryBuilder.FilterFunctionBuilder` objects, containing one element for each filter/function pair. `scoreMode` and `boostMode` can only be provided using corresponding enum members instead of string values: see `FilterFunctionScoreQuery.ScoreMode` and `CombineFunction`. `CombineFunction.MULT` has been renamed to `MULTIPLY`. ==== IdsQueryBuilder For simplicity, only one way of adding the ids to the existing list (empty by default) is left: `addIds(String...)` ==== DocumentAlreadyExistsException removed `DocumentAlreadyExistsException` is removed and a `VersionConflictException` is thrown instead (with a better error description). This will influence code that use the `IndexRequest.opType()` or `IndexRequest.create()` to index a document only if it doesn't already exist. ==== ShapeBuilders `InternalLineStringBuilder` is removed in favour of `LineStringBuilder`, `InternalPolygonBuilder` in favour of PolygonBuilder` and `Ring` has been replaced with `LineStringBuilder`. Also the abstract base classes `BaseLineStringBuilder` and `BasePolygonBuilder` haven been merged with their corresponding implementations. ==== RescoreBuilder `RecoreBuilder.Rescorer` was merged with `RescoreBuilder`, which now is an abstract superclass. QueryRescoreBuilder currently is its only implementation. [[breaking_30_cache_concurrency]] === Cache concurrency level settings removed Two cache concurrency level settings `indices.requests.cache.concurrency_level` and `indices.fielddata.cache.concurrency_level` because they no longer apply to the cache implementation used for the request cache and the field data cache. [[breaking_30_non_loopback]] === Remove bind option of `non_loopback` This setting would arbitrarily pick the first interface not marked as loopback. Instead, specify by address scope (e.g. `_local_,_site_` for all loopback and private network addresses) or by explicit interface names, hostnames, or addresses. [[breaking_30_thread_pool]] === Forbid changing of thread pool types Previously, <> could be dynamically adjusted. The thread pool type effectively controls the backing queue for the thread pool and modifying this is an expert setting with minimal practical benefits and high risk of being misused. The ability to change the thread pool type for any thread pool has been removed; do note that it is still possible to adjust relevant thread pool parameters for each of the thread pools (e.g., depending on the thread pool type, `keep_alive`, `queue_size`, etc.). [[breaking_30_cpu_stats]] === System CPU stats The recent CPU usage (as a percent) has been added to the OS stats reported under the node stats API and the cat nodes API. The breaking change here is that there is a new object in the `os` object in the node stats response. This object is called `cpu` and includes "percent" and `load_average` as fields. This moves the `load_average` field that was previously a top-level field in the `os` object to the `cpu` object. The format of the `load_average` field has changed to an object with fields `1m`, `5m`, and `15m` representing the one-minute, five-minute and fifteen-minute loads respectively. If any of these fields are not present, it indicates that the corresponding value is not available. In the cat nodes API response, the `cpu` field is output by default. The previous `load` field has been removed and is replaced by `load_1m`, `load_5m`, and `load_15m` which represent the one-minute, five-minute and fifteen-minute loads respectively. The field will be null if the corresponding value is not available. Finally, the API for `org.elasticsearch.monitor.os.OsStats` has changed. The `getLoadAverage` method has been removed. The value for this can now be obtained from `OsStats.Cpu#getLoadAverage` but it is no longer a double and is instead an object encapsulating the one-minute, five-minute and fifteen-minute load averages. Additionally, the recent CPU usage can be obtained from `OsStats.Cpu#getPercent`. === Fields option Only stored fields are retrievable with this option. The fields option won't be able to load non stored fields from _source anymore. [[breaking_30_allocation]] === Primary shard allocation Previously, primary shards were only assigned if a quorum of shard copies were found (configurable using `index.recovery.initial_shards`, now deprecated). In case where a primary had only a single replica, quorum was defined to be a single shard. This meant that any shard copy of an index with replication factor 1 could become primary, even it was a stale copy of the data on disk. This is now fixed by using allocation IDs. Allocation IDs assign unique identifiers to shard copies. This allows the cluster to differentiate between multiple copies of the same data and track which shards have been active, so that after a cluster restart, shard copies containing only the most recent data can become primaries. === Reroute commands The reroute command `allocate` has been split into two distinct commands `allocate_replica` and `allocate_empty_primary`. This was done as we introduced a new `allocate_stale_primary` command. The new `allocate_replica` command corresponds to the old `allocate` command with `allow_primary` set to false. The new `allocate_empty_primary` command corresponds to the old `allocate` command with `allow_primary` set to true. ==== `index.shared_filesystem.recover_on_any_node` changes The behavior of `index.shared_filesystem.recover_on_any_node = true` has been changed. Previously, in the case where no shard copies could be found, an arbitrary node was chosen by potentially ignoring allocation deciders. Now, we take balancing into account but don't assign the shard if the allocation deciders are not satisfied. The behavior has also changed in the case where shard copies can be found. Previously, a node not holding the shard copy was chosen if none of the nodes holding shard copies were satisfying the allocation deciders. Now, the shard will be assigned to a node having a shard copy, even if none of the nodes holding a shard copy satisfy the allocation deciders. [[breaking_30_percolator]] === Percolator Adding percolator queries and modifications to existing percolator queries are no longer visible in immediately to the percolator. A refresh is required to run before the changes are visible to the percolator. The reason that this has changed is that on newly created indices the percolator automatically indexes the query terms and these query terms are used at percolate time to reduce the amount of queries the percolate API needs evaluate. This optimization didn't work in the percolate API mode where modifications to queries are immediately visible. The percolator by defaults sets the `size` option to `10` whereas before this was set to unlimited. The percolate api can no longer accept documents that have fields that don't exist in the mapping. When percolating an existing document then specifying a document in the source of the percolate request is not allowed any more. The percolate api no longer modifies the mappings. Before the percolate api could be used to dynamically introduce new fields to the mappings based on the fields in the document being percolated. This no longer works, because these unmapped fields are not persisted in the mapping. Percolator documents are no longer excluded from the search response. [[breaking_30_packaging]] === Packaging ==== Default logging using systemd (since Elasticsearch 2.2.0) In previous versions of Elasticsearch, the default logging configuration routed standard output to /dev/null and standard error to the journal. However, there are often critical error messages at startup that are logged to standard output rather than standard error and these error messages would be lost to the nether. The default has changed to now route standard output to the journal and standard error to inherit this setting (these are the defaults for systemd). These settings can be modified by editing the elasticsearch.service file.