[[breaking-changes-2.0]]
== Breaking changes in 2.0
This section discusses the changes that you need to be aware of when migrating
your application to Elasticsearch 2.0.
=== Networking
Elasticsearch now binds to the loopback interface by default (usually `127.0.0.1`
or `::1`). Use the `network.host` setting to change this behavior.
=== Rivers removal
Elasticsearch no longer supports rivers. We had originally planned to keep them
around to ease migration, but supporting rivers conflicted with other important
changes in 2.0, such as synchronous dynamic mapping updates, so we decided to
remove them entirely. See
https://www.elastic.co/blog/deprecating_rivers for more background about why
we are moving away from rivers.
=== Indices API
The <<alias-retrieving, get alias api>> will, by default, produce an error response
if a requested index does not exist. This change brings the defaults for this API in
line with the other Indices APIs. The <<multi-index>> options can be used on a request
to change this behavior.
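As a minimal sketch of the multi-index options in use, the request below passes
`ignore_unavailable=true` so that a missing index is skipped rather than reported
as an error. The index names `logs-2015` and `logs-2014` and the alias name
`current` are made up for illustration.

[source,js]
---------------
curl -XGET 'localhost:9200/logs-2015,logs-2014/_alias/current?ignore_unavailable=true'
---------------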
`GetIndexRequest.features()` now returns an array of Feature Enums instead of an array of String values.
The following deprecated methods have been removed:
* `GetIndexRequest.addFeatures(String[])` - Please use `GetIndexRequest.addFeatures(Feature[])` instead
* `GetIndexRequest.features(String[])` - Please use `GetIndexRequest.features(Feature[])` instead
* `GetIndexRequestBuilder.addFeatures(String[])` - Please use `GetIndexRequestBuilder.addFeatures(Feature[])` instead
* `GetIndexRequestBuilder.setFeatures(String[])` - Please use `GetIndexRequestBuilder.setFeatures(Feature[])` instead
=== Partial fields
Partial fields have been removed. They had been deprecated since 1.0.0beta1 in favor of <<search-request-source-filtering,source filtering>>.
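For illustration, a request that used partial fields in 1.x can usually be
expressed with source filtering instead. The sketch below assumes a hypothetical
index named `index` whose documents contain an `obj1` object and nested
`description` fields; the patterns are placeholders.

[source,js]
---------------
curl -XGET 'localhost:9200/index/_search'
{
  "_source": {
    "include": [ "obj1.*" ],
    "exclude": [ "*.description" ]
  },
  "query": {
    "match_all": {}
  }
}
---------------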
=== More Like This
The More Like This API and the More Like This Field query have been removed in
favor of the <<query-dsl-mlt-query, More Like This Query>>.
The parameter `percent_terms_to_match` has been removed in favor of
`minimum_should_match`.
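As a rough sketch, a query that previously set `percent_terms_to_match` to `0.3`
could be rewritten along the following lines. The index name, field names and
text are placeholders, and the `like` parameter is assumed here; check the
More Like This Query reference for the exact parameter names in your version.

[source,js]
---------------
curl -XGET 'localhost:9200/index/_search'
{
  "query": {
    "more_like_this": {
      "fields": [ "title", "description" ],
      "like": "Once upon a time",
      "min_term_freq": 1,
      "max_query_terms": 12,
      "minimum_should_match": "30%"
    }
  }
}
---------------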
=== Routing
The default hash function that is used for routing has been changed from djb2 to
murmur3. This change should be transparent unless you relied on very specific
properties of djb2. This will help ensure a better balance of the document counts
between shards.
In addition, the following node settings related to routing have been deprecated:
[horizontal]
`cluster.routing.operation.hash.type`::
This was an undocumented setting that controlled which hash function was used
for routing. `murmur3` is now enforced on new indices.
`cluster.routing.operation.use_type`::
This was an undocumented setting that controlled whether the `_type` of the
document was taken into account when computing its shard (default: `false`).
`false` is now enforced on new indices.
=== Async replication
The `replication` parameter has been removed from all CRUD operations (index,
update, delete, bulk). These operations are now synchronous
only, and a request will only return once the changes have been replicated to
all active shards in the shard group.
=== Store
The `memory` / `ram` store (`index.store.type`) option was removed in Elasticsearch 2.0.
=== Term Vectors API
Usage of `/_termvector` is deprecated in favor of `/_termvectors`.
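For example, a request for the term vectors of a single document would now look
like the following. The `twitter` index, `tweet` type and document id `1` are
placeholders.

[source,js]
---------------
curl -XGET 'localhost:9200/twitter/tweet/1/_termvectors?pretty=true'
---------------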
=== Script fields
In 1.x, script fields were always returned as a single value. Even if a script
returned a list, the response contained an array holding a single value that was
itself a list, such as:
[source,js]
---------------
"fields": {
  "my_field": [
    [
      "v1",
      "v2"
    ]
  ]
}
---------------
In Elasticsearch 2.x, scripts that return a list of values are treated as
multi-valued fields, so the same example returns the following response,
with the values in a single array:
[source,js]
---------------
"fields": {
  "my_field": [
    "v1",
    "v2"
  ]
}
---------------
=== Main API
Previously, the response to `GET /` included the HTTP status code in the JSON body
in addition to the actual HTTP status code. The `status` field has been removed from the JSON response.
=== Java API
`org.elasticsearch.index.queries.FilterBuilders` has been removed as part of the merge of
queries and filters. These filters are now available in `QueryBuilders` with the same name.
All methods that used to accept a `FilterBuilder` now accept a `QueryBuilder` instead.
In addition, some query builders have been removed or renamed:
* `commonTerms(...)` renamed to `commonTermsQuery(...)`
* `queryString(...)` renamed to `queryStringQuery(...)`
* `simpleQueryString(...)` renamed to `simpleQueryStringQuery(...)`
* `textPhrase(...)` removed
* `textPhrasePrefix(...)` removed
* `textPhrasePrefixQuery(...)` removed
* `filtered(...)` removed. Use `filteredQuery(...)` instead.
* `inQuery(...)` removed.
=== Aggregations
The `date_histogram` aggregation now returns a `Histogram` object in the response, and the `DateHistogram` class has been removed. Similarly,
the `date_range`, `ipv4_range`, and `geo_distance` aggregations all return a `Range` object in the response, and the `IPV4Range`, `DateRange`,
and `GeoDistance` classes have been removed. The motivation for this is to have a single response API for the Range and Histogram aggregations
regardless of the type of data being queried. To support this, some changes were made in the `MultiBucketsAggregation` interface, which applies
to all bucket aggregations:
* The `getKey()` method now returns `Object` instead of `String`. The actual object type returned depends on the type of aggregation requested
(e.g. the `date_histogram` will return a `DateTime` object for this method whereas a `histogram` will return a `Number`).
* A `getKeyAsString()` method has been added to return the String representation of the key.
* All other `getKeyAsX()` methods have been removed.
* The `getBucketByKey(String)` methods have been removed on all aggregations except the `filters` and `terms` aggregations.
The `histogram` and the `date_histogram` aggregation now support a simplified `offset` option that replaces the previous `pre_offset` and
`post_offset` rounding options. Instead of having to specify two separate offset shifts of the underlying buckets, the `offset` option
moves the bucket boundaries in positive or negative direction depending on its argument.
The `date_histogram` options for `pre_zone` and `post_zone` are replaced by the `time_zone` option. The behavior of `time_zone` is
equivalent to the former `pre_zone` option. Setting `time_zone` to a value like "+01:00" now leads to the bucket calculations
being applied in the specified time zone. In addition, the `pre_zone_adjust_large_interval` option has been removed because we
now always return dates and bucket keys in UTC.
Both the `histogram` and `date_histogram` aggregations now have a default `min_doc_count` of `0` instead of the previous default of `1`.
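The sketch below pulls the `offset`, `time_zone` and `min_doc_count` options
together for a `date_histogram`. It assumes a hypothetical index named `index`
with a `date` field, shifts daily buckets by six hours, computes buckets in the
`+01:00` time zone, and sets `min_doc_count` back to `1` to hide empty buckets
as in 1.x.

[source,js]
---------------
curl -XGET 'localhost:9200/index/_search'
{
  "size": 0,
  "aggs": {
    "by_day": {
      "date_histogram": {
        "field": "date",
        "interval": "day",
        "offset": "+6h",
        "time_zone": "+01:00",
        "min_doc_count": 1
      }
    }
  }
}
---------------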
`include`/`exclude` filtering on the `terms` aggregation now uses the same syntax as regexp queries instead of the Java syntax. While simple
regexps should still work, more complex ones might need some rewriting. Also, the `flags` parameter is not supported anymore.
=== Terms filter lookup caching
The terms filter lookup mechanism does not support the `cache` option anymore
and relies on the filesystem cache instead. If the lookup index is not too
large, it is recommended to replicate it to all nodes by setting
`index.auto_expand_replicas: 0-all` in order to remove the network overhead as
well.
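One way to apply this setting to an existing lookup index is through the update
settings API. The index name `users` below is a placeholder for your own lookup
index.

[source,js]
---------------
curl -XPUT 'localhost:9200/users/_settings'
{
  "index": {
    "auto_expand_replicas": "0-all"
  }
}
---------------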
=== Delete by query
The meaning of the `_shards` headers in the delete by query response has changed. Before version 2.0, the `total`,
`successful` and `failed` fields in the header were based on the number of primary shards, and failures on replica
shards were not tracked. From version 2.0, the stats in the `_shards` header are based on all shards
of an index. The HTTP status code is left unchanged and is only based on failures that occurred while executing on
primary shards.
=== Delete api with missing routing when required
The delete API now requires a routing value when deleting a document belonging to a type that has routing set to required in its
mapping. Previous Elasticsearch versions would trigger a broadcast delete on all shards belonging to the index when the routing value was missing;
a `RoutingMissingException` is now thrown instead.
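For a type whose mapping marks routing as required, a delete request therefore
needs to carry the routing value explicitly. In this sketch the index, type,
document id and routing value `user1` are all placeholders.

[source,js]
---------------
curl -XDELETE 'localhost:9200/index/type/1?routing=user1'
---------------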
=== Mappings
* The setting `index.mapping.allow_type_wrapper` has been removed. Documents should always be sent without the type as the root element.
* The delete mappings API has been removed. Mapping types can no longer be deleted.
* Mapping type names can no longer start with dots.
* The `ignore_conflicts` option of the put mappings API has been removed. Conflicts can't be ignored anymore.
* The `binary` field does not support the `compress` and `compress_threshold` options anymore.
==== Removed type prefix on field names in queries
Types can no longer be specified on fields within queries. Instead, specify type restrictions in the search request.
The following is an example query in 1.x over types `t1` and `t2`:
[source,js]
---------------
curl -XGET 'localhost:9200/index/_search'
{
  "query": {
    "bool": {
      "should": [
        {"match": { "t1.field_only_in_t1": "foo" }},
        {"match": { "t2.field_only_in_t2": "bar" }}
      ]
    }
  }
}
---------------
In 2.0, the query should look like the following:
[source,js]
---------------
curl -XGET 'localhost:9200/index/t1,t2/_search'
{
  "query": {
    "bool": {
      "should": [
        {"match": { "field_only_in_t1": "foo" }},
        {"match": { "field_only_in_t2": "bar" }}
      ]
    }
  }
}
---------------
==== Removed short name field access
Field names in queries, aggregations, etc. must now use the complete name. Use of the short name
caused ambiguities in field lookups when the same name existed within multiple object mappings.
The following example illustrates the difference between 1.x and 2.0.
Given these mappings:
[source,js]
---------------
curl -XPUT 'localhost:9200/index'
{
  "mappings": {
    "type": {
      "properties": {
        "name": {
          "type": "object",
          "properties": {
            "first": {"type": "string"},
            "last": {"type": "string"}
          }
        }
      }
    }
  }
}
---------------
The following query was possible in 1.x:
[source,js]
---------------
curl -XGET 'localhost:9200/index/type/_search'
{
  "query": {
    "match": { "first": "foo" }
  }
}
---------------
In 2.0, the same query should now be:
[source,js]
---------------
curl -XGET 'localhost:9200/index/type/_search'
{
  "query": {
    "match": { "name.first": "foo" }
  }
}
---------------
==== Removed support for `.` in field name mappings
Prior to Elasticsearch 2.0, a field could be defined to have a `.` in its name.
Mappings like the one below have been deprecated for some time and are
rejected in Elasticsearch 2.0.
[source,js]
---------------
curl -XPUT 'localhost:9200/index'
{
  "mappings": {
    "type": {
      "properties": {
        "name.first": {
          "type": "string"
        }
      }
    }
  }
}
---------------
==== Meta fields have limited configuration
Meta fields (those beginning with underscore) are fields used by elasticsearch
to provide special features. They now have limited configuration options.
* `_id` configuration can no longer be changed. If you need to sort, use `_uid` instead.
* `_type` configuration can no longer be changed.
* `_index` configuration can no longer be changed.
* `_routing` configuration is limited to requiring the field.
* `_boost` has been removed.
* `_field_names` configuration is limited to disabling the field.
* `_size` configuration is limited to enabling the field.
* `_timestamp` configuration is limited to enabling the field, setting the format, and setting the default value.
==== Meta fields in documents
Meta fields can no longer be specified within a document. They should be specified
via the API. For example, instead of adding a field `_parent` within a document,
use the `parent` URL parameter when indexing that document.
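A minimal sketch of such an index request follows. It assumes a hypothetical
`comment` type whose mapping declares a `_parent` type, and a parent document
with id `42`; the `text` field is a placeholder.

[source,js]
---------------
curl -XPUT 'localhost:9200/index/comment/1?parent=42'
{
  "text": "A child document"
}
---------------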
==== Default date format now is `strictDateOptionalTime`
The default date format is now `strictDateOptionalTime` instead of `dateOptionalTime`,
and it is stricter when parsing dates. Dates now need to have a four-digit year and
a two-digit month, day, hour, minute and second. You may therefore need to prepend a zero to
parts of a date to make it conform, or switch back to the old `dateOptionalTime` format.
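To keep the lenient behavior for a particular field, the old format can be set
explicitly in the mapping. The sketch below assumes a hypothetical date field
named `created`.

[source,js]
---------------
curl -XPUT 'localhost:9200/index'
{
  "mappings": {
    "type": {
      "properties": {
        "created": {
          "type": "date",
          "format": "dateOptionalTime"
        }
      }
    }
  }
}
---------------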
==== Date format does not support unix timestamps by default
In earlier versions of Elasticsearch, every timestamp was first parsed as
a unix timestamp. This means that even when specifying a date format like
`dateOptionalTime`, one could supply unix timestamps instead of an ISO 8601 formatted
date.
This is not supported anymore. If you want to store unix timestamps, you need to specify
the appropriate formats in the mapping, namely `epoch_second` or `epoch_millis`.
In addition, the `numeric_resolution` mapping parameter is ignored. Use the
`epoch_second` and `epoch_millis` date formats instead.
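As an illustration, a field that should accept both formatted dates and
millisecond timestamps could combine two formats with `||`. The field name
`timestamp` is hypothetical.

[source,js]
---------------
curl -XPUT 'localhost:9200/index'
{
  "mappings": {
    "type": {
      "properties": {
        "timestamp": {
          "type": "date",
          "format": "strictDateOptionalTime||epoch_millis"
        }
      }
    }
  }
}
---------------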
==== Source field limitations
The `_source` field could previously be disabled dynamically. Since this field
is a critical piece of many features like the Update API, it is no longer
possible to disable.
The options for `compress` and `compress_threshold` have also been removed.
The source field is already compressed. To minimize the storage cost,
set `index.codec: best_compression` in index settings.
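For example, the codec could be chosen when an index is created. This is a
sketch only, and the index name is a placeholder.

[source,js]
---------------
curl -XPUT 'localhost:9200/index'
{
  "settings": {
    "index.codec": "best_compression"
  }
}
---------------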
==== Boolean fields
Boolean fields used to have a string fielddata with `F` meaning `false` and `T`
meaning `true`. They have been refactored to use numeric fielddata, with `0`
for `false` and `1` for `true`. As a consequence, the format of the responses of
the following APIs changed when applied to boolean fields: `0`/`1` is returned
instead of `F`/`T`:
- <<search-request-fielddata-fields,fielddata fields>>
- <<search-request-sort,sort values>>
- <<search-aggregations-bucket-terms-aggregation,terms aggregations>>
In addition, terms aggregations use a custom formatter for boolean (like for
dates and ip addresses, which are also backed by numbers) in order to return
the user-friendly representation of boolean fields: `false`/`true`:
[source,js]
---------------
"buckets": [
  {
    "key": 0,
    "key_as_string": "false",
    "doc_count": 42
  },
  {
    "key": 1,
    "key_as_string": "true",
    "doc_count": 12
  }
]
---------------
==== Murmur3 Fields
Fields of type `murmur3` can no longer change the `doc_values` or `index` settings.
They are always stored with doc values, and not indexed.
==== Source field configuration
The `_source` field no longer supports `includes` and `excludes` parameters. When
`_source` is enabled, the entire original source will be stored.
==== Config based mappings
The ability to specify mappings in configuration files has been removed. To specify
default mappings that apply to multiple indexes, use index templates.
The following settings are no longer valid:
* `index.mapper.default_mapping_location`
* `index.mapper.default_percolator_mapping_location`
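A default mapping that used to live in a configuration file could instead be
registered as an index template. In this sketch the template name
`default_mappings`, the `*` pattern and the `_all` setting are illustrative
choices, not a recommendation.

[source,js]
---------------
curl -XPUT 'localhost:9200/_template/default_mappings'
{
  "template": "*",
  "mappings": {
    "_default_": {
      "_all": { "enabled": false }
    }
  }
}
---------------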
=== Codecs
It is no longer possible to specify per-field postings and doc values formats
in the mappings. This setting will be ignored on indices created before
elasticsearch 2.0 and will cause mapping parsing to fail on indices created on
or after 2.0. For old indices, this means that new segments will be written
with the default postings and doc values formats of the current codec.
It is still possible to change the whole codec by using the `index.codec`
setting. Note, however, that using a non-default codec is discouraged as
it could prevent future versions of Elasticsearch from being able to read the
index.
=== Scripting settings
Support for the `script.disable_dynamic` node setting has been removed and replaced by
the fine-grained script settings described in the <<enable-dynamic-scripting,scripting docs>>.
The following setting, previously used to enable dynamic scripts:
[source,yaml]
---------------
script.disable_dynamic: false
---------------
can be replaced with the following two settings in `elasticsearch.yml` that
achieve the same result:
[source,yaml]
---------------
script.inline: on
script.indexed: on
---------------
=== Script parameters
The script parameters `id`, `file`, `scriptField`, `script_id`, `script_file`,
`script`, `lang` and `params` have been deprecated. The <<modules-scripting,new script API syntax>> should be used in their place.
The deprecated script parameters have been removed from the Java API, so applications using the Java API will
need to be updated.
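As a sketch of the new syntax, a script field groups the script source, language
and parameters under a single `script` object. The field name `doubled`, the
`price` field and the `factor` parameter are made up for illustration, and
inline scripting is assumed to be enabled on the node.

[source,js]
---------------
curl -XGET 'localhost:9200/index/_search'
{
  "script_fields": {
    "doubled": {
      "script": {
        "inline": "doc['price'].value * factor",
        "lang": "groovy",
        "params": { "factor": 2 }
      }
    }
  }
}
---------------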
=== Groovy scripts sandbox
The Groovy sandbox and related settings have been removed. Groovy is now a
non-sandboxed scripting language, without any option to turn the sandbox on.
=== Plugins making use of scripts
Plugins that make use of scripts must register their own script context through
`ScriptModule`. Script contexts can be used as part of fine-grained settings to
enable/disable scripts selectively.
=== Thrift and memcached transport
The thrift and memcached transport plugins are no longer supported. Instead, use
either the HTTP transport (enabled by default) or the node or transport Java client.
=== `search_type=count` deprecation
The `count` search type has been deprecated. All benefits from this search type can
now be achieved by using the `query_then_fetch` search type (which is the
default) and setting `size` to `0`.
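For instance, an aggregation-only request that used `search_type=count` could be
sent as follows. The index name and the `status` field are placeholders.

[source,js]
---------------
curl -XGET 'localhost:9200/index/_search'
{
  "size": 0,
  "aggs": {
    "by_status": {
      "terms": { "field": "status" }
    }
  }
}
---------------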
=== The count api internally uses the search api
The count api is now a shortcut to the search api with `size` set to 0. As a
result, a total failure will result in an exception being returned rather
than a normal response with `count` set to `0` and shard failures.
=== JSONP support
JSONP callback support has now been removed. CORS should be used to access Elasticsearch
over AJAX instead:
[source,yaml]
---------------
http.cors.enabled: true
http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/
---------------
=== CORS allowed origins
The CORS allowed origins setting, `http.cors.allow-origin`, no longer has a default value. Previously, the default value
was `*`, which would allow CORS requests from any origin and is considered insecure. The `http.cors.allow-origin` setting
should be specified with only the origins that should be allowed, like so:
[source,yaml]
---------------
http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/
---------------
=== Cluster state REST api
The cluster state api doesn't return the `routing_nodes` section anymore when
`routing_table` is requested. The newly introduced `routing_nodes` flag can
be used separately to control whether `routing_nodes` should be returned.
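A request that needs both sections might look like the sketch below. It assumes
the `routing_nodes` flag is exposed as a metric name in the URL, like the other
cluster state metrics.

[source,js]
---------------
curl -XGET 'localhost:9200/_cluster/state/routing_table,routing_nodes'
---------------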
=== Query DSL
Change to ranking behaviour: single-term queries on numeric fields now score in the same way as string fields (use of IDF, norms if enabled).
Previously, term queries on numeric fields were deliberately prevented from using the usual Lucene scoring logic; this behaviour was undocumented and, to some, unexpected.
If the introduction of scoring to numeric fields is undesirable for your query clauses, the fix is simple: wrap them in a `constant_score` query or use a `filter` expression instead.
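A minimal sketch of the `constant_score` approach follows, assuming a
hypothetical numeric field named `status_code`. Every matching document receives
the same score, so IDF and norms no longer influence ranking for this clause.

[source,js]
---------------
{
  "query": {
    "constant_score": {
      "filter": {
        "term": { "status_code": 404 }
      }
    }
  }
}
---------------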
The `filtered` query is deprecated. Instead you should use a `bool` query with
a `must` clause for the query and a `filter` clause for the filter. For instance
the below query:
[source,js]
---------------
{
  "filtered": {
    "query": {
      // query
    },
    "filter": {
      // filter
    }
  }
}
---------------
can be replaced with
[source,js]
---------------
{
  "bool": {
    "must": {
      // query
    },
    "filter": {
      // filter
    }
  }
}
---------------
and will produce the same scores.
The `fuzzy_like_this` and `fuzzy_like_this_field` queries have been removed.
The `limit` filter is deprecated and becomes a no-op. You can achieve similar
behaviour using the <<search-request-body,terminate_after>> parameter.
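For example, the following request stops collecting documents on each shard
once 1000 have been seen. The index name, the `user` field and its value are
placeholders.

[source,js]
---------------
curl -XGET 'localhost:9200/index/_search'
{
  "terminate_after": 1000,
  "query": {
    "term": { "user": "kimchy" }
  }
}
---------------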
`or` and `and` on the one hand and `bool` on the other hand used to have
different performance characteristics depending on the wrapped filters. This is
fixed now and, as a consequence, the `or` and `and` filters are now deprecated in
favour of `bool`.
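As a sketch of the migration, an `or` filter over two term filters can be
expressed with `should` clauses, while an `and` filter would use `must` clauses
instead. The `tag` field and its values are made up for illustration.

[source,js]
---------------
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "should": [
            { "term": { "tag": "red" } },
            { "term": { "tag": "blue" } }
          ]
        }
      }
    }
  }
}
---------------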
The `execution` option of the `terms` filter is now deprecated and ignored if
provided.
The `_cache` and `_cache_key` parameters of filters are deprecated in the REST
layer and removed in the Java API. In case they are specified they will be
ignored. Instead filters are always used as their own cache key and elasticsearch
makes decisions by itself about whether it should cache filters based on how
often they are used.
Java plugins that register custom queries can do so by using the
`IndicesQueriesModule#addQuery(Class<? extends QueryParser>)` method. Other
ways to register custom queries are not supported anymore.
==== Query/filter merge
Elasticsearch no longer distinguishes between queries and filters in the
DSL; it detects when scores are not needed and automatically optimizes the
query to not compute scores and optionally caches the result.
As a consequence, the `query` filter serves no purpose anymore and is deprecated.
=== Timezone for date field
The `time_zone` parameter on queries or aggregations of `date` type fields
must now be either an ISO 8601 UTC offset or a timezone id. For example, the value
`+1:00` must now be `+01:00`.
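A range query using a valid offset might look like the sketch below. The
`timestamp` field and the date values are placeholders.

[source,js]
---------------
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "2015-01-01",
        "lt": "2015-02-01",
        "time_zone": "+01:00"
      }
    }
  }
}
---------------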
=== Snapshot and Restore
Locations of the shared file system repositories and the URL repositories with `file:` URLs now have to be registered
using the `path.repo` setting. The `path.repo` setting can contain one or more repository locations:
[source,yaml]
---------------
path.repo: ["/mnt/daily", "/mnt/weekly"]
---------------
If the repository location is specified as an absolute path, it has to start with one of the locations
specified in `path.repo`. If the location is specified as a relative path, it will be resolved against the first
location specified in the `path.repo` setting.
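With `/mnt/daily` listed in `path.repo`, a shared file system repository could
then be registered under it. The repository name `my_backup` and the
subdirectory are placeholders.

[source,js]
---------------
curl -XPUT 'localhost:9200/_snapshot/my_backup'
{
  "type": "fs",
  "settings": {
    "location": "/mnt/daily/my_backup"
  }
}
---------------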
URL repositories with `http:`, `https:`, and `ftp:` URLs have to be whitelisted by specifying allowed URLs in the
`repositories.url.allowed_urls` setting. This setting supports wildcards in the place of host, path, query, and
fragment. For example:
[source,yaml]
-----------------------------------
repositories.url.allowed_urls: ["http://www.example.org/root/*", "https://*.mydomain.com/*?*#*"]
-----------------------------------
The obsolete parameters `expand_wildcards_open` and `expand_wildcards_close` are no longer
supported by the snapshot and restore operations. These parameters have been replaced by
a single `expand_wildcards` parameter. See <<multi-index,the multi-index docs>> for more.
=== `_shutdown` API
The `_shutdown` API has been removed without a replacement. Nodes should be managed via the operating
system and the provided start/stop scripts.
=== Analyze API
* The Analyze API now returns `0` as the position of the first token instead of `1`.
* The `text()` method on `AnalyzeRequest` now returns `String[]` instead of `String`.
=== Multiple path.data striping
Previously, if the `path.data` setting listed multiple data paths, then a
shard would be ``striped'' across all paths by writing a whole file to each
path in turn (in accordance with the `index.store.distributor` setting). The
result was that the files from a single segment in a shard could be spread
across multiple disks, and the failure of any one disk could corrupt multiple
shards.
This striping is no longer supported. Instead, different shards may be
allocated to different paths, but all of the files in a single shard will be
written to the same path.
If striping is detected while starting Elasticsearch 2.0.0 or later, all of
the files belonging to the same shard will be migrated to the same path. If
there is not enough disk space to complete this migration, the upgrade will be
cancelled and can only be resumed once enough disk space is made available.
The `index.store.distributor` setting has also been removed.
=== Hunspell dictionary configuration
The parameter `indices.analysis.hunspell.dictionary.location` has been removed,
and `<path.conf>/hunspell` is always used.
=== Java API Transport API construction
The `TransportClient` construction code has changed and now uses the builder
pattern. Instead of using:
[source,java]
--------------------------------------------------
Settings settings = Settings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
Client client = new TransportClient(settings);
--------------------------------------------------
Use:
[source,java]
--------------------------------------------------
Settings settings = Settings.settingsBuilder()
        .put("cluster.name", "myClusterName").build();
Client client = TransportClient.builder().settings(settings).build();
--------------------------------------------------
=== Logging
Log messages are now truncated at 10,000 characters. This can be changed in the
`logging.yml` configuration file.
[float]
=== Removed `top_children` query
The `top_children` query has been removed in favour of the `has_child` query. The `top_children` query wasn't always faster
than the `has_child` query and was often inaccurate: the total hits and any aggregations in the
same search request were likely to be off if `top_children` was used.
=== Removed file based index templates
Index templates can no longer be configured on disk. Use the `_template` API instead.
[float]
=== Removed `id_cache` from stats apis
The `id_cache` metric has been removed from the nodes stats, indices stats and cluster stats APIs. This metric has also been removed
from the shards cat, indices cat and nodes cat APIs. Parent/child memory is now reported under fielddata, because it
has internally been using fielddata for a while now.

To see how much memory parent/child related field data is taking, the `fielddata_fields` option can be used on the stats
APIs. Indices stats example:
[source,js]
--------------------------------------------------
curl -XGET "http://localhost:9200/_stats/fielddata?pretty&human&fielddata_fields=_parent"
--------------------------------------------------
Parent/child has used field data for the `_parent` field since version `1.1.0`, but the memory stats for the `_parent`
field were still shown under the `id_cache` metric in the stats APIs for backwards compatibility between 1.x versions.
Before version `1.1.0`, parent/child had its own in-memory data structures for id values in the `_parent` field.
[float]
=== Removed `id_cache` from clear cache api
The `id_cache` option has been removed from the clear cache APIs. The `fielddata` option should be used to clear the `_parent` field
from fielddata.
[float]
=== Highlighting
The default value for the `require_field_match` option is now `true` rather than
`false`, meaning that the highlighters take the fields that were queried
into account by default. For instance, highlighting any field
when querying the `_all` field will produce no highlighted snippets by default,
given that the match was on the `_all` field only. Querying the same fields
that need to be highlighted is the cleaner solution to get highlighted snippets
back. Otherwise, the `require_field_match` option can be set to `false` to ignore
field names completely when highlighting.
The postings highlighter doesn't support the `require_field_match` option
anymore; it will only highlight fields that were queried.

The `match` query with type set to `match_phrase_prefix` is not supported by the
postings highlighter. No highlighted snippets will be returned.
[float]
=== Parent/child
Parent/child has been rewritten completely to reduce memory usage and to execute
`has_child` and `has_parent` queries faster and more efficiently. The `_parent` field
uses doc values by default. The refactored and improved implementation is only active
for indices created on or after version 2.0.

In order to benefit from all performance and memory improvements, we recommend reindexing all
indices that have the `_parent` field and were created before the cluster was upgraded to 2.0.

The following breaks in backwards compatibility have been made on indices with the `_parent` field
that were created on clusters with version 2.0 or later:

* The `type` option on the `_parent` field can only point to a parent type that doesn't exist yet,
so this means that an existing type/mapping can no longer become a parent type.
* The `has_child` and `has_parent` queries can no longer be used in alias filters.
=== Meta fields returned under the top-level json object
When selecting meta fields such as `_routing` or `_timestamp`, the field values
are now directly put as a top-level property of the JSON object, instead of being
put under `fields` like regular stored fields.
[source,sh]
---------------
curl -XGET 'localhost:9200/test/_search?fields=_timestamp,foo'
---------------
[source,js]
---------------
{
  [...]
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "test",
        "_id": "1",
        "_score": 1,
        "_timestamp": 10000000,
        "fields": {
          "foo" : [ "bar" ]
        }
      }
    ]
  }
}
---------------
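
Client code that has to read responses from both 1.x and 2.0 clusters can look for a meta field in both places. The following is a minimal sketch in plain Java, assuming the hit has already been parsed into nested `Map` objects by a JSON library of your choice. The class `MetaFieldReader` and its `metaField` method are hypothetical names used for illustration and are not part of the Elasticsearch API.

```java
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch: reads a meta field such as
// _timestamp or _routing from a search hit that was parsed into nested maps.
class MetaFieldReader {

    // Looks at the top level of the hit first (2.0 layout) and falls back to
    // the "fields" object (1.x layout). Returns null if the field is absent.
    static Object metaField(Map<String, Object> hit, String name) {
        if (hit.containsKey(name)) {
            return hit.get(name);
        }
        Object fields = hit.get("fields");
        if (fields instanceof Map) {
            return ((Map<?, ?>) fields).get(name);
        }
        return null;
    }
}
```

Checking the top level first means that a 2.0 response never needs the fallback, so the fallback can be dropped once no 1.x clusters remain.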
=== Settings for resource watcher have been renamed
The setting names for configuring the resource watcher have been renamed
to prevent clashes with the watcher plugin:
* `watcher.enabled` is now `resource.reload.enabled`
* `watcher.interval` is now `resource.reload.interval`
* `watcher.interval.low` is now `resource.reload.interval.low`
* `watcher.interval.medium` is now `resource.reload.interval.medium`
* `watcher.interval.high` is now `resource.reload.interval.high`
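
When migrating an existing configuration, the renames above can be applied mechanically. The following is a minimal sketch in plain Java and not part of Elasticsearch: the class name `ResourceWatcherSettingsMigrator` and its `migrate` method are hypothetical, and the sketch assumes the settings have already been loaded into a flat `Map` of key/value strings. Only the five keys listed above are renamed; every other key, including any that belongs to the watcher plugin, is copied unchanged.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Elasticsearch: renames the 1.x resource
// watcher settings to their 2.0 names in a flat key/value settings map.
class ResourceWatcherSettingsMigrator {

    // Old name -> new name, exactly as listed in this section.
    private static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("watcher.enabled", "resource.reload.enabled");
        RENAMES.put("watcher.interval", "resource.reload.interval");
        RENAMES.put("watcher.interval.low", "resource.reload.interval.low");
        RENAMES.put("watcher.interval.medium", "resource.reload.interval.medium");
        RENAMES.put("watcher.interval.high", "resource.reload.interval.high");
    }

    // Returns a new map with the renamed keys. The input map is not modified,
    // and keys that are not in RENAMES are copied as-is.
    static Map<String, String> migrate(Map<String, String> settings) {
        Map<String, String> migrated = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            String newKey = RENAMES.getOrDefault(entry.getKey(), entry.getKey());
            migrated.put(newKey, entry.getValue());
        }
        return migrated;
    }
}
```

An explicit lookup table is used instead of a prefix replacement on `watcher.` so that settings owned by the watcher plugin are never rewritten by accident.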
=== Percolator stats
The `percolate.getTime` stat (total time spent on percolating) has been renamed to `percolate.time`.
=== Plugin Manager for official plugins
Some of the official Elasticsearch plugins have been moved to the Elasticsearch repository and will be released at the
same time as Elasticsearch itself, using the same version number.
For these plugins, the plugin manager can now use a simpler form to identify an official plugin. Instead of:
[source,sh]
---------------
bin/plugin install elasticsearch/plugin_name/version
---------------
You can use:
[source,sh]
---------------
bin/plugin install plugin_name
---------------
The plugin manager will recognize this form and download the right version for your Elasticsearch
version.
For older versions of Elasticsearch, you still have to use the older form.
The official plugins that can use this new simplified form are:
* elasticsearch-analysis-icu
* elasticsearch-analysis-kuromoji
* elasticsearch-analysis-phonetic
* elasticsearch-analysis-smartcn
* elasticsearch-analysis-stempel
* elasticsearch-cloud-aws
* elasticsearch-cloud-azure
* elasticsearch-cloud-gce
* elasticsearch-delete-by-query
* elasticsearch-lang-javascript
* elasticsearch-lang-python
=== `/bin/elasticsearch` version needs `-V` parameter
Due to switching to Elasticsearch's internal command line parsing
infrastructure for the plugin manager and the Elasticsearch startup
script, the `-v` parameter now stands for `--verbose`, whereas `-V` or
`--version` can be used to show the Elasticsearch version and exit.
=== `/bin/elasticsearch` dynamic parameters must come after static ones
If you are setting configuration options like cluster name or node name via
the command line, you have to ensure that the static options like pid file
path or daemonizing always come first, like this:
```
/bin/elasticsearch -d -p /tmp/foo.pid --http.cors.enabled=true --http.cors.allow-origin='*'
```
For a list of those static parameters, run `/bin/elasticsearch -h`.
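
Wrapper scripts or tools that assemble the command line can check this ordering before launching the process. The following is a minimal sketch in plain Java, not part of Elasticsearch. The class `StartupArgumentOrder` is a hypothetical name, and the sketch assumes that a dynamic parameter is any argument of the form `--key=value`, as in the example above, while everything else counts as static. The actual parser may accept other forms, so treat the classification as an assumption to adapt.

```java
import java.util.List;

// Hypothetical helper, not part of Elasticsearch: checks the ordering rule
// described above for a list of command line arguments.
class StartupArgumentOrder {

    // ASSUMPTION: a dynamic parameter is any argument of the form --key=value
    // with a non-empty key; every other argument is treated as static.
    static boolean isDynamic(String arg) {
        return arg.startsWith("--") && arg.indexOf('=') > 2;
    }

    // Returns true if no static argument appears after a dynamic one.
    static boolean staticOptionsComeFirst(List<String> args) {
        boolean seenDynamic = false;
        for (String arg : args) {
            if (isDynamic(arg)) {
                seenDynamic = true;
            } else if (seenDynamic) {
                return false;
            }
        }
        return true;
    }
}
```

With this classification, the argument list from the example above passes the check, while the same list with `-d` moved to the end does not.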
=== Aliases
Fields used in alias filters no longer have to exist in the mapping at alias creation time. Alias filters are now
parsed at request time and the fields in filters are then resolved from the mapping, whereas before alias filters were
parsed at alias creation time and the parsed form was kept around in memory.
=== _analyze API
The `prefer_local` parameter has been removed from the `_analyze` API. The `_analyze` API is a light operation and the caller shouldn't
be concerned about whether it executes on the node that receives the request or another node.