During master election each node pings in order to discover other nodes and validate the liveness of existing nodes. Based on this information the node either discovers an existing master or, if enough nodes are found (based on `discovery.zen.minimum_master_nodes`), a new master is elected.
Currently, the node that is elected as master updates the cluster state to indicate the result of the election, and other nodes submit a join request to the newly elected master node. Instead of immediately processing the election result, the elected master
node should wait for the incoming joins from other nodes, thus validating that the election result is properly applied. As soon as enough nodes have sent their join requests (based on the `minimum_master_nodes` setting), the cluster state is modified.
Note that if `minimum_master_nodes` is not set, this change has no effect.
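The waiting rule can be sketched roughly as follows. This is only an illustration, not the actual implementation: the function name is hypothetical, and it assumes the elected master counts as one vote of its own.

```python
def should_publish_election(joined_node_ids, minimum_master_nodes=None):
    """Return True once enough distinct nodes have joined the elected master."""
    if minimum_master_nodes is None:
        # Setting not configured: the wait has no effect.
        return True
    # Assumption: the elected master counts as one vote of its own.
    votes = 1 + len(set(joined_node_ids))
    return votes >= minimum_master_nodes


pending_joins = []
for node_id in ["node-b", "node-b", "node-c"]:  # node-b retries its join
    pending_joins.append(node_id)
    print(node_id, should_publish_election(pending_joins, minimum_master_nodes=3))
```

Duplicate joins from the same node are only counted once, so a retrying node cannot satisfy the threshold on its own.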
Closes #12161
Require URLs for the URL repository to be listed in the `repositories.url.allowed_urls` setting. This change ensures that only authorized URLs can be accessed by elasticsearch.
Lucene deprecated this in 4.0 and we only support it on a best-effort basis.
Folks should only use edit distance rather than some length-based
similarity. Yet the formula is simple enough that users can
still do it in the client if they really need to.
Closes #10638
Change the default delayed allocation timeout from 0 (no delayed allocation) to 1m. The value came from a test of having a node with 50 shards being indexed into (so beefy translog requiring flush on shutdown), then shutting it down and starting it back up and waiting for it to join the cluster. This took, on a slow machine, about 30s.
The value is conservatively low and does not try to address a virtual machine / OS restart for now, in order to avoid the effect of a node going away and users being concerned that shards are not being allocated to the rest of the cluster as a result. The setting can always be changed to increase the delayed allocation timeout if needed.
Closes #12166
This PR is a simple doc patch to explicitly show, with an example,
how to create an alias using a glob pattern. This comes up from
time to time with our customers and in the community and, although
already mentioned in the documentation, is not obvious.
Also mention that the alias will not auto-update as indices matching the
glob change.
Closes #12175
Closes #12176
ignore_above is used to guard against the lucene limitation
that a term cannot exceed 32766 bytes.
However, the implementation just used the character count, which
doesn't take into account the fact that some characters have
multi-byte utf-8 encodings.
This commit updates the docs to make this relationship clear.
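The gap between character count and byte count can be seen with a small sketch. The helper names are hypothetical, and the conservative bound assumes the worst case of 4 bytes per character in UTF-8.

```python
LUCENE_MAX_TERM_BYTES = 32766  # limit quoted above


def utf8_length(value):
    """Number of bytes the value occupies once encoded as UTF-8."""
    return len(value.encode("utf-8"))


def conservative_ignore_above(max_bytes_per_char=4):
    """Character count that stays under the byte limit in the worst case."""
    return LUCENE_MAX_TERM_BYTES // max_bytes_per_char


ascii_term = "a" * 20000      # 1 byte per character
cjk_term = "\u65e5" * 20000   # 3 bytes per character

for term in (ascii_term, cjk_term):
    fits = utf8_length(term) <= LUCENE_MAX_TERM_BYTES
    print(len(term), utf8_length(term), fits)
```

Both terms have the same character count, but only the ASCII one fits under the byte limit.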
Closes #11563
Add support for retrieving fields in bulk updates
This commit adds support to retrieve fields when using the bulk update API. This functionality was previously available for the update API
but not for the bulk update API.
Closes #11527
Fixed documentation since the default rewrite method for fuzzy queries is to
select top terms, fixed usage of the fuzzy rewrite method, and removed unused
`rewrite` parameter.
Close #6932
This rewrite method is interesting because it computes scores as if all terms
had the same frequencies, which avoids disappointments with ranking when a fuzzy
query ranks typos first given that they are less frequent than the correct term.
The Plugin Manager can now use a simplified form when a user wants to install an official plugin hosted on the elasticsearch download service.
The form we use is:
```sh
bin/plugin install pluginname
```
As plugins now share the same version as elasticsearch, we can automatically guess the exact version to install from the current version of the plugin manager script.
Also, the download service will now use the `/org.elasticsearch.plugins/pluginName/pluginName-version.zip` URL path to download a plugin.
If the older form is provided (`user/plugin/version` or `user/plugin`), we will still use:
* elasticsearch download service at `/user/plugin/plugin-version.zip`
* maven central with groupId=user, artifactId=plugin and version=version
* github with user=user, repoName=plugin and tag=version
* github with user=user, repoName=plugin and branch=master if no version is set
Note that community plugin providers can use other download services by using the `--url` option.
If you try to use the new form with a non-core elasticsearch plugin, the plugin manager will reject
it and list all known core plugins.
```
Usage:
-u, --url [plugin location] : Set exact URL to download the plugin from
-i, --install [plugin name] : Downloads and installs listed plugins [*]
-t, --timeout [duration] : Timeout setting: 30s, 1m, 1h... (infinite by default)
-r, --remove [plugin name] : Removes listed plugins
-l, --list : List installed plugins
-v, --verbose : Prints verbose messages
-s, --silent : Run in silent mode
-h, --help : Prints this help message
[*] Plugin name could be:
elasticsearch-plugin-name for Elasticsearch 2.0 Core plugin (download from download.elastic.co)
elasticsearch/plugin/version for elasticsearch commercial plugins (download from download.elastic.co)
groupId/artifactId/version for community plugins (download from maven central or oss sonatype)
username/repository for site plugins (download from github master)
Elasticsearch Core plugins:
- elasticsearch-analysis-icu
- elasticsearch-analysis-kuromoji
- elasticsearch-analysis-phonetic
- elasticsearch-analysis-smartcn
- elasticsearch-analysis-stempel
- elasticsearch-cloud-aws
- elasticsearch-cloud-azure
- elasticsearch-cloud-gce
- elasticsearch-delete-by-query
- elasticsearch-lang-javascript
- elasticsearch-lang-python
```
This pipeline aggregation runs a script on each bucket in the parent aggregation to determine whether the bucket is kept in the final aggregation tree. If the script returns true the bucket is retained; if it returns false the bucket is dropped.
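As a rough illustration of the behaviour (not the actual aggregation code), the selection amounts to filtering the parent's buckets with a predicate. The function name, bucket layout and sample values below are made up for the example.

```python
def bucket_selector(buckets, script):
    """Keep only the buckets for which the script returns True."""
    return [bucket for bucket in buckets if script(bucket)]


# Illustrative buckets, as a parent histogram might produce them.
monthly = [
    {"key": "2015-01", "total_sales": 550},
    {"key": "2015-02", "total_sales": 60},
    {"key": "2015-03", "total_sales": 375},
]

kept = bucket_selector(monthly, lambda bucket: bucket["total_sales"] > 200)
print([bucket["key"] for bucket in kept])
```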
If you are using the default date or the named identifiers of dates,
the current implementation accepted a year with only one
digit. To make parsing stricter, this change requires a year to have at
least 4 digits. The same strictness applies to month, day, hour, minute and seconds.
Also, the new default is `strictDateOptionalTime` for indices created
with Elasticsearch 2.0 or newer.
In addition, a couple of previously unexposed date formats are now exposed, as they
are mentioned in the documentation.
Closes #6158
The `:ref:` link in the java-api docs points to the `current` version, which at this time is `1.6`.
This commit fixes that.
That being said, we might have to change it again once master becomes the `current` doc.
This commit reorganizes the docs to make the Java API docs look more like the REST docs.
Also, with 2.0.0, FilterBuilders don't exist anymore, only QueryBuilders.
Also, all document API docs now move to the docs/java-api/docs dir, as for the REST docs.
Remove removed queries/filters
-----
* Remove Constant Score Query with filter
* Remove Fuzzy Like This (Field) Query (flt and flt_field)
* Remove FilterBuilders
Move filters to queries
-----
* Move And Filter to And Query
* Move Bool Filter to Bool Query
* Move Exists Filter to Exists Query
* Move Geo Bounding Box Filter to Geo Bounding Box Query
* Move Geo Distance Filter to Geo Distance Query
* Move Geo Distance Range Filter to Geo Distance Range Query
* Move Geo Polygon Filter to Geo Polygon Query
* Move Geo Shape Filter to Geo Shape Query
* Move Has Child Filter to Has Child Query
* Move Has Parent Filter to Has Parent Query
* Move Ids Filter to Ids Query
* Move Limit Filter to Limit Query
* Move MatchAll Filter to MatchAll Query
* Move Missing Filter to Missing Query
* Move Nested Filter to Nested Query
* Move Not Filter to Not Query
* Move Or Filter to Or Query
* Move Range Filter to Range Query
* Move Term Filter to Term Query
* Move Terms Filter to Terms Query
* Move Type Filter to Type Query
Add missing queries
-----
* Add Common Terms Query
* Add Filtered Query
* Add Function Score Query
* Add Geohash Cell Query
* Add Regexp Query
* Add Script Query
* Add Simple Query String Query
* Add Span Containing Query
* Add Span Multi Term Query
* Add Span Within Query
Reorganize the documentation
-----
* Organize by full text queries
* Organize by term level queries
* Organize by compound queries
* Organize by joining queries
* Organize by geo queries
* Organize by specialized queries
* Organize by span queries
* Move Boosting Query
* Move DisMax Query
* Move Fuzzy Query
* Move Indices Query
* Move Match Query
* Move Mlt Query
* Move Multi Match Query
* Move Prefix Query
* Move Query String Query
* Move Span First Query
* Move Span Near Query
* Move Span Not Query
* Move Span Or Query
* Move Span Term Query
* Move Template Query
* Move Wildcard Query
Add some missing pages
----
* Add multi get API
* Add indexed-scripts link
Also closes #7826
Related to https://github.com/elastic/elasticsearch/pull/11477#issuecomment-114745934
The workaround for resolving `now` doesn't need to be used for aliases, because alias filters are parsed at search time. However, it can't be removed, because the percolator relies on it.
Parent/child can be specified in alias filters again; this works because alias filters are now parsed at search time. Parent/child will also use the late query parse workaround, to make sure the final preparations are done when the search context is around. This allows the aliases api to validate the parent/child queries without failing because there is no search context.
Closes #10485
The filters aggregation now has an option to add an 'other' bucket which will, when turned on, contain all documents which do not match any of the defined filters. There is also an option to change the name of the 'other' bucket from the default of '_other_'.
Closes #11289
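A minimal sketch of the intended semantics, counting documents in plain Python. The parameter names `other_bucket` and `other_bucket_key` are assumptions of this sketch, and the sample documents are made up.

```python
def filters_aggregation(docs, filters, other_bucket=False, other_bucket_key="_other_"):
    """Count documents per named filter, optionally collecting the rest."""
    counts = {name: 0 for name in filters}
    unmatched = 0
    for doc in docs:
        matched = False
        for name, predicate in filters.items():
            if predicate(doc):
                counts[name] += 1
                matched = True
        if not matched:
            unmatched += 1
    if other_bucket:
        # Only present when the option is turned on.
        counts[other_bucket_key] = unmatched
    return counts


docs = [{"level": "error"}, {"level": "warning"}, {"level": "info"}, {"level": "error"}]
filters = {
    "errors": lambda doc: doc["level"] == "error",
    "warnings": lambda doc: doc["level"] == "warning",
}
print(filters_aggregation(docs, filters, other_bucket=True))
```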
Field stats index constraints allow omitting all field stats for indices that don't match the constraint. An index
constraint can exclude indices' field stats based on the `min_value` and `max_value` statistic. This option is only
useful if the `level` option is set to `indices`.
For example, index constraints can be useful to find out the min and max value of a particular property of your data in
a time based scenario. The following request only returns field stats for the `answer_count` property for indices
holding questions created in the year 2014:
curl -XPOST 'http://localhost:9200/_field_stats?level=indices' -d '{
  "fields" : ["answer_count"],
  "index_constraints" : {
    "creation_date" : {
      "min_value" : {
        "gte" : "2014-01-01T00:00:00.000Z"
      },
      "max_value" : {
        "lt" : "2015-01-01T00:00:00.000Z"
      }
    }
  }
}'
Closes #11187
In order to be more consistent with what they do, the query cache has been
renamed to request cache and the filter cache has been renamed to query
cache.
A known issue is that package/logger names no longer match setting names;
please speak up if you think this is an issue.
Here are the settings for which I kept backward compatibility. Note that they
are a bit different from what was discussed on #11569 but putting `cache` before
the name of what is cached has the benefit of making these settings consistent
with the fielddata cache whose size is configured by
`indices.fielddata.cache.size`:
* index.cache.query.enable -> index.requests.cache.enable
* indices.cache.query.size -> indices.requests.cache.size
* indices.cache.filter.size -> indices.queries.cache.size
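For illustration only, the mapping above can be applied to a flat settings dictionary like this. The helper is hypothetical and is not the backward-compatibility code in this change.

```python
# Old setting name -> new setting name, taken from the list above.
RENAMED_SETTINGS = {
    "index.cache.query.enable": "index.requests.cache.enable",
    "indices.cache.query.size": "indices.requests.cache.size",
    "indices.cache.filter.size": "indices.queries.cache.size",
}


def migrate_settings(settings):
    """Return a copy of the settings with old names replaced by new ones."""
    return {RENAMED_SETTINGS.get(name, name): value for name, value in settings.items()}


old = {
    "index.cache.query.enable": True,
    "indices.cache.filter.size": "10%",
    "indices.fielddata.cache.size": "40%",
}
print(migrate_settings(old))
```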
Close #11569
Today, we disable CORS by default, but if a user simply enables CORS their instance of
elasticsearch will allow cross origin requests from anywhere, as the default value for allowed
origins is `*`.
This changes the default to be `null` so that no origins are allowed and the user must explicitly
specify the origins they wish to allow requests from. The documentation also mentions that there
is a security risk in using `*` as the value.
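The effect of the new default can be sketched as below. This is a simplified model, not the actual handler: the function name is hypothetical and it assumes a comma-separated list of exact origins.

```python
def is_origin_allowed(origin, allowed_origins=None):
    """Decide whether a cross origin request should be accepted."""
    if allowed_origins is None:
        # New default: nothing is allowed until origins are listed explicitly.
        return False
    if allowed_origins == "*":
        # Still possible, but allows requests from anywhere (security risk).
        return True
    # Assumption of this sketch: a comma-separated list of exact origins.
    return origin in {item.strip() for item in allowed_origins.split(",")}


print(is_origin_allowed("http://example.com"))
print(is_origin_allowed("http://example.com", "http://example.com, http://localhost:8080"))
```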
Closes #11169
In order to be backwards compatible, indices created before 2.x must support
indexing of a unix timestamp and its configured date format. Indices created
with 2.x must configure the `epoch_millis` date formatter in order to
support this.
Relates #10971
This adds a new pipeline aggregation, the cumulative sum aggregation. This is a parent aggregation which must be specified as a sub-aggregation to a histogram or date_histogram aggregation. It will add a new aggregation to each bucket containing the sum of a specified metric over this and all previous buckets.
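The computation itself is a running total over the ordered buckets. A small sketch, with a hypothetical helper name and made-up bucket values:

```python
from itertools import accumulate


def add_cumulative_sum(buckets, metric, name="cumulative_sum"):
    """Attach the running total of `metric` to a copy of each bucket."""
    running = accumulate(bucket[metric] for bucket in buckets)
    return [dict(bucket, **{name: total}) for bucket, total in zip(buckets, running)]


# Illustrative date_histogram buckets, in bucket order.
histogram = [
    {"key": "2015-01", "sales": 550},
    {"key": "2015-02", "sales": 60},
    {"key": "2015-03", "sales": 375},
]

for bucket in add_cumulative_sum(histogram, "sales"):
    print(bucket["key"], bucket["sales"], bucket["cumulative_sum"])
```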
This commit consolidates several abstractions on the shard level in
ordinary classes not managed by the shard level guice injector.
Several classes have been collapsed into IndexShard and IndexShardGatewayService
was cleaned up to be more lightweight and self-contained. It has also been moved into
the index.shard package and its operation is renamed from recovery from "gateway" to recovery
from "store" or "shard_store".
Closes #11847
Since there is a recommended version of the JDK, it would be helpful to provide a link to the Oracle documentation. Since there are many versions of Java, those who are new or infrequent users of Java would find the link helpful. Thanks!
Closes #11792
we currently don't expose this.
This adds the following to the OS section of `_nodes`:
```
"os": {
"name": "Mac OS X",
...
}
```
and the following to the OS section of `_cluster/stats`:
```
"os": {
...
"names": [
{
"name": "Mac OS X",
"count": 1
}
],
...
},
```
Closes #11807
This is a follow up to #8143 and #6730 for _timestamp. It removes
support for `path`, as well as any field type settings, and
enables docvalues for _timestamp, for 2.0. Users who need to
adjust these settings can use a date field.
This fixes an issue to allow for negative unix timestamps.
A dedicated printer for epochs has been added, instead of just having a parser.
Added docs that only 10/13 length unix timestamps are supported.
Added docs in upgrade documentation.
Fixes #11478
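As a rough sketch of the documented rule (not the actual parser), 10-digit values can be read as seconds and 13-digit values as milliseconds. The function name is hypothetical, and the sketch assumes a leading minus sign does not count towards the length.

```python
def parse_epoch_millis(text):
    """Parse a 10-digit (seconds) or 13-digit (millis) unix timestamp."""
    sign = -1 if text.startswith("-") else 1
    # Assumption: the minus sign is not counted as part of the length.
    digits = text[1:] if sign == -1 else text
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("not a unix timestamp: %r" % text)
    if len(digits) == 10:
        return sign * int(digits) * 1000
    if len(digits) == 13:
        return sign * int(digits)
    raise ValueError("only 10 or 13 digit unix timestamps are supported: %r" % text)


print(parse_epoch_millis("1436400000"))
print(parse_epoch_millis("-1000000000"))
```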
Replace the previous example which leveraged a range filter, which causes unnecessary confusion about when to use a range filter to create a single bucket or a range aggregation with exactly one member in ranges.
Closes #11704
I was unable to get my BulkProcessor script to work without importing the "ByteSizeUnit" and "ByteSizeValue" classes. Perhaps I overlooked something in the example and do not understand its code.
This changes the parameter name `ignore_like` to the more user friendly name
`unlike`. This latter feature generates a query from the terms in `A` but not
from the terms in `B`. This translates to a result set which is like `A` but
unlike `B`. We could have further negatively boosted any documents that have
some `B`, but these documents already do not receive any contribution from
having `B`, and would therefore negatively compete with documents having `A`.
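The term selection described above can be pictured with a toy sketch. Whitespace tokenization and the function name are assumptions of this example; the real query also applies term selection and scoring, which are left out here.

```python
def terms_like_but_unlike(like_text, unlike_text):
    """Terms that appear in `like_text` (A) but not in `unlike_text` (B)."""
    unlike_terms = set(unlike_text.lower().split())
    selected = []
    for term in like_text.lower().split():
        if term not in unlike_terms and term not in selected:
            selected.append(term)
    return selected


print(terms_like_but_unlike("quick brown fox jumps", "lazy brown dog"))
```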
Closes #11117
* QueryBuilders.queryString is now QueryBuilders.queryStringQuery
* DateHistogram.Interval is now DateHistogramInterval
* Refactoring of buckets in aggs
* FilterBuilders has been replaced by QueryBuilders
Closes #9976.
Now that doc values are the default for fielddata, specialized in-memory
formats are becoming an esoteric option. This commit removes such formats:
- `fst` on string fields,
- `compressed` on geo points.
I also removed documentation and tests that the fielddata cache is shared if
you change the format, since this is only true for in-memory fielddata formats
(given that for doc values, the caching is done directly in Lucene).
Information about in-progress snapshot and restore processes is not really metadata and should be represented as a part of the cluster state similar to discovery nodes, routing table, and cluster blocks. Since in-progress snapshot and restore information is no longer part of metadata, this refactoring also enables us to handle cluster blocks in a more consistent manner and allows creation of snapshots of a read-only cluster.
Closes #8102
Today we provide the ability to plug in a MergePolicy and
we provide the ones Lucene ships with. We do not recommend changing
the default, and only a small number of expert users would ever touch
this. This commit removes the ancient log byte size and log doc count
merge policy providers, simplifies the MergePolicy wiring and makes the
tiered MP the one and only default. All notions of a merge policy have been
removed from the docs and should be deprecated in the previous version.
Closes #11588
While we had initially planned to keep rivers around in 2.0 to ease migration,
keeping support for rivers is challenging as it conflicts with other important
changes that we want to bring to 2.0 like synchronous dynamic mappings updates.
Nothing impossible to fix, but it would increase the complexity of how we
deal with dynamic mappings updates and manage rivers, while handling dynamic
mappings updates correctly is important for resiliency and rivers are on their way out.
So removing rivers in 2.0 may well be a better trade-off.
The ResourceWatcher used settings prefixed `watcher.`, which
potentially could clash with the watcher plugin.
In order to prevent confusion, the settings have been renamed to use the
`resource.reload` prefix.
This also uses the deprecation logging infrastructure introduced
in #11033 to log deprecated settings and their alternative at
startup.
Closes #11175
There are different ways to register custom query parsers through plugins; a couple of them work per index via index settings, which is probably even too flexible. There are also three different ways to add a global custom query parser through either IndicesQueriesModule or IndicesQueriesRegistry. This commit consolidates the registration of custom query parsers via IndicesQueriesModule#addQuery(Class<? extends QueryParser>). The complexity of supporting parsers per index is not needed, hence it was removed. The other ways of registering global custom parsers are also dropped in favour of the one mentioned above.
Closes #11481
In #10918, we introduced the prompt placeholders. These had a different format
than our existing placeholders. This changes the prompt placeholders to follow the
format of the existing placeholders.
Relates to #11455
Some of our meta fields (such as _id, _version, ...) are returned as top-level
properties of the json document, while other properties (_timestamp, _routing,
...) are returned under `fields`. This commit makes all meta fields returned
as top-level properties.
So eg. `GET test/test/1?fields=_timestamp,foo` would now return
```json
{
"_index": "test",
"_type": "test",
"_id": "1",
"_version": 1,
"_timestamp": 10000000,
"found": true,
"fields": {
"foo": [ "bar" ]
}
}
```
while it used to return
```json
{
"_index": "test",
"_type": "test",
"_id": "1",
"_version": 1,
"found": true,
"fields": {
"_timestamp": 10000000,
"foo": [ "bar" ]
}
}
```
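For clients that still handle the old shape, the difference can be bridged with a small sketch like the following. The helper name and the default list of meta fields are assumptions for illustration; only `_timestamp` and `_routing` are named above.

```python
def hoist_meta_fields(response, meta_fields=("_timestamp", "_routing")):
    """Move meta fields from under `fields` to the top level of the response."""
    result = {key: value for key, value in response.items() if key != "fields"}
    fields = dict(response.get("fields", {}))
    for name in meta_fields:
        if name in fields:
            result[name] = fields.pop(name)
    if fields:
        # Regular (non-meta) fields stay under `fields`.
        result["fields"] = fields
    return result


old_response = {
    "_index": "test", "_type": "test", "_id": "1", "_version": 1, "found": True,
    "fields": {"_timestamp": 10000000, "foo": ["bar"]},
}
print(hoist_meta_fields(old_response))
```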
To better distribute the memory allocated to indexing, the IndexingMemoryController periodically checks the different shards for their last indexing activity. If no activity has happened for a while, the controller marks the shard as inactive and allocates its memory buffer budget (except for a small minimal budget) to other active shards. The recently added synced flush feature (#11179, #11336) uses this inactivity as a trigger to attempt adding a sync id marker (which will speed up future recoveries).
We wait for 30m before declaring a shard inactive. However, these days the operation just requires a refresh and is light. We can be stricter (5m) and increase the chance that a synced flush will be triggered.
Closes #11479
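The effect of the shorter threshold can be sketched as follows. This is a simplified model with hypothetical names and made-up timestamps (in seconds), not the controller's actual code.

```python
def split_by_activity(last_indexed_at, now, inactive_after_seconds=5 * 60):
    """Return (active, inactive) shard names based on last indexing time."""
    active, inactive = [], []
    for shard, timestamp in sorted(last_indexed_at.items()):
        if now - timestamp >= inactive_after_seconds:
            inactive.append(shard)
        else:
            active.append(shard)
    return active, inactive


now = 10000
last_indexed_at = {
    "shard-0": now - 10,       # indexed 10 seconds ago
    "shard-1": now - 10 * 60,  # indexed 10 minutes ago
    "shard-2": now - 40 * 60,  # indexed 40 minutes ago
}
print(split_by_activity(last_indexed_at, now, inactive_after_seconds=30 * 60))
print(split_by_activity(last_indexed_at, now, inactive_after_seconds=5 * 60))
```

With the 5m threshold, the shard that was last indexed into 10 minutes ago is already treated as inactive, so it becomes a candidate for a synced flush earlier.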