This option defaults to false, because it is also important to upgrade
the "merely old" segments since many Lucene improvements happen within
minor releases.
But you can pass true to do the minimal work necessary to upgrade to
the next major Elasticsearch release.
The HTTP GET upgrade request now also breaks out how many bytes of
ancient segments need upgrading.
Closes#10213Closes#10540
Conflicts:
dev-tools/create_bwc_index.py
rest-api-spec/api/indices.upgrade.json
src/main/java/org/elasticsearch/action/admin/indices/optimize/OptimizeRequest.java
src/main/java/org/elasticsearch/action/admin/indices/optimize/ShardOptimizeRequest.java
src/main/java/org/elasticsearch/action/admin/indices/optimize/TransportOptimizeAction.java
src/main/java/org/elasticsearch/index/engine/InternalEngine.java
src/test/java/org/elasticsearch/bwcompat/StaticIndexBackwardCompatibilityTest.java
src/test/java/org/elasticsearch/index/engine/InternalEngineTests.java
src/test/java/org/elasticsearch/rest/action/admin/indices/upgrade/UpgradeReallyOldIndexTest.java
This adds a new feature to the Term Vectors API which allows for filtering of
terms based on their tf-idf scores. With `dfs` option on, this could be useful
for finding out a good characteric vector of a document or a set of documents.
The parameters are similar to the ones used in the MLT Query.
Closes#9561
This commit adds a `rewrite` parameter to the validate API in order to shown
how the given query is re-written into primitive queries. For example, an MLT
query is re-written into a disjunction of the selected terms. Other use cases
include `fuzzy`, `common_terms`, or `match` query especially with a
`cutoff_frequency` parameter. Note that the explanation is only given for a
single randomly chosen shard only, so the output may vary from one shard to
another.
Relates #1412Closes#10147
This is really a Collector instead of a filter. This commit deprecates the
`limit` filter, makes it a no-op and recommends to use the `terminate_after`
parameter instead that we introduced in the meantime.
* In code, we mark `River`, `AbstractRiverComponent`, `RiverComponent` and `RiverName` classes as deprecated
* We log that information when a cluster is still using it
* We add this information in the plugins list as well
Today we check every regular expression eagerly against every possible term.
This can be very slow if you have lots of unique terms, and even the bottleneck
if your query is selective.
This commit switches to Lucene regular expressions instead of Java (not exactly
the same syntax yet most existing regular expressions should keep working) and
uses the same logic as RegExpQuery to intersect the regular expression with the
terms dictionary. I wrote a quick benchmark (in the PR) to make sure it made
things faster and the same request that took 750ms on master now takes 74ms with
this change.
Close#7526
For bacwards compatibility reasons routing_nodes were previously printed out when routing_table was requested, together with the actual routing_table. Now they are printed out only when requests through `routing_nodes` flag.
Relates to #10412Closes#10486
Removed the following methods from `ScriptService`, which don't require the `ScriptContext` argument:
```
public CompiledScript compile(String lang, String script, ScriptType scriptType)
public ExecutableScript executable(String lang, String script, ScriptType scriptType, Map<String, Object> vars)
public SearchScript search(SearchLookup lookup, String lang, String script, ScriptType scriptType, @Nullable Map<String, Object> vars)
```
Also removed the ScriptContext.Standard.GENERIC_PLUGIN enum value, as it was used only for backwards compatibility.
Plugins that make use of scripts should declare their own script contexts through `ScriptModule#registerScriptContext` and use them when compiling/executing scripts.
Closes#10476
Plugins can now define multiple operations/contexts that they use scripts for. Fine-grained settings can then be used to enable/disable scripts based on each single registered context.
Also added a new generic category called `plugin`, which will be used as a default when the context is not specified. This allows us to restore backwards compatibility for plugins on `ScriptService` by restoring the old methods that don't require the script context and making them internally use the `plugin` context, as they can only be called from plugins.
Closes#10347Closes#10419
Relates to #10154 and #10150
Adds link to additional information on how document frequencies are treated across shards to the cutoff_frequency parameter documentation.
Closes#10451
We had an undocumented parameter called `numeric_resolution` which allows to
configure how to deal with dates when provided as a number. The default is to
handle them as milliseconds, but you can also opt-on for eg. seconds.
Close#10072
This pull request makes boolean handled like dates and ipv4 addresses: things
are stored as as numerics under the hood and aggregations add some special
formatting logic in order to return true/false in addition to 1/0.
For example, here is an output of a terms aggregation on a boolean field:
```
"aggregations": {
"top_f": {
"doc_count_error_upper_bound": 0,
"buckets": [
{
"key": 0,
"key_as_string": "false",
"doc_count": 2
},
{
"key": 1,
"key_as_string": "true",
"doc_count": 1
}
]
}
}
```
Sorted numeric doc values are used under the hood.
Close#4678Close#7851
Now that fine-grained script settings are supported (#10116) we can remove support for the script.disable_dynamic setting.
Same result as `script.disable_dynamic: false` can be obtained as follows:
```
script.inline: on
script.indexed: on
```
An exception is thrown at startup when the old setting is set, so we make sure we tell users they have to change it rather than ignoring the setting.
Closes#10286
This commit brings the benefits of the `count` search type to search requests
that have a `size` of 0:
- a single round-trip to shards (no fetch phase)
- ability to use the query cache
Since `count` now provides no benefits over `query_then_fetch`, it has been
deprecated.
Close#7630
Allow to on/off scripting based on their source (where they get loaded from), the operation that executes them and their language.
The settings cover the following combinations:
- mode: on, off, sandbox
- source: indexed, dynamic, file
- engine: groovy, expressions, mustache, etc
- operation: update, search, aggs, mapping
The following settings are supported for every engine:
script.engine.groovy.indexed.update: sandbox/on/off
script.engine.groovy.indexed.search: sandbox/on/off
script.engine.groovy.indexed.aggs: sandbox/on/off
script.engine.groovy.indexed.mapping: sandbox/on/off
script.engine.groovy.dynamic.update: sandbox/on/off
script.engine.groovy.dynamic.search: sandbox/on/off
script.engine.groovy.dynamic.aggs: sandbox/on/off
script.engine.groovy.dynamic.mapping: sandbox/on/off
script.engine.groovy.file.update: sandbox/on/off
script.engine.groovy.file.search: sandbox/on/off
script.engine.groovy.file.aggs: sandbox/on/off
script.engine.groovy.file.mapping: sandbox/on/off
For ease of use, the following more generic settings are supported too:
script.indexed: sandbox/on/off
script.dynamic: sandbox/on/off
script.file: sandbox/on/off
script.update: sandbox/on/off
script.search: sandbox/on/off
script.aggs: sandbox/on/off
script.mapping: sandbox/on/off
These will be used to calculate the more specific settings, using the stricter setting of each combination. Operation based settings have precedence over conflicting source based ones.
Note that the `mustache` engine is affected by generic settings applied to any language, while native scripts aren't as they are static by definition.
Also, the previous `script.disable_dynamic` setting can now be deprecated.
Closes#6418Closes#10116Closes#10274
Deleting a type from an index is inherently dangerous because
the type can be recreated with new mappings which may conflict
with existing segments still using the old mappings. This
removes the ability to delete a type (similar to how deleting
fields within a type is not allowed, for the same reason).
closes#8877closes#10231
Documentation states false as the default for "validate", "validate_lon", and "validate_lat" leading to confusion as described in issue #9539. This simple fix corrects the documentation and communicates that these fields will be deprecated and removed in upcoming versions.
closes#9539
I've been attempting to programatically verify that adding index templates via the `{path.conf}/templates/` directory works fine although I was never able to validate this via an API call to the `/_template/`. It seems that these templates do not appear in that API call, which I discovered in the following mail thread:
http://elasticsearch-users.115913.n3.nabble.com/Loading-of-index-settings-template-from-file-in-config-templates-td4024923.html#d1366317284000-912
My question is why wouldn't the `/_template/*` method return these templates? This tends to complicate things for those that want to perform automated tests to verify that they are in fact being recognized and used by Elasticsearch.
This commit changes the behaviour of the delete api when processing a delete request that refers to a type that has routing set to required in the mapping, and the routing is missing in the request. Up until now the delete api sent a broadcast delete request to all of the shards that belong to the index, making sure that the document could be found although the routing value wasn't specified. This was probably not the best choice: if the routing is set to required, an error should be thrown instead.
A `RoutingMissingException` gets now thrown instead, like it happens in the same situation with every other api (index, update, get etc.). Last but not least, this change allows to get rid of a couple of `TransportAction`s, `Request`s and `Response`s and simplify the codebase.
Closes#9123Closes#10136
Adds a setting to disable detailed error messages and full exception stack traces
in HTTP responses. When set to false, the error_trace request parameter will result
in a HTTP 400 response. When the error_trace parameter is not present, the message
of the first ElasticsearchException will be output and no nested exception messages
will be output.
This commit adds the current total number of translog operations to the recovery reporting API. We also expose the recovered / total percentage:
```
"translog": {
"recovered": 536,
"total": 986,
"percent": "54.3%",
"total_time": "2ms",
"total_time_in_millis": 2
},
```
Closes#9368Closes#10042
The behaviour is better in the case someone has multiple levels of nested object fields defined in the mapping and like to define a single inner_hits definition that is two or more levels deep.
If someone wants inner hits on a nested field that is 2 levels deep the following would need to be defined:
```
{
...
"inner_hits" : {
"path" : {
"level1" : {
"inner_hits" : {
"path" : {
"level2" : {
"query" : { .... }
}
}
}
}
}
}
}
```
With this change the above can be defined as:
```
{
...
"inner_hits" : {
"path" : {
"level1.level2" : {
"query" : { .... }
}
}
}
}
```
Closes#9251
The analysis chain should be used instead of relying on this, as it is
confusing when dealing with different per-field analysers.
The `locale` option was only used for `lowercase_expanded_terms`, which,
once removed, is no longer needed, so it was removed as well.
Fixes#9978
Relates to #9973
This commit adds scripting capability to significant_terms.
Custom heuristics can be implemented with a script that provides
parameters subset_freq, superset_freq,subset_size, superset_size.
closes#7850
Changed search_type docs to reflect that the `(dfs_)query_and_fetch` modes are an internal optimization and should not be specified explicitly by the user.
Relates to #9606
As explained in elasticsearch/elasticsearch-mapper-attachments#101, we should have consistent documentation.
The best option is to link the documentation in elasticsearch guide to the most recent README in the plugin repo.
Closes#9756
The request tracer logs in TRACE level under the `transport.tracer` log and is dynamically configurable with include and exclude arrays to filter out unneeded info. By default all requests are logged with the exception of fault detection pings (fired every second).
add the notion of tracers in the MockTransportService for testing purposes
Closes#9286
To support the `_recovery` API, the recovery process keeps track of current progress in a class called RecoveryState. This class currently have some issues, mostly around concurrency (see #6644 ). This PR cleans it up as well as other issues around it:
- Make the Index subsection API cleaner:
- remove redundant information - all calculation is done based on the underlying file map
- clearer definition of what is what: total files, vs reused files (local files that match the source) vs recovered files (copied over). % based progress is reported based on recovered files only.
- cleaned up json response to match other API (sadly this breaks the structure). We now properly report human values for dates and other units.
- Add more robust unit testing
- Detail flag was passed along as state (it's now a ToXContent param)
- State lookup during reporting is now always done via the IndexShard , no more fall backs to many other classes.
- Cleanup APIs around time and move the little computations to the state class as opposed to doing them out of the API
I also improved error messages out of the REST testing infra for things I run into.
Closes#6644Closes#9811
Together with #8782 it should help in the situations simliar to #8887 by adding an ability to get information about currently running snapshot without accessing the repository itself.
Closes#8887
The number of current pending tasks is useful to detect and overloaded master. This commit adds it to the cluster health API. The complete list can be retrieved from the dedicated pending tasks API.
It also adds rest tests for the cluster health variants.
Closes#9877