Commit Graph

4808 Commits

Author SHA1 Message Date
William Brafford 14204f8381
Use set-based interface for NodesStatsRequest (#53637) (#54141)
The NodesStatsRequest class uses a set of strings for its internal
serialization. This commit updates the class's interface so that we
no longer use hard-coded getters and setters, but rather
methods that add strings directly. For example, the old way of
adding "os" metrics to a request would be to call request.os(true).
The new way of doing this is to call request.addMetric("os").

For the time being, the canonical list of metrics is an enum in
NodesStatsRequest. This will eventually be replaced with something
pluggable.
2020-03-26 14:41:49 -04:00
Christoph Büscher da404bbce2
HLRC: Don't send defaults for SubmitAsyncSearchRequest (#54200) (#54266)
Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and
requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no
longer needed since we now only send the parameters along with the rest request
that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct
defaults are set on the client side. This change removes setting and sending
these defaults where possible, leaving only the overwrite of batchedReduceSize
with a default value of 5, since the default used in the vanilla SearchRequest
is 512. However, we don't need to send this value along as a request parameter
if its the default since the correct one will be set on the receiving end if no
value is specified.
Also adding tests for RestSubmitAsyncSearchAction that check the correct
defaults are set when parameters are missing on the server side.

Backport of #54200
2020-03-26 19:01:17 +01:00
Nik Everett b6a8de0d89
Fix compilation in Eclipse (backport of #54275) (#54284)
These mock calls cause Eclipse to think that `Exception` can be thrown
because `CheckedFunction`'s lower bound is `Exception`. This makes
Eclipse happy.
2020-03-26 12:53:13 -04:00
Lee Hinman 3e1fc8a5c9
[7.x] #32478 fixed cluster stats return 8EB (#32480) (32972311) (#54273)
Co-authored-by: Darby.Han <darby.han@navercorp.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: 한우람 <hgword@gmail.com>
Co-authored-by: Darby.Han <darby.han@navercorp.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-26 09:12:09 -06:00
Henning Andersen 5bfaa20dd4
Rollover: refactor out cluster state update (#53965) (#54269)
Make it possible to reuse the cluster state update of rollover for
simulation purposes by extracting it. Also now run the full rollover in
the pre-rollover phase and the actual rollover phase, allowing a
dedicated exception in case of concurrent rollovers as well as a more
thorough pre-check.
2020-03-26 16:06:13 +01:00
Yannick Welsch 8176f1c7f8
Delegate toXContent logic from ClusterState to its member classes (#54192)
Backport of #52743

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2020-03-26 15:08:30 +01:00
Alan Woodward 048797389a Explicitly set mappings in SearchAfterIT (#54262)
We have some very occasional failures in SearchAfterIT, where a search throws
an exception because a shard does not have the mapping for the requested sort
field. The field should have been added in a dynamic mapping update after an
index event, but it seems that there can sometimes be a small delay in propagating
this update to the shards.

This commit changes the test to explicitly define the relevant field at index creation
time.

Fixes #51900
2020-03-26 13:05:19 +00:00
Jason Tedor d8f745736b
Clarify the remove keystore command can handle many (#54244)
The remove keystore command can handle multiple settings. In a few
places, we were not consistent about mentioning this. This commit
addreses this, in the CLI help, and the docs.
2020-03-26 08:49:43 -04:00
William Brafford b11960e3e6
Create set-based interface for NodesInfoRequest (#53410) (#54223)
This commit begins the work of removing the "hard-coded" metric getters
and setters from the NodesInfoRequest classes. We start by providing new
flexible getters and setters. We then update the test classes to remove
the old getters, and then remove those getters.
2020-03-26 07:28:09 -04:00
Yannick Welsch 1ba6783780 Schedule commands in current thread context (#54187)
Changes ThreadPool's schedule method to run the schedule task in the context of the thread
that scheduled the task.

This is the more sensible default for this method, and eliminates a range of bugs where the
current thread context is mistakenly dropped.

Closes #17143
2020-03-26 10:07:59 +01:00
Jason Tedor 6af89e62d1
Allow keystore add-file to handle multiple settings (#54240)
Today the keystore add-file command can only handle adding a single
setting/file pair in a single invocation. This incurs the startup costs
of the JVM many times, which in some environments can be expensive. This
commit teaches the add-file keystore command to accept adding multiple
settings in a single invocation.
2020-03-26 00:07:05 -04:00
Jason Tedor fe8257d981
Allow keystore add to handle multiple settings (#54229)
Today the keystore add command can only handle adding a single
setting/value pair in a single invocation. This incurs the startup costs
of the JVM many times, which in some environments can be expensive. This
commit teaches the add keystore command to accept adding multiple
settings in a single invocation.
2020-03-25 22:58:20 -04:00
Nik Everett 8f40f1435a
Save a little space in agg tree (backport of #53730) (#54213)
This drop the "top level" pipeline aggregators from the aggregation
result tree which should save a little memory and a few serialization
bytes. Perhaps more imporantly, this provides a mechanism by which we
can remove *all* pipelines from the aggregation result tree. This will
save quite a bit of space when pipelines are deep in the tree.

Sadly, doing this isn't simple because of backwards compatibility. Nodes
before 7.7.0 *need* those pipelines. We provide them by setting passing
a `Supplier<PipelineTree>` into the root of the aggregation tree that we
only call if we need to serialize to a version before 7.7.0.

This solution works for cross cluster search because we always reduce
the aggregations in each remote cluster and then forward them back to
the coordinating node. Its quite possible that the coordinating node
needs the pipeline (say it is version 7.1.0) and the gateway node in the
remote cluster doesn't (version 7.7.0). In that case the data nodes
won't send the pipeline aggregations back to the gateway node.
Critically, the gateway node *will* send the pipeline aggregations back
to the coordinating node. This is all managed with that
`Supplier<PipelineTree>`, but *how* it is managed is a bit tricky.
2020-03-25 15:51:16 -04:00
Martijn van Groningen 66861a82a1
Data stream should refer to the backing indices using the Index class (#54199)
Backport of #54189
2020-03-25 20:18:15 +01:00
Bogdan Pintea 77da9dd040 Add version 7.8.0
Add version 7.8.0
2020-03-25 18:10:30 +01:00
jimczi 04fabead14 Fix small typo in SearchService#executeQueryPhase
Relates #54044
2020-03-25 16:16:30 +01:00
Armin Braun 32fa90c9ba
Fix ClusterHealthIT.testHealthOnMasterFailover (#54170) (#54177)
We can run into a state where there's no more events to wait for temporarily
but the cluster still isn't green. I added the wait for green flag to the request
so the assertion for green cluster health below doesn't fail.

Closes #53457
2020-03-25 14:33:31 +01:00
Armin Braun 0a70250201
Fix BlobStoreIncrementalityIT Assertion (#54149) (#54155)
We are using this assertion for identical shard snapshots
for situations where `snapshot1` wasn't the first snapshot
for the tested shard. Hence, we can't assume that it will
not share any files with previous snapshots.
This showed up in failing tests when `snapshot1` was equivalent
to a previous snapshot because no documents were deleted from the
repo randomly in the failing test but even if documents are deleted
there is no guarantee that no files will be shared.

=> I removed this assertion since its immaterial for what is tested
here anyway.

Closes #54034
2020-03-25 12:13:52 +01:00
jimczi e380a5a8c3 Fix off-by one error in TransportSearchActionTests
Closes #54156
2020-03-25 11:41:02 +01:00
Tanguy Leroux b6e482295b
Mute TransportSearchActionTests.testShouldPreFilterSearchShards (#54158)
Relates #53873
Relates #54156
2020-03-25 11:25:06 +01:00
Jim Ferenczi 3b4751bdb7
Avoid I/O operations when rewriting shard search request (#54044) (#54139)
This commit ensures that we rewrite the shard request with a short-lived can_match searcher.
This is required for frozen indices since the high level rewrite is now performed on a network thread where we don't want to perform I/O.

Closes #53985
2020-03-25 09:02:36 +01:00
Jason Tedor 381d7586e4
Introduce formal role for remote cluster client (#54138)
This commit introduce a formal role for identifying nodes that are
capable of making connections to remote clusters.

Relates #53924
2020-03-24 21:59:43 -04:00
Henning Andersen 7ce7aff66e Reindex negative TimeValue fix (#54057)
Reindex would use timeValueNanos(System.nanoTime()). The intended use
for TimeValue is as a duration, not as absolute time. In particular,
this could result in negative TimeValue's, being unsupported in #53913.
Modified to use the bare long nano-second value.
2020-03-24 22:29:09 +01:00
Przemko Robakowski fc498f625a
[7.x] Add validation for component templates (#54023) (#54118)
* Add validation for component templates (#54023)

* Add validation for component templates

This change adds validation to make sure that settings and mappings are correct in
component template. It's done the same way as in index templates - code is reused.

Reletes to #53101

* Fix checkstyle violation

* Update server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateService.java

Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>

* Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java

Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>

* Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java

Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>

* Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java

Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>

* Update server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateService.java

Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com>

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>

* Adjusted to 7.7

* unused import fixed

* npe fixeD

* change exception type

Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
2020-03-24 22:20:34 +01:00
Jim Ferenczi 025857d949 Fix ShardSearchRequest cache key (#54071)
This commit ensures that we don't use the non-deterministic canReturnNullResponseIfMatchNoDocs
boolean in the cache key of the ShardSearchRequest. The value of this boolean has no influence
on the cacheability of the request.

Closes #32827
2020-03-24 22:15:52 +01:00
Przemko Robakowski 5594d57727
/_cat/shards support path stats (#53461) (#54119)
* _cat/shards support path stats

* fix some style case

* fix some style case

* fix rest-api-spec cat.shards error

* fix rest-api-spec cat.shards bwc error

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: weizijun <weizijun1989@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-24 20:46:05 +01:00
Tim Brooks b21b7fb09b
Allow proxy mode server name to be updated (#54107)
Currently there is a bug where the proxy strategy will not be rebuilt if
the server_name is dynamically updated. This commit fixes this issue.
2020-03-24 11:54:23 -06:00
David Turner 21afc788f8 Reduce log level for pipeline failure (#54097)
Today we log `failed to execute pipeline for a bulk request` at `ERROR` level
if an attempt to run an ingest pipeline fails. A failure here is commonly due
to an `EsRejectedExecutionException`. We also feed such failures back to the
client and record the rejection in the threadpool statistics.

In line with #51459 there is no need to log failures within actions so noisily
and with such urgency. It is better to leave it up to the client to react
accordingly. Typically an `EsRejectedExecutionException` should result in the
client backing off and retrying, so a failure here is not normally fatal enough
to justify an `ERROR` log at all.

This commit reduces the log level for this message to `DEBUG`.
2020-03-24 17:41:35 +00:00
markharwood 6a60f85bba
Wildcard field - add normalizer support (#53851) (#54109)
Backport support for normalisation to wildcard field

Closes #53603
2020-03-24 17:37:47 +00:00
Yannick Welsch e006d1f6cf Use special XContent registry for node tool (#54050)
Fixes an issue where the elasticsearch-node command-line tools would not work correctly
because PersistentTasksCustomMetaData contains named XContent from plugins. This PR
makes it so that the parsing for all custom metadata is skipped, even if the core system would
know how to handle it.

Closes #53549
2020-03-24 17:40:51 +01:00
Nik Everett 42be39177b
Remove ceremony declaring aggs (backport of #53990) (#54099)
This removes some more ceremony when declaring agg parsers. You no
longer need a static `parse` method, instead you can just make the
`PARSER` public in most cases.

There are still a few aggs with the `parse` method, but those `parse`
methods are a little more complex to untangle.
2020-03-24 12:29:52 -04:00
Tim Brooks caefa78513
Align remote info api with new settings (#54102)
Currently the remote info api has added a number of possible fields
(proxy, num_socket_connections, etc) that are available in proxy mode.
These fields are not aligned with what the settings are named. This
commit modifies this API to align with the settings.
2020-03-24 10:27:24 -06:00
Christoph Büscher 1c1730facd
Mask wildcard query special characters on keyword queries (#53127) (#53512)
Wildcard queries on keyword fields get normalized, however this normalization
step should exclude the two special characters * and ? in order to keep the
wildcard query itself intact.

Closes #46300
2020-03-24 17:22:29 +01:00
Alan Woodward 39d7d0dc10 Upgrade to lucene 8.5.0 release (#54077)
Upgrades our lucene dependency to the released 8.5.0 version.
2020-03-24 13:45:50 +00:00
Dan Hermann 30105a5ab5
[7.x] Cluster state and CRUD operations for data streams (#54073) 2020-03-24 07:58:52 -05:00
Armin Braun 4e462db2ed
Fix BlobStoreIncrementalityIT (#54055) (#54060)
The snapshot stats response list of snapshot statuses is not ordered according to the
given list of snapshot names so randomly we could mix up snapshot1 and snapshot2
when asserting on the stats.
Fixed by getting each snapshot's stats individually.

Closes #54034
2020-03-24 11:46:40 +01:00
muachilin b33fbe7026
Deprecate alternatives to the hot threads API (#52930)
This commit deprecates various undocumented alternatives to the hot
threads API.
2020-03-23 23:24:40 -04:00
Jim Ferenczi 9e3f7f4575
Add heuristics to compute pre_filter_shard_size when unspecified (#53873) (#54007)
This commit changes the pre_filter_shard_size default from 128 to unspecified.
This allows to apply heuristics based on the request and the target indices when deciding
whether the can match phase should run or not. When unspecified, this pr runs the can match phase
automatically if one of these conditions is met:
  * The request targets more than 128 shards.
  * The request contains read-only indices.
  * The primary sort of the query targets an indexed field.
Users can opt-out from this behavior by setting the `pre_filter_shard_size` to a static value.

Closes #39835
2020-03-24 02:05:15 +01:00
Nik Everett 4734c645f1
Fix serialization bug for aggs (#54029)
I created this bug today in #53793. When a `DelayableWriteable` that
references an existing object serializes itself it wasn't taking the
version of the node on the other side of the wire into account. This
fixes that.
2020-03-23 19:00:47 -04:00
Jason Tedor 5c96a7e210
Fix compilation in RemoteClusterServiceTests
This commit fixes an issue when a JDK collection convenience method not
available in JDK 8 was backported to 7.x.
2020-03-23 18:41:17 -04:00
Jason Tedor d3cc5bff17
Give helpful message on remote connections disabled (#53690)
Today when cluster.remote.connect is set to false, and some aspect of
the codebase tries to get a remote client, today we return a no such
remote cluster exception. This can be quite perplexing to users,
especially if the remote cluster is actually defined in their cluster
state, it is only that the local node is not a remote cluter
client. This commit addresses this by providing a dedicated error
message when a remote cluster is not available because the local node is
not a remote cluster client.
2020-03-23 18:32:38 -04:00
Mark Vieira 70cfedf542
Refactor global build info plugin to leverage JavaInstallationRegistry (#54026)
This commit removes the configuration time vs execution time distinction
with regards to certain BuildParms properties. Because of the cost of
determining Java versions for configuration JDK locations we deferred
this until execution time. This had two main downsides. First, we had
to implement all this build logic in tasks, which required a bunch of
additional plumbing and complexity. Second, because some information
wasn't known during configuration time, we had to nest any build logic
that depended on this in awkward callbacks.

We now defer to the JavaInstallationRegistry recently added in Gradle.
This utility uses a much more efficient method for probing Java
installations vs our jrunscript implementation. This, combined with some
optimizations to avoid probing the current JVM as well as deferring
some evaluation via Providers when probing installations for BWC builds
we can maintain effectively the same configuration time performance
while removing a bunch of complexity and runtime cost (snapshotting
inputs for the GenerateGlobalBuildInfoTask was very expensive). The end
result should be a much more responsive build execution in almost all
scenarios.

(cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)
2020-03-23 15:30:10 -07:00
Mark Vieira be1b34c3f8
Mute BlobStoreIncrementalityIT.testIncrementalBehaviorOnPrimaryFailover 2020-03-23 15:15:30 -07:00
Nik Everett b9bfba2c8b
Move pipeline agg validation to coordinating node (backport of #53669) (#54019)
This moves the pipeline aggregation validation from the data node to the
coordinating node so that we, eventually, can stop sending pipeline
aggregations to the data nodes entirely. In fact, it moves it into the
"request validation" stage so multiple errors can be accumulated and
sent back to the requester for the entire request. We can't always take
advantage of that, but it'll be nice for folks not to have to play
whack-a-mole with validation.

This is implemented by replacing `PipelineAggretionBuilder#validate`
with:
```
protected abstract void validate(ValidationContext context);
```

The `ValidationContext` handles the accumulation of validation failures,
provides access to the aggregation's siblings, and implements a few
validation utility methods.
2020-03-23 17:22:56 -04:00
Jason Tedor bc7b995523
Use deprecation logger holder in byte size value (#53928)
If a setting is touched during bootstrap before logging is configured,
and that setting uses a byte size value, the deprecation logger for
ByteSizeValue will be initialized. However, this means a logger will be
configured before log4j is initialized, which we reject at startup. This
commit puts this deprecation logger in a holder pattern so that it is
not initialized until first use, which will happen after logging is
configured.
2020-03-23 17:06:12 -04:00
Marios Trivyzas 3a3e964956
Reduce performance impact of ExitableDirectoryReader (#53978) (#54014)
Benchmarking showed that the effect of the ExitableDirectoryReader
is reduced considerably when checking every 8191 docs. Moreover,
set the cancellable task before calling QueryPhase#preProcess()
and make sure we don't wrap with an ExitableDirectoryReader at all
when lowLevelCancellation is set to false to avoid completely any
performance impact.

Follows: #52822
Follows: #53166
Follows: #53496

(cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)
2020-03-23 21:30:34 +01:00
Nik Everett 181bc807be
Try to save memory on aggregations (backport of #53793) (#53996)
This delays deserializing the aggregation response try until *right*
before we merge the objects.
2020-03-23 15:45:22 -04:00
Dan Hermann ce31997ab2
disable check for non-snapshot builds for data streams feature flag (#54000) 2020-03-23 13:29:51 -05:00
Luca Cavanna 932a7e3112
Backport of async search changes (#53976)
* Get Async Search: omit _clusters section when empty (#53907)

The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content.

This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`.

* Async search: remove version from response (#53960)

The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available).

That said this commit clarifies this in the docs and removes the version field from the async search response

* Async Search: replicas to auto expand from 0 to 1 (#53964)

This way single node clusters that are green don't go yellow once async search is used, while
all the others still have one replica.

* [DOCS] address timing issue in async search docs tests (#53910)

The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. In the docs we want to show though a proper long-running query, and its first response should be partial rather than final.

With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response.

Closes #53887
Closes #53891
2020-03-23 19:13:31 +01:00
Ryan Ernst 960d1fb578
Revert "Introduce system index APIs for Kibana (#53035)" (#53992)
This reverts commit c610e0893d.

backport of #53912
2020-03-23 10:29:35 -07:00
Armin Braun 5b9864db2c
Better Incrementality for Snapshots of Unchanged Shards (#52182) (#53984)
Use sequence numbers and force merge UUID to determine whether a shard has changed or not instead before falling back to comparing files to get incremental snapshots on primary fail-over.
2020-03-23 16:43:41 +01:00
Tanguy Leroux 8b9d6e6dbb Increase ensureGreen() timeout in CloseWhileRelocatingShardsIT (#53981)
The test in CloseWhileRelocatingShardsIT failed recently 
multiple times (3) when waiting for initial indices to be 
become green. Looking at the execution logs from  #53544
 it appears at the very beginning of the test and when 
the WindowsFS file system is picked up (which is known 
to slow down tests).

This commit simply increases the timeout for the first 
ensureGreen() to 60 seconds. If the test continues to fail, 
we might want to test a larger timeout or disable 
WindowsFS for this test.

Closes #53544
2020-03-23 16:24:25 +01:00
Martijn van Groningen aef7b89219
Backport: initial data stream commit (#53959)
This commits adds a data stream feature flag, initial definition of a data stream and
the stubs for the data stream create, delete and get APIs. Also simple serialization
tests are added and a rest test to thest the data stream API stubs.

This is a large amount of code and mainly mechanical, but this commit should be
straightforward to review, because there isn't any real logic.

The data stream transport and rest action are behind the data stream feature flag and
are only intialized if the feature flag is enabled. The feature flag is enabled if
elasticsearch is build as snapshot or a release build and the
'es.datastreams_feature_flag_registered' is enabled.

The integ-test-zip sets the feature flag if building a release build, otherwise
rest tests would fail.

Relates to #53100
2020-03-23 12:58:09 +01:00
David Turner 0fb31d9e7a Allow static cluster.max_voting_config_exclusions (#53717)
Today we only read `cluster.max_voting_config_exclusions` from the dynamic
settings in the cluster metadata, ignoring any value set in
`elasticsearch.yml`. This commit addresses this.

Closes #53455
2020-03-23 08:38:12 +00:00
Ignacio Vera efd1838206
Handle properly indexing rectangles that crosses the dateline (#53810) (#53947)
When indexing a rectangle that crosses the dateline, we are currently not
handling it properly and we index a polygon that do not cross the dateline.
This changes generates two polygons wrapping the dateline.
2020-03-23 09:12:03 +01:00
Stuart Tettemer d25c01a373
Scripting: Increase ingest script cache defaults (#53906)
* Adds ability for contexts to specify their own defaults.
* Context defaults are applied if no context-specific or
  general setting exists.
* See 070ea7e for settings keys.

* Increases the per-context default for the `ingest` context.
  * Cache size is doubled, 200 compared to default of 100
  * Cache expiration is unchanged at no expiration
  * Cache max compilation is quintupled, 375/5m instead of 75/5m

Backport of: 1b37d4b
Refs: #50152
2020-03-20 16:48:50 -06:00
Gordon Brown 10cabbbade
Transition Transforms to using hidden indices for notifcations index (#53773)
This commit changes the Transforms notifications index to be hidden
index, with a hidden alias.

This commit also removes the temporary hack in
MetaDataCreateIndexService that prevents deprecation warnings for known
dot-prefixed index names which are not hidden/system indices, as this
was the last index pattern to need that hack.
2020-03-20 15:40:58 -06:00
Stuart Tettemer ac575b68a9
Scripting: Context script cache unlimited compile (#53769) (#53899)
* Adds "unlimited" compilation rate for context script caches
* `script.context.${CONTEXT}.max_compilations_rate` = `unlimited`
  disables compilation rate limiting for `${CONTEXT}`'s script
  cache

Refs: #50152
2020-03-20 15:14:30 -06:00
Lee Hinman 1f3de2fa7e
Set feature flags for IndexTemplatesV2 in top-level gradle file (#53898)
Resolves #53892
2020-03-20 14:52:22 -06:00
Gordon Brown f0674af132
Add isHidden to AliasActions equals/hashcode (#53700)
This commit adds the `isHidden` flag to the `equals` and
`hashCode` methods for `AliasActions`.
2020-03-20 13:59:40 -06:00
David Turner 879e26ec06 Describe STALE_STATE_CONFIG in ClusterFormationFH (#53878)
We mark cluster states persisted on master-ineligible nodes as
potentially-stale using the voting configuration `{STALE_STATE_CONFIG}` which
prevents these nodes from being elected as master if they are restarted as
master-eligible. Today we do not handle this special voting configuration
differently in the `ClusterFormationFailureHandler`, leading to a mysterious
message `an election requires a node with id [STALE_STATE_CONFIG]` if the
election does not succeed.

This commit adds a special case description for this situation to explain
better why this node cannot win an election.

Closes #53734
2020-03-20 20:02:51 +01:00
Igor Motov 88d50ec583
Fix random failures in InternalTopHitsTests#testReduceRandom (#53832)
The test was randomly and very rarely failing due to generating the same sort
key for multiple records, which was making order of these records in the results
nondeterministic. While investigating the test I also found that the data wasn't
generated in the way that matches the actual data. Normally, the order of
documents in hits and scoreDocs in InternalTopHits should be the same. However,
in the test only scoreDocs were sorted which was cause very confusing failure
messages. This commit fixes this issue as well.

Fixes #53676
2020-03-20 13:35:59 -04:00
David Turner adfeb50a53 Use consistent threadpools in CoordinatorTests (#53868)
Today in the `CoordinatorTests` each node uses multiple threadpools. This is
mostly fine as they are almost completely stateless, except for the
`ThreadContext`: by using multiple threadpools we cannot make assertions that
the thread context is/isn't preserved as we expect. This commit consolidates
the threadpool instances in use so that each node uses just one.
2020-03-20 16:22:42 +01:00
Alan Woodward a3f21f24ea
Emit deprecation warning when TermsLookup contains a type (#53731)
TermsLookup in master no longer accepts a type parameter. We should emit
a deprecate warning in 7.x when a terms lookup requests includes type to prepare
users for its removal.

Relates to #41059
2020-03-20 15:11:31 +00:00
Christoph Büscher 8eacb153df
Add async_search.submit to HLRC #53592 (#53852)
This commit adds a new AsyncSearchClient to the High Level Rest Client which
initially supporst the submitAsyncSearch in its blocking and non-blocking
flavour. Also adding client side request and response objects and parsing code
to parse the xContent output of the client side AsyncSearchResponse together
with parsing roundtrip tests and a simple roundtrip integration test.

Relates to #49091
Backport of #53592
2020-03-20 13:15:58 +01:00
Alan Woodward d23112f441 Report parser name and location in XContent deprecation warnings (#53805)
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.

Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
2020-03-20 11:52:55 +00:00
Jason Tedor 4e6bbf6e3c
Execute retention lease syncs under system context (#53838)
The retention lease syncs need to occur under the system context,
because they are internal actions executed on behalf of the user. Today
we are relying on this happening for background syncs by virtue of the
fact that the context the syncs are created under is the system
context. This is due to these occurring on the cluster state applier
thread. However, there are situations where this does not hold such as
when a timed out cluster state publication occurs, and the node where
the shard is allocated is the elected master node. In that case, the
context will be empty due to the fact that we do not reschedule
publication under the system context. Currently, doing so runs us into
some troubles with losing the existing context, possibly dropping
deprecation headers. We could copy that context over when marking the
current context as the system context, but the implications of that
require some more investigation. For now, we explicitly mark the
retention lease syncs as executing under the system context, as this is
situation that we can reason about.
2020-03-20 07:36:12 -04:00
Ryan Ernst f7143b8d85 Fix Joda compatibility in stream protocol (#53823)
The JodaCompatibleZonedDateTime is a compatibility object that unions
Joda's DateTime and Java's ZonedDateTime, meant for use in scripts. When
it was added, we serialized the JCZDT as a Joda DateTime so that when
sending to older nodes they could still read the object. However, on
newer nodes, we continued also reading this as a Joda DateTime. This
commit changes the read side to form a JCZDT.

closes #53586
2020-03-19 16:39:20 -07:00
Lee Hinman c3dee628c7
[7.x] Add IndexTemplateV2 to MetaData (#53753) (#53827)
* Add IndexTemplateV2 to MetaData (#53753)

* Add IndexTemplateV2 to MetaData

This adds the `IndexTemplateV2` and `IndexTemplateV2Metadata` class to be used for the new
implementation of index templates. The new metadata is stored as a `MetaData.Custom` implementation.

Relates to #53101

* Add ITV2Metadata unit tests

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* Update min supported version constant

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-19 15:04:00 -06:00
Mayya Sharipova 2c77c0df65 Fix testIndexhasDuplicateData tests (#49786)
testIndexHasDuplicateData tests were failing ocassionally,
due to approximate calculation of BKDReader.estimatePointCount,
where if the node is Leaf, the number of points in it
was (maxPointsInLeafNode + 1) / 2.
As DEFAULT_MAX_POINTS_IN_LEAF_NODE = 1024, for small indexes
used in tests, the estimation could be really off.

This rewrites tests, to make the  max points in leaf node to
be a small value to control the tests.

Closes #49703
2020-03-19 15:09:23 -04:00
Mark Vieira 3b2b564c91
Improve IntelliJ IDE integration (#53747)
This commit makes a number of improvements when importing the
Elasticsearch project into IntelliJ IDEA. Specifically:

- Contributing documentation has been updated to reflect that the
  'idea' task should no long be used and Gradle project import is
  instead the officially supported way of setting up the project.
- Attempts to run the 'idea' task will result in a failure with a
  message directing folks to our CONTRIBUTING.md document.
- The project JDK is explicit set rather that using whatever JAVA_HOME
  is.
- Gradle build operation delegation is disabled, and test execution is
  configured to 'choose per test'.
- Gradle is configured to inherit the project JDK.
- Some code style conventions are automatically configured.
- File encoding is explicitly set to UTF-8.
- Parallel module compilation is enabled and deprecated feature
  warnings are disabled.
- A remote debug run configuration using listen mode is created.
- JUnit runner is configured with required system properties.
- License headers are configured such that Apache 2 is the default
  notice added to all source files with exception of source in /x-pack
  which will use the Elastic license.
2020-03-19 11:43:33 -07:00
David Turner 7d3ac4f57d Revert "Apply cluster states in system context (#53785)"
This reverts commit 4178c57410.
2020-03-19 15:20:36 +00:00
David Turner 4178c57410 Apply cluster states in system context (#53785)
Today cluster states are sometimes (rarely) applied in the default context
rather than system context, which means that any appliers which capture their
contexts cannot do things like remote transport actions when security is
enabled.

There are at least two ways that we end up applying the cluster state in the
default context:

1. locally applying a cluster state that indicates that the master has failed
2. the elected master times out while waiting for a response from another node

This commit ensures that cluster states are always applied in the system
context.

Mitigates #53751
2020-03-19 14:48:55 +00:00
Ignacio Vera 4f1b2fd2b1
Add support for distance queries on geo_shape queries (#53466) (#53795)
With the upgrade to Lucene 8.5, LatLonShape field has support for distance queries. This change implements this new feature and removes the limitation.
2020-03-19 15:21:58 +01:00
Dominic Page b0884baf46
Geo shape query vs geo point backport (#53774)
Backport to 7x

Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928
Co-Authored-By:  @iverase
2020-03-19 13:00:36 +01:00
Jim Ferenczi 4b0ae15a9d Disable distributed sort optimization on scroll requests (#53759)
This commit disables the sort optimization added in #51852 for scroll requests.
Scroll queries keep a state per shard so we cannot modify the request on
the first round (submit).
This bug was introduced in non-released versions which is why this pr
is marked as a non-issue.
2020-03-19 08:11:23 +01:00
Mark Vieira 9b3b08318d
Remove unused import 2020-03-18 21:07:17 -07:00
Jason Tedor bc5dae2713
Fix compilation in RoutingNode
This commit fixes compilation in RoutingNode.java after a backport
brought back usage of an API not available in JDK 8.
2020-03-18 22:21:54 -04:00
Jason Tedor 90ab949415
Improve performance of shards limits decider (#53577)
On clusters with a large number of shards, the shards limits allocation
decider can exhibit poor performance leading to timeouts applying
cluster state updates. This occurs because for every shard, we do a loop
to count the number of shards on the node, and the number of shards for
the index of the shard. This is roughly quadratic in the number of
shards. This loop is not necessary, since we already have a O(1) method
to count the number of non-relocating shards on a node, and with this
commit we add some infrastructure to RoutingNode to make counting the
number of shards per index O(1).
2020-03-18 20:58:22 -04:00
Stuart Tettemer cdbee32f55
Scripting: Per-context script cache, default off (#52855) (#53756)
* Adds per context settings:
  `script.context.${CONTEXT}.cache_max_size` ~
  `script.cache.max_size`

  `script.context.${CONTEXT}.cache_expire` ~
  `script.cache.expire`

  `script.context.${CONTEXT}.max_compilations_rate` ~
  `script.max_compilations_rate`

* Context cache is used if:
  `script.max_compilations_rate=use-context`.  This
  value is dynamically updatable, so users can
  switch back to the general cache if desired.

* Settings for context caches take the first value
  that applies:
  1) Context specific settings if set, eg
     `script.context.ingest.cache_max_size`
  2) Correlated general setting is set to the non-default
     value, eg `script.cache.max_size`
  3) Context default

The reason for 2's inclusion is to allow an easy
transition for users who've customized their general
cache settings.

Using the general cache settings for the context caches
results in higher effective settings, since they are
multiplied across the number of contexts.  So a general
cache max size of 200 will become 200 * # of contexts.
However, this behavior it will avoid users snapping to a
value that is too low for them.

Backport of: #52855
Refs: #50152
2020-03-18 14:44:04 -06:00
Jim Ferenczi 8e17322b3a
Shortcut query phase using the results of other shards (#51852) (#53659)
This commit, built on top of #51708, allows to modify shard search requests based on informations collected on other shards. It is intended to speed up sorted queries on time-based indices. For queries that are only interested in the top documents.

This change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard.
For queries that mix top documents and aggregations this change will reset the size of the top documents to 0 instead of rewriting to match none.
This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.
2020-03-18 17:20:35 +01:00
Nhat Nguyen 1615c4b379
Fix testKeepTranslogAfterGlobalCheckpoint (#53704)
Read the global checkpoint after flushed as we might advance it while flushing.

Closes #53505
2020-03-18 11:24:19 -04:00
Alan Woodward 580bc40c0c Make it possible to deprecate all variants of a ParseField with no replacement (#53722)
Sometimes we want to deprecate and remove a ParseField entirely, without replacement;
for example, the various places where we specify a _type field in 7x. Currently we can
tell users only that a particular field name should not be used, and that another name should
be used in its place. This commit adds the ability to say that a field should not be used at
all.
2020-03-18 14:16:19 +00:00
Marios Trivyzas d56dee599a
Increase step between checks for cancellation (#53712)
The introduction of the ExitableDirectoryReader showed increase of
latencies for range queries using pointvalues.

Check for cancellation every 1024 docs instead of every 15 to lower
the impact of the check in query's performance.

Follows: #52822
Fixes: #53496
(cherry picked from commit 6b5fc35e4458e60a7ca5822584ec6a60562f2c01)
2020-03-18 14:52:40 +01:00
Tanguy Leroux 6cc564d677 Restore off-heap loading for term dictionary in ReadOnlyEngine (#53713)
This is a partial restore of #43158, following decision taken in #51247

Closes #51247
2020-03-18 13:24:34 +01:00
Tianlun Li e7ae9ae596 Deprecate delaying state recovery for master nodes (#53646)
It is useful to be able to delay state recovery until enough data nodes have
joined the cluster, since this gives the shard allocator a decent opportunity
to re-use as much existing data as possible. However we also have the option to
delay state recovery until a certain number of master-eligible nodes have
joined, and this is unnecessary: we require a majority of master-eligible nodes
for state recovery, and there is no advantage in waiting for more.

This commit deprecates the unnecessary settings in preparation for their
removal.

Relates #51806
2020-03-18 10:04:22 +00:00
Lee Hinman 9c0e846db3
[7.x] Add REST API for ComponentTemplate CRUD (#53558) (#53681)
* Add REST API for ComponentTemplate CRUD

This adds the Put/Get/DeleteComponentTemplate APIs that allow inserting, retrieving, and removing
ComponentTemplateMetadata into the cluster state metadata.

These APIs are currently only available behind a feature flag system property -
`es.itv2_feature_flag_registered`.

Relates to #53101

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-17 13:23:28 -06:00
Ryan Ernst 5c472fcb47 Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642)
Re-applies the change from #53523 along with test fixes.

closes #53626
closes #53624
closes #53622
closes #53625

Co-authored-by: Nik Everett <nik9000@gmail.com>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>
Co-authored-by: Jake Landis <jake.landis@elastic.co>
2020-03-17 10:28:51 -07:00
Alan Woodward 71b703edd1 Rename AtomicFieldData to LeafFieldData (#53554)
This conforms with lucene's LeafReader naming convention, and
matches other per-segment structures in elasticsearch.
2020-03-17 12:30:12 +00:00
Jason Tedor 01d2339883
Invoke response handler on failure to send (#53631)
Today it can happen that a transport message fails to send (for example,
because a transport interceptor rejects the request). In this case, the
response handler is never invoked, which can lead to necessary cleanups
not being performed. There are two ways to handle this. One is to expect
every callsite that sends a message to try/catch these exceptions and
handle them appropriately. The other is merely to invoke the response
handler to handle the exception, which is already equipped to handle
transport exceptions.
2020-03-16 21:28:24 -04:00
Jason Tedor 881d0bfa8a
Add server name to remote info API (#53634)
This commit adds the configured server_name to the proxy mode info so
that it can be exposed in the remote info API.
2020-03-16 21:20:42 -04:00
Luca Cavanna c3d2417448
Cumulative backport of async search changes (#53635)
* Submit async search to work only with POST (#53368)

Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method.

* Refine SearchProgressListener internal API (#53373)

The following cumulative improvements have been made:
- rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce`
- add unit test for `SearchShard`
- on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private
- Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class.
- rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members
- make `SearchShard` and `SearchShardTask` classes final

* Move async search yaml tests to x-pack yaml test folder (#53537)

The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too.

* [DOCS] Add temporary redirect for async-search (#53454)

The following API spec files contain a link to a not-yet-created
async search docs page:

* [async_search.delete.json][0]
* [async_search.get.json][1]
* [async_search.submit.json][2]

The Elaticsearch-js client uses these spec files to create their docs.
This created a broken link in the Elaticsearch-js docs, which has broken
the docs build.

This PR adds a temporary redirect for the docs page. This redirect
should be removed when the actual API docs are added.

[0]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.delete.json
[1]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.get.json
[2]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.submit.json

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-03-17 00:08:17 +01:00
Nik Everett 9845dbb7d6
Fix sorting agg buckets by doc_count (backport of #53617) (#53627)
I broke sorting aggregations by `doc_count` in #51271 by mixing up true
and false. This flips that comparison and adds a few tests to double
check that we don't so this again.
2020-03-16 17:35:43 -04:00
Nik Everett f0beab4041
Stop using round-tripped PipelineAggregators (backport of #53423) (#53629)
This begins to clean up how `PipelineAggregator`s and executed.
Previously, we would create the `PipelineAggregator`s on the data nodes
and embed them in the aggregation tree. When it came time to execute the
pipeline aggregation we'd use the `PipelineAggregator`s that were on the
first shard's results. This is inefficient because:
1. The data node needs to make the `PipelineAggregator` only to
   serialize it and then throw it away.
2. The coordinating node needs to deserialize all of the
   `PipelineAggregator`s even though it only needs one of them.
3. You end up with many `PipelineAggregator` instances when you only
   really *need* one per pipeline.
4. `PipelineAggregator` needs to implement serialization.

This begins to undo these by building the `PipelineAggregator`s directly
on the coordinating node and using those instead of the
`PipelineAggregator`s in the aggregtion tree. In a follow up change
we'll stop serializing the `PipelineAggregator`s to node versions that
support this behavior. And, one day, we'll be able to remove
`PipelineAggregator` from the aggregation result tree entirely.

Importantly, this doesn't change how pipeline aggregations are declared
or parsed or requested. They are still part of the `AggregationBuilder`
tree because *that* makes sense.
2020-03-16 16:15:23 -04:00
Gordon Brown 031932b32f
Allow _cat indices & aliases to use indices options (#53248)
This commit adjusts the _cat/indices and _cat/aliases APIs to allow
specifying indices options, so that these APIs can handle hidden
indices/aliases in the same way as other APIs.

Also adds the hidden option to the expand_wildcards parameter
in the YAML spec for every API that accepts it.
2020-03-16 11:25:05 -06:00
markharwood 2c74f3e22c
Backport of new wildcard field type (#53590)
* New wildcard field optimised for wildcard queries (#49993)

Indexes values using size 3 ngrams and also stores the full original as a binary doc value.
Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values.  Also supports aggregations and sorting.
2020-03-16 15:07:13 +00:00
Mayya Sharipova a906f8a0e4
Highlighters skip ignored keyword values (#53408) (#53604)
Keyword field values with length more than ignore_above are not
indexed. But highlighters still were retrieving these values
from _source and were trying to highlight them. This sometimes lead to
errors if a field length exceeded  max_analyzed_offset. But also this
is an overall wrong behaviour to attempt to highlight something that was
ignored during indexing.

This PR checks if a keyword value was ignored because of its length,
and if yes, skips highlighting it.

Backport: #53408
Closes #43800
2020-03-16 11:06:25 -04:00
Jim Ferenczi e6680be0b1
Add new x-pack endpoints to track the progress of a search asynchronously (#49931) (#53591)
This change introduces a new API in x-pack basic that allows to track the progress of a search.
Users can submit an asynchronous search through a new endpoint called `_async_search` that
works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time.

````
GET my_index_pattern*/_async_search?wait_for_completion=100ms
{
  "aggs": {
    "date_histogram": {
      "field": "@timestamp",
      "fixed_interval": "1h"
    }
  }
}
````

If after 100ms the final response is not available, a `partial_response` is included in the body:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 1,
  "is_running": true,
  "is_partial": true,
  "response": {
   "_shards": {
       "total": 100,
       "successful": 5,
       "failed": 0
    },
    "total_hits": {
      "value": 1653433,
      "relation": "eq"
    },
    "aggs": {
      ...
    }
  }
}
````

The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed.
It also contains the total hits as well as partial aggregations computed from the successful shards.
To continue to monitor the progress of the search users can call the get `_async_search` API like the following:

````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms
````

That returns a new response that can contain the same partial response than the previous call if the search didn't progress, in such case the returned `version`
should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress.
Finally if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response:

````
{
  "id": "9N3J1m4BgyzUDzqgC15b",
  "version": 10,
  "is_running": false,
  "response": {
     "is_partial": false,
     ...
  }
}
````

Asynchronous search are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time:
`````
GET my_index_pattern*/_async_search?wait_for_completion=100ms&keep_alive=10d
`````
The default can be changed when submitting the search, the example above raises the default value for the search to `10d`.
`````
GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d
`````
The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days.
A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result.

Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed.

Asynchronous search are not persistent, if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. However final responses and failures are persisted in a system index that allows
to retrieve a response even if the task finishes.

````
DELETE _async_search/9N3J1m4BgyzUDzqgC15b
````

The response is also not stored if the initial submit action returns a final response. This allows to not add any overhead to queries that completes within the initial `wait_for_completion`.

The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks.

Relates #49091

Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
2020-03-16 15:31:27 +01:00
David Turner 7e82a4f78c Do not log no-op reconnections at DEBUG (#53469)
Today the NodeConnectionsService emits a DEBUG-level log message each time it
calls TransportService#connectToNode, which happens for every node in the
cluster every ten seconds, and also at every cluster state update. That's a lot
of log messages. Most of these calls are no-ops and can be ignored, but if the
call was not a no-op then it may be worth investigating further. Since the logs
do not distinguish the interesting and uninteresting cases, they are not
useful.

This commit distinguishes the two cases and pushes the noisy logging for the
common no-op case down to TRACE level, leaving only useful and actionable
information in the DEBUG-level logs.
2020-03-16 08:56:20 +00:00
Mark Vieira 2f0aca992b
Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)"
This reverts commit b7dbadeea0.
2020-03-15 18:10:40 -07:00
Jason Tedor 66374b61ca
Remove extra code in allocation commands parsing (#53579)
This commit removes some code that is duplicated in the parsing of
allocation commands in the cluster reroute API.
2020-03-14 18:14:13 -04:00
Jason Tedor b7dbadeea0
Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576)
This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2
dependency to 2.13.1.

Relates #53523
2020-03-14 13:28:06 -04:00
Marios Trivyzas b6c94fd73e
Fix Term Vectors with artificial docs and keyword fields (#53504) (#53550)
Previously, Term Vectors API was returning empty results for
artificial documents with keyword fields. Checking only for `string()`
on `IndexableField` is not enough, since for `KeywordFieldType`
`binaryValue()` must be used instead.

Fixes #53494

(cherry picked from commit 1fc3fe3d32f41eab2101c0536751b7c47e63cc48)
2020-03-13 19:26:14 +01:00
Dan Hermann fb29c2dccf
Fix ingest pipeline _simulate api with empty docs never returns a res… (#52937) (#53547) 2020-03-13 09:41:14 -05:00
William Brafford 5b718d2565
Use snake case for nodes stats/info metric names (#53446) (#53535)
* Use snake case for nodes stats/info metric names (#53446)

The REST API uses "thread_pool" as the name of the thread pool metric.
If we use this name internally when we serialize nodes stats and info
requests, we won't need to do any fancy logic to check for and switch
out "threadPool", which was the previous internal name.
2020-03-13 07:49:14 -04:00
Jim Ferenczi 9dfcc07401 Fix pre-sorting of shards in the can_match phase (#53397)
This commit fixes a bug on sorted queries with a primary sort field
that uses different types in the requested indices. In this scenario
the returned min/max values to sort the shards are not comparable so
we should avoid the sorting rather than throwing an obscure exception.
2020-03-13 01:28:11 +01:00
Nhat Nguyen fe2f6b359e Fix concurrent requests race over scroll context limit (#53449)
Concurrent search scroll requests can lead to more scroll contexts than the limit.
2020-03-12 17:56:51 -04:00
Lee Hinman 2789fe4179
[7.x] Add ComponentTemplate to MetaData (#53290) (#53489)
* Add ComponentTemplate to MetaData (#53290)

* Add ComponentTemplate to MetaData

This adds a `ComponentTemplate` datastructure that will be used as part of #53101 (Index Templates
v2) to the `MetaData` class. Currently there are no APIs for interacting with this class, so it will
always be an empty map (other than in tests). This infrastructure will be built upon to add APIs in
a subsequent commit.

A `ComponentTemplate` is made up of a `Template`, a version, and a MetaData.Custom class. The
`Template` contains similar information to an `IndexTemplateMetaData` object— settings, mappings,
and alias configuration.

* Update minimal supported version constant

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-12 15:33:32 -06:00
Nik Everett 9dcd64c110
Preserve metric types in top_metrics (backport of #53288) (#53440)
This changes the `top_metrics` aggregation to return metrics in their
original type. Since it only supports numerics, that means that dates,
longs, and doubles will come back as stored, with their appropriate
formatter applied.
2020-03-12 17:17:09 -04:00
Jay Modi 0353b804bf
Mute testKeepTranslogAfterGlobalCheckpoint (#53510)
This change mutes a test that fails reproducibly in
InternalEngineTests.

Relates #53505
2020-03-12 13:08:13 -06:00
Lee Hinman 67fffe676e
[7.x] Add read/writeOptionalVLong to StreamInput/Output (#5314… (#53491)
The spirit of StreamInput/StreamOutput is that common I/O patterns should
be handled by these classes so that the persistence methods in
application classes can be kept short, which facilitates easy visual
comparison between read and write methods, and reduces risks of having
serialization issues due to mismatched implementations.

To this end, this change adds readOptionalVLong and writeOptionalVLong
methods to these classes as we have started to build up cases where
that conditional/null logic has been implemented directly in the read &
write methods.

Co-authored-by: Tim Vernum <tim.vernum@elastic.co>
2020-03-12 10:59:31 -06:00
Przemyslaw Gomulka 2438b899eb
Support joda style date patterns in 7.x (#52555)
If an index was created in version 6 and contain a date field with a joda-style pattern it should still be allowed to search and insert document into it.
Those created in 6 but date pattern starts with 8, should be considered as java style.
2020-03-12 08:57:03 +01:00
Nik Everett 9ada508347
Fix date_nanos in composite aggs (backport of #53315) (#53347)
It looks like `date_nanos` fields weren't likely to work properly in
composite aggs because composites iterate field values using points and
we weren't converting the points into milliseconds. Because the doc
values were coming back in milliseconds we ended up geting very confused
and just never collecting sub-aggregations.

This fixes that by adding a method to `DateFieldMapper.Resolution` to
`parsePointAsMillis` which is similarly in name and function to
`NumberFieldMapper.NumberType`'s `parsePoint` except that it normalizes
to milliseconds which is what aggs need at the moment.

Closes #53168
2020-03-11 13:00:07 -04:00
Nhat Nguyen 1fd56698fa Adjust wire version for search context id
Relates #53143
2020-03-11 11:48:11 -04:00
Nhat Nguyen 6665ebe7ab Harden search context id (#53143)
Using a Long alone is not strong enough for the id of search contexts
because we reset the id generator whenever a data node is restarted.
This can lead to two issues:

1. Fetch phase can fetch documents from another index
2. A scroll search can return documents from another index

This commit avoids these issues by adding a UUID to SearchContexId.
2020-03-11 11:48:11 -04:00
David Turner ac721938c2 Allow joining node to trigger term bump (#53338)
In rare circumstances it is possible for an isolated node to have a greater
term than the currently-elected leader. Today such a node will attempt to join
the cluster but will not offer a vote to the leader and will reject its cluster
state publications due to their stale term. This situation persists since there
is no mechanism for the joining node to inform the leader that its term is
stale and a new election is required.

This commit adds the current term of the joining node to the join request. Once
the join has been validated, the leader will perform another election to
increase its term far enough to allow the isolated node to join properly.

Fixes #53271
2020-03-11 09:19:44 +00:00
Armin Braun 7189c57b6c
Record Force Merges in Live Commit Data (#52694) (#53372)
* Record Force Merges in live commit data

Prerequisite of #52182. Record force merges in the live commit data
so two shard states with the same sequence number that differ only in whether
or not they have been force merged can be distinguished when creating snapshots.
2020-03-11 06:30:36 +01:00
Nhat Nguyen 24f114766f Fix doc_stats and segment_stats of ReadOnlyEngine (#53345)
We can't always have the same segment stats and doc stats between
InternalEngine and ReadOnlyEngine if there are some fully deleted
segments. ReadOnlyEngine always filters out them. InternalEngine,
however, will keep them if peer recovery retention leases exist or the
number of the retaining operations is non-zero.

This change reverts the fix in #51331 and uses the wrapped reader to
calculate the segment stats and doc stats. For the test, we need to
disable the extra retaining soft-deletes operations.

Closes #51303
2020-03-10 21:51:33 -04:00
Gordon Brown 20bbe5bae4
Fix Rollover handing of hidden aliases (#53146)
Prior to this commit, rollover did not propagate the `is_hidden` alias
property when rollover over an index. This commit ensures that an alias
that's rollover over will remain hidden.
2020-03-10 10:56:12 -06:00
Nik Everett 5ce6de2c1a
Simplify SiblingPipelineAggregator (#53144) (#53341)
This removes the `instanceof`s from `SiblingPipelineAggregator` by
adding a `rewriteBuckets` method to `InternalAggregation` that can be
called to, well, rewrite the buckets. The default implementation of
`rewriteBuckets` throws the same exception that was thrown when you
attempted to run a `SiblingPipelineAggregator` on an aggregation without
buckets. It is overridden by `InternalSingleBucketAggregation` and
`InternalMultiBucketAggregation` to correctly rewrite their buckets.
2020-03-10 11:39:10 -04:00
Nik Everett 89c0e1f566
Fix composite agg sort bug (backport of #53296) (#53337)
When an composite aggregation is run against an index with a sort that
*starts* with the "source" fields from the composite but has additional
fields it'd blow up in while trying to decide if it could use the sort.
This changes it to decide that it *can* use the sort.

Closes #52480
2020-03-10 11:32:46 -04:00
Jim Ferenczi ae6c25b749 Speed up partial reduce of terms aggregations (#53216)
This change optimizes the merge of terms aggregations by removing
the priority queue that was used to collect all the buckets during
a non-final reduction. We don't need to keep the result sorted since
the merge of buckets in a subsequent reduce can modify the order.
I wrote a small micro-benchmark to test the change and the speed ups
are significative for small merge buffer sizes:

````
########## Master:
Benchmark                           (bufferSize)  (cardinality)  (numShards)  (topNSize)  Mode  Cnt     Score     Error  Units
TermsReduceBenchmark.reduceTopHits             5          10000         1000        1000  avgt   10  2459,690 ± 198,682  ms/op
TermsReduceBenchmark.reduceTopHits            16          10000         1000        1000  avgt   10  1030,620 ±  91,544  ms/op
TermsReduceBenchmark.reduceTopHits            32          10000         1000        1000  avgt   10   558,608 ±  44,915  ms/op
TermsReduceBenchmark.reduceTopHits           128          10000         1000        1000  avgt   10   287,333 ±   8,342  ms/op
TermsReduceBenchmark.reduceTopHits           512          10000         1000        1000  avgt   10   257,325 ±  54,515  ms/op

########## Patch:
Benchmark                           (bufferSize)  (cardinality)  (numShards)  (topNSize)  Mode  Cnt    Score    Error  Units
TermsReduceBenchmark.reduceTopHits             5          10000         1000        1000  avgt   10  805,611 ± 14,630  ms/op
TermsReduceBenchmark.reduceTopHits            16          10000         1000        1000  avgt   10  378,851 ± 17,929  ms/op
TermsReduceBenchmark.reduceTopHits            32          10000         1000        1000  avgt   10  261,094 ± 10,176  ms/op
TermsReduceBenchmark.reduceTopHits           128          10000         1000        1000  avgt   10  241,051 ± 19,558  ms/op
TermsReduceBenchmark.reduceTopHits           512          10000         1000        1000  avgt   10  231,643 ±  6,170  ms/op
````

The code for the benchmark can be found [here](). It seems to be up to 3x faster for terms aggregations
that return 10,000 unique terms (1000 terms per shard). For a cardinality of 100,000 terms, this patch is up to 5x faster:

````
########## Patch:
Benchmark                           (bufferSize)  (cardinality)  (numShards)  (topNSize)  Mode  Cnt      Score     Error  Units
TermsReduceBenchmark.reduceTopHits             5         100000         1000        1000  avgt   10  12791,083 ± 397,128  ms/op
TermsReduceBenchmark.reduceTopHits            16         100000         1000        1000  avgt   10   3974,939 ± 324,617  ms/op
TermsReduceBenchmark.reduceTopHits            32         100000         1000        1000  avgt   10   2186,285 ± 267,124  ms/op
TermsReduceBenchmark.reduceTopHits           128         100000         1000        1000  avgt   10    914,657 ± 160,784  ms/op
TermsReduceBenchmark.reduceTopHits           512         100000         1000        1000  avgt   10    604,198 ± 145,457  ms/op

########## Master:
Benchmark                           (bufferSize)  (cardinality)  (numShards)  (topNSize)  Mode  Cnt      Score     Error  Units
TermsReduceBenchmark.reduceTopHits             5         100000         1000        1000  avgt   10  60696,107 ± 929,944  ms/op
TermsReduceBenchmark.reduceTopHits            16         100000         1000        1000  avgt   10  16292,894 ± 783,398  ms/op
TermsReduceBenchmark.reduceTopHits            32         100000         1000        1000  avgt   10   7705,444 ±  77,588  ms/op
TermsReduceBenchmark.reduceTopHits           128         100000         1000        1000  avgt   10   2156,685 ±  88,795  ms/op
TermsReduceBenchmark.reduceTopHits           512         100000         1000        1000  avgt   10    760,273 ±  53,738  ms/op
````

The merge of buckets can also be optimized. Currently we use an hash map to merge buckets coming from different shards so this can be costly if the number of unique terms is high. Instead, we could always sort the shard terms result by key and perform a merge sort to reduce the results. This would save memory and make the merge more linear in terms
of complexity in the coordinating node at the expense of an additional sort in the shards. I plan to test this possible optimization in a follow up.

Relates #51857
2020-03-10 14:26:59 +01:00
Nik Everett e23c3f915f
Save a little space on empty BitArrays (#53243) (#53316)
It doesn't make a whole lot of sense for `BitArray#clear` to grow the
underlying storage array just to clear the bit. We *already* treat
indices outside of the storage array as unset. This turns such
operations into a noop.
2020-03-10 09:22:19 -04:00
Alan Woodward 5c861cfe6e Upgrade to final lucene 8.5.0 snapshot (#53293)
Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use
the latest snapshot to check that there are no last-minute bugs or regressions.
2020-03-10 09:32:59 +00:00
Gordon Brown 1cb0a4399d
Fix Get Alias API handling of hidden indices with visible aliases (#53147)
This commit changes the Get Aliases API to include hidden indices by
default - this is slightly different from other APIs, but is necessary
to make this API work intuitively.
2020-03-09 16:16:29 -06:00
William Brafford 2bb4b96a7f
Serialize NodesStatsRequest as set of strings (#53235) (#53313)
* Add unit tests before refactoring

* Convert boolean fields to set of strings

In order to make nodes stats plugins pluggable, we need to make the
NodesStatsRequest class capable of carrying a flexible list of metrics
rather than a fixed list of boolean flags. This commit changes the
internal storage of the class without changing its serialization.

* Change serialization of NodesStatsRequest

* Set up BWC before merging

* Singularize enum name
2020-03-09 18:13:29 -04:00
Jason Tedor 1860c57147
Deprecate the listener thread pool (#53266)
The listener thread pool is being removed from use in the server
codebase. This commit deprecates configuring the listener thread pool.
2020-03-09 16:56:01 -04:00
David Turner b20f86e450 Clarify JavaDoc for DiscoveryNodes#resolveNodes (#53277)
Closes #52887
2020-03-09 14:44:29 +00:00
David Turner 52ff341814 Deprecate passing settings in restore requests (#53268)
Today we accept a `settings` field in snapshot restore requests, but this field
is not used. This commit deprecates it.
2020-03-09 12:01:07 +00:00
Christoph Büscher 2fd954a3b7 Fix potential NPE in FuzzyTermsEnum (#53231)
Under certain circumstances SpanMultiTermQueryWrapper uses
SpanBooleanQueryRewriteWithMaxClause as its rewrite method, which in turn tries
to get a TermsEnum from the wrapped MultiTermQuery currently using a `null`
AttributeSource. While queries TermsQuery or subclasses of AutomatonQuery ignore
this argument, FuzzyQuery uses it to create a FuzzyTermsEnum which triggers an
NPE when the AttributeSource is not provided. This PR fixes this by supplying an
empty AttributeSource instead of a `null` value.

Closes #52894
2020-03-09 12:59:08 +01:00
Jason Tedor 5e96d3e59a
Use given executor for global checkpoint listener (#53260)
Today when notifying a global checkpoint listener, we use the listener
thread pool. This commit turns this inside out so that the global
checkpoint listener must provide an executor on which to notify the
listener.
2020-03-08 13:51:05 -04:00
Jason Tedor 79b67eb3ba
Drop action future that forks on listener executor (#53261)
This commit drops the dispatching listenable action future that forks to
the listener thread pool. This was previously used in the transport
client but is no longer used.
2020-03-08 12:36:09 -04:00
Jason Tedor a0b235888f
Avoid self-suppression on grouped action listener (#53262)
It can be that a failure is repeated to a grouped action listener. For
example, if the same exception such as a connect transport exception, is
the cause of repeated failures. Previously we were unconditionally
self-suppressing the exception into the first exception, but
self-supressing is not allowed. Thus, we would throw an exception and
the grouped action listener would never complete. This commit addresses
this by guarding against self-suppression.
2020-03-08 08:59:57 -04:00
Jason Tedor c5738ae312
Notify refresh listeners on the calling thread (#53259)
Today we notify refresh listeners by forking to the listener thread pool
and then serially notifying listeners on a thread there. Refreshes are
expensive though, so the expectation is that we are executing refreshes
on threads that can afford an expensive operation (e.g., not a network
thread) and as such, executing listeners that we expect to be cheap aon
the calling thread is okay. This commit removes the forking of notifying
refresh listeners to run directly on the calling thread that executed a
refresh.
2020-03-07 13:12:40 -05:00
Gordon Brown ff9b8bda63
Implement hidden aliases (#52547)
This commit introduces hidden aliases. These are similar to hidden
indices, in that they are not visible by default, unless explicitly
specified by name or by indicating that hidden indices/aliases are
desired.

The new alias property, `is_hidden` is implemented similarly to
`is_write_index`, except that it must be consistent across all indices
with a given alias - that is, all indices with a given alias must
specify the alias as either hidden, or all specify it as non-hidden,
either explicitly or by omitting the `is_hidden` property.
2020-03-06 16:02:38 -07:00
Nik Everett 7c9641ef9d
Simplify BucketedSort (#53199) (#53240)
Our lovely `BitArray` compactly stores "flags", lazilly growing its
underlying storage. It is super useful when you need to store one bit of
data for a zillion buckets or a documents or something. Usefully, it
defaults to `false`. But there is a wrinkle! If you ask it whether or
not a bit is set but it hasn't grown its underlying storage array
"around" that index then it'll throw an `ArrayIndexOutOfBoundsException`.
The per-document use cases tend to show up in order and don't tend to
mind this too much. But the use case in aggregations, the per-bucket use
case, does. Because buckets are collected out of order all the time.

This changes `BitArray` so it'll return `false` if the index is too big
for the underlying storage. After all, that index *can't* have been set
or else we would have grown the underlying array. Logically, I believe
this makes sense. And it makes my life easy. At the cost of three lines.

*but* this adds an extra test to every call to `get`. I think this is
likely ok because it is "very close" to an array index lookup that
already runs the same test. So I *think* it'll end up merged with the
array bounds check.
2020-03-06 15:27:51 -05:00
Jay Modi a81460dbf5
Make watch history indices hidden (#52974)
This commit updates the template used for watch history indices with
the hidden index setting so that new indices will be created as hidden.

Relates #50251
Backport of #52962
2020-03-06 09:47:03 -07:00
Christoph Büscher 9e561c2921 Fix AbstractBulkByScrollRequest slices parameter via Rest (#53068)
Currently the AbstractBulkByScrollRequest accepts slice values of 0 via its
`setSlices` method, denoting the "auto" slicing behaviour that is usable by
settting the "slices=auto" parameter on rest requests. When using the High Level
Rest Client, however, we send the 0 value as an integer, which is then rejected
as invalid by `AbstractBulkByScrollRequest#parseSlices`. Instead of making
parsing of the rest request more lenient, this PR opts for changing the
RequestConverter logic in the client to translate 0 values to "auto" on the rest
requests.

Closes #53044
2020-03-06 15:38:04 +01:00
William Brafford d145b5536f
Serialize NodesInfoRequest as a set of strings (#53140) (#53202)
For Node Info to be pluggable, NodesInfoRequest must be able to carry
arbitrary strings. This commit reworks the internals of that class to
use a set rather than hard-coded boolean fields.

NodesInfoRequest defaults to specifying all values. We test for
this behavior as we refactor and use random testing for the
various combinations of metrics.

Add backwards compatibility for transport requests.
2020-03-06 09:07:49 -05:00
Marios Trivyzas 7ddbda4c20
Check for query cancellation during rewrite (#53166) (#53203)
With ExitableDirectoryReader in place, check for query cancellation
during QueryPhase#preProcess where the query rewriting takes place.

Follows: #52822

(cherry picked from commit 0d38626d8e6e9e2620a7a446b617a2ac42852461)
2020-03-06 11:04:01 +01:00
Alan Woodward c204137451 Deprecate BoolQueryBuilder's mustNot field (#53125)
The bool query builder in elasticsearch accepts both must_not and mustNot
fields. Given that leniency is abhorrent and must be eschewed, we should deprecate
the latter as it doesn't fit with the style of parameters elsewhere in the DSL.
2020-03-06 09:11:34 +00:00
Henning Andersen 2e924e4a83 Fix ClusterDisruptionIT.testAckedIndexing (#53169)
Use assertBusy when doing reroute after bridged disruption,
since it can return non-acked if a node is marked faulty
by follower check after disruption ended.

Closes #53064
2020-03-06 08:56:55 +01:00
Nhat Nguyen 5476a49833 Revert "upgrade to lucene-snapshot-fa75139efea (#53150) (#53151)"
This reverts commit 058113aa42.
2020-03-05 17:33:00 -05:00
Nhat Nguyen d456e8ffca Revert "Mute InternalEngineTests.testVersionOnPrimaryWithConcurrentRefresh"
This reverts commit 66788afa67.
2020-03-05 17:32:18 -05:00
Nhat Nguyen e9e209ae58 Revert "Mute InternalEngineTests.testRandomOperations"
This reverts commit d1cc2e68d5.
2020-03-05 17:32:11 -05:00
Nhat Nguyen dc78cc6131 Revert "Mute InternalEngineTests.testForceMergeWithSoftDeletesRetentionAndRecoverySource"
This reverts commit da8aac9e66.
2020-03-05 17:31:56 -05:00
Nhat Nguyen f11ae5fd14 Revert "Mute GatewayMetaStatePersistedStateTests.testDataOnlyNodePersistence"
This reverts commit 4452addf10.
2020-03-05 17:31:38 -05:00
James Baiera 4452addf10 Mute GatewayMetaStatePersistedStateTests.testDataOnlyNodePersistence 2020-03-05 16:44:03 -05:00
James Baiera da8aac9e66 Mute InternalEngineTests.testForceMergeWithSoftDeletesRetentionAndRecoverySource 2020-03-05 15:55:50 -05:00
James Baiera d1cc2e68d5 Mute InternalEngineTests.testRandomOperations 2020-03-05 15:09:47 -05:00
James Baiera 66788afa67 Mute InternalEngineTests.testVersionOnPrimaryWithConcurrentRefresh 2020-03-05 15:09:47 -05:00
Mayya Sharipova 7e2a9f58ee
script_score query errors on negative scores (#53133)
7.5 and 7.6 had a regression that allowed for
script_score queries to have negative scores.
We have corrected this regression in #52478.
This is an addition to #52478 that adds
a test and release notes.
2020-03-05 14:23:39 -05:00
Marios Trivyzas 487d442760
Implement Exitable DirectoryReader (#52822) (#53162)
Implement an Exitable DirectoryReader that wraps the original
DirectoryReader so that when a search task is cancelled the
DirectoryReaders also stop their work fast. This is usuful for
expensive operations like wilcard/prefix queries where the
DirectoryReaders can spend lots of time and consume resources,
as previously their work wouldn't stop even though the original
search task was cancelled (e.g. because of timeout or dropped client
connection).

(cherry picked from commit 67acaf61f33bc5f54e26541514d07e375c202e03)
2020-03-05 14:17:31 +01:00
Nik Everett 28df7ae5ed
Support multiple metrics in `top_metrics` agg (backport of #52965) (#53163)
This adds support for returning multiple metrics to the `top_metrics`
agg. It looks like:
```
POST /test/_search?filter_path=aggregations
{
  "aggs": {
    "tm": {
      "top_metrics": {
        "metrics": [
          {"field": "v"},
          {"field": "m"}
        ],
        "sort": {"s": "desc"}
      }
    }
  }
}
```
2020-03-05 08:12:01 -05:00
Alan Woodward 3cd4b97618 Remove UnknownNamedObjectException (#53105)
This was originally thrown from NamedXContentRegistry#parseNamedObject() but
that method now throws a NamedObjectNotFoundException, so this is unused.
2020-03-05 10:06:59 +00:00
Ignacio Vera 058113aa42
upgrade to lucene-snapshot-fa75139efea (#53150) (#53151) 2020-03-05 10:04:05 +01:00
Nik Everett 302980e0c4
Remove some ceremony in agg parsing (#53078) (#53117)
With #50871 aggrgations should now be parsed directly by an
`ObjectParser` or `ConstructingObjectParser` without the need for the
ceremonial `parse` method. This removes 9 of those `parse` methods and
parses the aggregation directly from their `ObjectParser`.
2020-03-04 13:06:41 -05:00
Tim Brooks f68917160e
Fix RemoteConnectionManager size() method (#52823)
Currently the remote connection manager will delegate the size() call to
the underlying cluster connection manager. This introduces the
possibility that call will return 1 before the nodeConnection method has
been triggered to add the connection to the remote connection list. This
can cause issues, as the ensureConnected method checks the connection
managers size and executes synchronously if the size is > 0. This leads
to a potential cluster not connected exception while we are still
waiting for the connection opened callback to be triggered.

This commit fixes this issue by using the remote connection manager's
size to report the connection manager's size.

Fixes #52029.
2020-03-04 09:53:22 -07:00
Yannick Welsch 8ab74fea58
[7.x] Add 7.6.2 as version (#53114) 2020-03-04 10:39:09 -06:00
Jake Landis f08ed1f69a
[7.x] add 6.8.8 as version (#53021) 2020-03-04 10:38:07 -06:00
Alan Woodward dfebbbf862 BoolQueryBuilder uses ObjectParser (#52880)
This commit removes the hand-rolled x-content parsing logic from BoolQueryBuilder
and instead uses an ObjectParser to handle parsing. It also removes the long-deprecated
(since version 6) disable_coord parameter.
2020-03-04 15:48:38 +00:00
Zachary Tong 3fcf598b92 Reduce deprecation log noise from DateIntervalWrapper (#52655)
Converts the deprecations to `deprecatedAndMaybeLog` to reduce the
number of times we log deprecations, since some of these could be called
at a high frequency (due to unconverted queries, aggs, etc)
2020-03-03 17:08:10 -05:00
Jay Modi c610e0893d
Introduce system index APIs for Kibana (#53035)
This commit introduces a module for Kibana that exposes REST APIs that
will be used by Kibana for access to its system indices. These APIs are wrapped
versions of the existing REST endpoints. A new setting is also introduced since
the Kibana system indices' names are allowed to be changed by a user in case
multiple instances of Kibana use the same instance of Elasticsearch.

Additionally, the ThreadContext has been extended to indicate that the use of
system indices may be allowed in a request. This will be built upon in the future
for the protection of system indices.

Backport of #52385
2020-03-03 14:11:36 -07:00
Nik Everett 7339427af5
Remove some deprecation warnings parsing aggs (backport of #53026) (#53072)
With #50871 aggrgations should now be parsed directly by an
`ObjectParser` or `ConstructingObjectParser` without the need for the
ceremonial `parse` method. This removes 10 of those `parse` methods and
parses the aggregation directly from their `ObjectParser`.
2020-03-03 15:27:49 -05:00
Luca Cavanna 8a05b670ca
Address MinAndMax generics warnings (#52642)
`MinAndMax` encapsulates min and max values for a shard. It uses generics to make sure that the values are of the same type and are also comparable. Though there are warnings whenever this class is currently used, which are addressed with this commit.

Relates to #49092
2020-03-03 16:08:10 +01:00
Adrien Grand cb868d2f5e
Introduce a `constant_keyword` field. (#49713) (#53024)
This field is a specialization of the `keyword` field for the case when all
documents have the same value. It typically performs more efficiently than
keywords at query time by figuring out whether all or none of the documents
match at rewrite time, like `term` queries on `_index`.

The name is up for discussion. I liked including `keyword` in it, so that we
still have room for a `singleton_numeric` in the future. However I'm unsure
whether to call it `singleton`, `constant` or something else, any opinions?

For this field there is a choice between
 1. accepting values in `_source` when they are equal to the value configured
    in mappings, but rejecting mapping updates
 2. rejecting values in `_source` but then allowing updates to the value that
    is configured in the mapping
This commit implements option 1, so that it is possible to reindex from/to an
index that has the field mapped as a keyword with no changes to the source.

Backport of #49713
2020-03-03 16:01:47 +01:00
Jason Tedor a154f9c657
Early return if no global checkpoint listeners (#53036)
When notifying global checkpoint listeners, we have an opportunity to
early return if there are not any registered listeners. This is
important since it saves some allocations, and also saves forking some
empty work to another thread. This commit adds an early return from
notifying listeners if there are not any registered.
2020-03-02 23:28:22 -05:00
Stuart Tettemer 210aab0935
Settings: AffixSettings as validator dependencies (#52973) (#52982)
Allow AffixSetting as validator dependencies.  If a validator
specifies AffixSettings as a dependency, then `validate(T, Map)`
will have the concrete setting in a map.

Backport of: #52973, 1e0ba70
Fixes: #52933
2020-02-29 09:38:46 -07:00
Nhat Nguyen e6755afeeb
Upgrade to Lucene 8.5.0-snapshot-c4475920b08 (#52950) (#52977)
To give LUCENE-9228 more CI cycles
2020-02-29 09:29:16 -05:00
Jay Modi 1cd0eee723
Remove TODO in IndexNameExpressionResolver (#52969)
This commit removes a TODO in the IndexNameExpressionResolver that
indicated the API should use a Set instead of a List. However, this
TODO was not completely correct since the ordering of arguments matters
due to negations when evaluating wildcards and since we also allow
a list of patterns like `*,-foo,*`, which would have a different
meaning even when using a Set with insertion ordering.

Relates #52788
Backport of #52963
2020-02-28 13:56:28 -07:00
Adrien Grand 331d4bb0af
HybridDirectory should mmap postings. (#52641) (#52873)
Since version 8.4, `MMapDirectory` has an optimization to read long[]
arrays directly in little endian order, which postings leverage. So it'd
be more efficient to open postings with `MMapDirectory`.

I refactored a bit the existing logic to better explain why every listed
file extension is open with `mmap`.
2020-02-28 18:45:46 +01:00
Martijn van Groningen 6aa9aaa2c6
Add validation for dynamic templates (#52890)
Backport of #51233 to the seven dot x branch.

Tries to load a `Mapper` instance for the mapping snippet of a dynamic template.
This should catch things like using an analyzer that is undefined or mapping attributes that are unused.

This is best effort:
* If `{{name}}` placeholder is used in the mapping snippet then validation is skipped.
* If `match_mapping_type` is not specified then validation is performed for all mapping types.
  If parsing succeeds with a single mapping type then this the dynamic mapping is considered valid.

If is detected that a dynamic template mapping snippet is invalid at mapping update time then the mapping update is failed for indices created on 8.0.0-alpha1 and later. For indices created on prior version a deprecation warning is omitted instead. In 7.x clusters the mapping update will never fail in case of an invalid dynamic template mapping snippet and a deprecation warning will always be omitted.

Closes #17411
Closes #24419

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2020-02-28 10:35:04 +01:00
Nik Everett 407101c39b
Clean and document sorting with partialy built buckets (backport of #52769) (#52925)
The `terms` aggregation can be sortd by the results of its
sub-aggregations. Because it uses that sorting for filtering to the
top-n it tries not to construct all of the buckets for the child
aggregations. This has its own interesting problem around reduction, but
they aren't super relevant to this change. This change moves that
optimization from the `TermsAggregator` and into the aggregators being
sorted on. This should make it more clear what is going on and it
unifies this optimization with validating the sort.

Finally, this should enable some minor optimizations to save a few
comparisons when sorting multi-valued buckets. I'll get those in a
follow up because they are now *fairly* obvious. They probably won't be
a huge performance improvement, but it'll be nice anyway.
2020-02-27 17:50:55 -05:00
Nik Everett 1d1956ee93
Add size support to `top_metrics` (backport of #52662) (#52914)
This adds support for returning the top "n" metrics instead of just the
very top.

Relates to #51813
2020-02-27 16:12:52 -05:00
Lee Hinman e139d70abe Remove TODO in MaxSizeCondition (#52854)
Similar to what we did in #52794, this removes the TODO.

Relates again to #52505
2020-02-27 09:29:12 -07:00
Dan Hermann 3c8b46a8c1
[7.x] Handle errors when evaluating if conditions in processors (#52892) 2020-02-27 09:00:51 -06:00
hezhen Zhang 280d59c724 Append index name for the source of the cluster put-mapping task (#52690)
Add index name(s) into the source for the cluster state update done when putting mapping.
This ensures that the pending tasks API includes information on source indices.
2020-02-27 12:16:24 +01:00
David Turner 52fa465300
Cache completion stats between refreshes (#52872)
Computing the stats for completion fields may involve a significant amount of
work since it walks every field of every segment looking for completion fields.
Innocuous-looking APIs like `GET _stats` or `GET _cluster/stats` do this for
every shard in the cluster. This repeated work is unnecessary since these stats
do not change between refreshes; in many indices they remain constant for a
long time.

This commit introduces a cache for these stats which is invalidated on a
refresh, allowing most stats calls to bypass the work needed to compute them on
most shards.

Closes #51915
Backport of #51991
2020-02-27 10:01:24 +00:00
Nhat Nguyen 814c275f35 Add more assertions to testMaybeFlush (#52792)
We aren't able to reproduce or figure out the reason that failed this test.
This commit adds more assertions so we can narrow the scope.

Relates #52223
2020-02-26 17:08:18 -05:00
Nhat Nguyen 0a15a6bfad Fix testSeqNoCollision (#52588)
Adjusts the assertion as we trim translog more eagerly since #52556.

Relates #52556
Closes #52148
2020-02-26 17:08:18 -05:00
Nhat Nguyen 87e765609e Fix testResyncAfterPrimaryPromotion (#52615)
Adjusts the assertion as we might eagerly clean up translog during resync since #52556

Relates #52556
Closes #52598
2020-02-26 17:08:18 -05:00
Nhat Nguyen 5aa612c275 Fix testRestoreLocalHistoryFromTranslog (#52441)
Asserts that no new operations are made into the translog since we re-opened the engine.

Relates #51905
Closes #52410
2020-02-26 17:08:18 -05:00
Nhat Nguyen a92bf5ec61 Fix IndexShardIT#testMaybeFlush (#52247)
Since #51905, we use the local checkpoint of the safe commit to
calculate the number of uncommitted operations of a translog stats. If a
periodic flush triggered by afterWriteOperation completes before we sync
translog, then the last commit is not safe. We also need to sync
translog from Engine instead of the translog so that we can advance the
safe commit.

Relates #51905
Closes #52223
2020-02-26 17:08:18 -05:00
Nhat Nguyen d7fe135d90 Fix testPrepareIndexForPeerRecovery (#52245)
Since #51905, we skip translog recovery if the local checkpoint of the
safe commit equals to the global checkpoint. This change adjusts the
test not to create a new snapshot in that case.

Closes #52221
Relates #51905
2020-02-26 17:08:18 -05:00
Yannick Welsch 82ab1bc1ff Separate translog from index deletion conditions (#52556)
Separates the translog from the index deletion conditions (allowing the translog to be cleaned
up more eagerly), and avoids taking the write lock on the translog if no clean-up is actually
necessary.
2020-02-26 17:08:18 -05:00
Nhat Nguyen db6b9c21c7 Use local checkpoint to calculate min translog gen for recovery (#51905)
Today we use the translog_generation of the safe commit as the minimum
required translog generation for recovery. This approach has a
limitation, where we won't be able to clean up translog unless we flush.
Reopening an already recovered engine will create a new empty translog,
and we leave it there until we force flush.

This commit removes the translog_generation commit tag and uses the
local checkpoint of the safe commit to calculate the minimum required
translog generation for recovery instead.

Closes #49970
2020-02-26 17:08:18 -05:00
Dan Hermann 3ffd34617f
Switch to AtomicLong for ingestCurrent metric to prevent negative values (#52581) (#52834) 2020-02-26 13:26:26 -06:00
Jay Modi 07ef8ccff4
Allow dynamic updates for index.hidden setting (#52837)
This commit changes the `index.hidden` setting from being final to a
dynamic setting. While the setting being final allows for easier
reasoning about an index, making this setting update-able has more
benefits in that we can upgrade existing indices to be hidden and it
will enable future features that would dynamically make indices hidden.

Backport of #52772
2020-02-26 11:46:29 -07:00
Nik Everett bfaa487757
Switch pipeline agg parsing to ContextParser (#52776) (#52832)
We've pretty well settled on `ContextParser` for a generic interface to
`ObjectParser`-like-things. This switches the interface used for
building parsing pipeline aggregations to `ContextParser` which saves a
couple of little wrappers around `ObjectParser`.
2020-02-26 12:57:20 -05:00
Tim Brooks be8d704e2b
Remove seeds depedency for remote cluster settings (#52829)
Currently 3 remote cluster settings (ping interval, skip unavailable,
and compression) have a dependency on the seeds setting being
comfigured. With proxy mode, it is now possible that these settings the
seeds setting has not been configured. This commit removes this
dependency and adds new validation for these settings.
2020-02-26 10:17:25 -07:00
Adrien Grand 1807f86751
Generalize how queries on `_index` are handled at rewrite time (#52815)
Generalize how queries on `_index` are handled at rewrite time (#52486)

Since this change refactors rewrites, I also took it as an opportunity to adrress #49254: instead of returning the same queries you would get on a keyword field when a field is unmapped, queries get rewritten to a MatchNoDocsQueryBuilder.

This change exposed a couple bugs, like the fact that the percolator doesn't rewrite queries at query time, or that the significant_terms aggregation doesn't rewrite its inner filter, which I fixed.

Closes #49254
2020-02-26 15:37:43 +01:00
Luca Cavanna 9e38125464
Clarify when shard iterators get sorted (#52810)
Currently we have two ways to create a GroupShardsIterator: one that will resort the iterators based on their natural ordering, and another one that will leave them in their original order. This is currently done through two constructors, one that accepts a single argument which does the sorting, and another which accepts a second boolean argument to control whether sorting should happen or not. This second constructor is only called externally to disable the sorting.

By introducing a specific method to create a sorted shard iterator we clarify and make it easier to track when we do sort and when we do not as the iterators are externally sorted.
2020-02-26 13:58:20 +01:00
Jim Ferenczi a73ad248e8
Fix backport of #46731 (#52744)
This change fixes the incomplete backport of #46731 in 7.x (as of 7.5).
We now check if `max_children` is set on the top level nested sort and fails with an
exception if it's not the case.

Relates #46731
Closes #52202
2020-02-26 10:46:51 +01:00
Sachin Frayne d3c0a2f013 Improve the error message when loading text fielddata. (#52753)
Emphasize keyword over fielddata as the preferred way to use String fields for aggregations or sorting.
2020-02-25 15:45:44 -08:00
Lee Hinman 662f21fcea Remove TODO in MaxAgeCondition serialization (#52794)
* Remove TODO in MaxAgeCondition serialization

This removes the TODO with a message for any future readers regarding the code in question.

Resolves #52505
2020-02-25 15:47:36 -07:00
Tim Brooks c8ef9649e2
Force execution of finish shard bulk request (#51957) (#52484)
Currently the shard bulk request can be rejected by the write threadpool
after a mapping update. This introduces a scenario where the mapping
listener thread will attempt to finish the request and fsync. This
thread can potentially be a transport thread. This commit fixes this
issue by forcing the finish action to happen on the write threadpool.

Fixes #51904.
2020-02-25 14:37:11 -07:00
Nhat Nguyen 848d3bc153 Revert "Fix testKeepTranslogAfterGlobalCheckpoint"
This reverts commit a88d54eb2d.
2020-02-25 14:12:35 -05:00
Nhat Nguyen a88d54eb2d Fix testKeepTranslogAfterGlobalCheckpoint
Read the last synced global checkpoint after flushing as
we might advance it during committing.

CI: https://gradle-enterprise.elastic.co/s/7o6qengg4gva2
2020-02-25 11:49:24 -05:00
Alan Woodward 638f3e4183 Use ByteBuffersDirectory rather than RAMDirectory (#52768)
Lucene's RAMDirectory has been deprecated. This commit replaces all uses of
RAMDirectory in elasticsearch with the newer ByteBuffersDirectory. Most uses
are in tests, but the percolator and painless executor may get some small speedups.
2020-02-25 15:46:35 +00:00
Alan Woodward 18663b0a85 Don't index ranges including NOW in percolator (#52748)
Currently, date ranges queries using NOW-based date math are rewritten to
MatchAllDocs queries when being preprocessed for the percolator. However,
since we added the verification step, this can result in incorrect matches when
percolator queries are run without scores. This commit changes things to instead
wrap date queries that use NOW with a new DateRangeIncludingNowQuery.
This is a simple wrapper query that returns its delegate at rewrite time, but it can
be detected by the percolator QueryAnalyzer and be dealt with accordingly.

This also allows us to remove a method on QueryRewriteContext, and push all
logic relating to NOW-based ranges into the DateFieldMapper.

Fixes #52617
2020-02-25 12:18:16 +00:00
Ryan Ernst 5fba8cbc7b Rename local Environment var in Node to avoid confusion (#52602)
When the Node class is being constructed, an initial environment is
passed in with the initial settings for the node. Once the plugin
servicie is initialized, the final Environment+Settings are created, at
which point the initial environment should no longer be used. This
commit renames the constructor arg to avoid naming clashes with the
final environment variable.
2020-02-24 11:14:46 -08:00
Lee Hinman 7d9de8412a
[7.x] fix npe in RestPluginsAction (#52620) (de56de9a) (#52721)
Relates #45321

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Kaihong.Wang <kyra.wkh@alibaba-inc.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-24 11:57:01 -07:00
Mayya Sharipova 034b1c0ba3
Correct boost calculation in script_score query (#52478) (#52724)
Before boost in script_score query was wrongly applied only to the subquery.
This commit makes sure that the boost is applied to the whole score
that comes out of script.

Closes #48465
2020-02-24 13:48:21 -05:00
Adrien Grand f993ef80f8
Move the terms index of `_id` off-heap. (#52518)
In #42838 we moved the terms index of all fields off-heap except the
`_id` field because we were worried it might make indexing slower. In
general, the indexing rate is only affected if explicit IDs are used, as
otherwise Elasticsearch almost never performs lookups in the terms
dictionary for the purpose of indexing. So it's quite wasteful to
require the terms index of `_id` to be loaded on-heap for users who have
append-only workloads. Furthermore I've been conducting benchmarks when
indexing with explicit ids on the http_logs dataset that suggest that
the slowdown is low enough that it's probably not worth forcing the terms
index to be kept on-heap. Here are some numbers for the median indexing
rate in docs/s:

| Run | Master  | Patch   |
| --- | ------- | ------- |
| 1   | 45851.2 | 46401.4 |
| 2   | 45192.6 | 44561.0 |
| 3   | 45635.2 | 44137.0 |
| 4   | 46435.0 | 44692.8 |
| 5   | 45829.0 | 44949.0 |

And now heap usage in MB for segments:

| Run | Master  | Patch    |
| --- | ------- | -------- |
| 1   | 41.1720 | 0.352083 |
| 2   | 45.1545 | 0.382534 |
| 3   | 41.7746 | 0.381285 |
| 4   | 45.3673 | 0.412737 |
| 5   | 45.4616 | 0.375063 |

Indexing rate decreased by 1.8% on average, while memory usage decreased
by more than 100x.

The `http_logs` dataset contains small documents and has a simple
indexing chain. More complex indexing chains, e.g. with more fields,
ingest pipelines, etc. would see an even lower decrease of indexing rate.
2020-02-24 18:14:12 +01:00
Alan Woodward 7dc41a3b83 Use BoostQuery rather than FunctionScoreQuery for query-time indices_boost (#52272)
This is a trivial change, but it should result in a slightly more efficient query boost.
2020-02-24 14:41:46 +00:00
Nik Everett d26d7721ea
Continue realizing sorting by aggregations (backport of #52298) (#52667)
This drops more of the `instanceof`s from `AggregationPath`. There are
still a couple in `AggregationPath`. And I ended up moving two into
`BucketsAggregator`, but I think this is still an improvement!
2020-02-23 17:13:55 -05:00
bellengao 02cb5b6c0e Return 429 status code on read_only_allow_delete index block (#50166)
We consider index level read_only_allow_delete blocks temporary since
the DiskThresholdMonitor can automatically release those when an index
is no longer allocated on nodes above high threshold.

The rest status has therefore been changed to 429 when encountering this
index block to signal retryability to clients.

Related to #49393
2020-02-22 16:24:25 +01:00
Jay Modi 8abfda0b59
Rename assertThrows to prevent naming clash (#52651)
This commit renames ElasticsearchAssertions#assertThrows to
assertRequestBuilderThrows and assertFutureThrows to avoid a
naming clash with JUnit 4.13+ and static imports of these methods.
Additionally, these methods have been updated to make use of
expectThrows internally to avoid duplicating the logic there.

Relates #51787
Backport of #52582
2020-02-21 13:30:11 -07:00
Stuart Tettemer 376932a47d
Scripting: split out compile limits and caching (#52498) (#52652)
Phase 1 of adding compilation limits per context.
* Refactor rate limiting and caching into separate class,
  `ScriptCache`,  which will be used per context.
* Disable compilation limit for certain tests.

Backport of 0866031
Refs: #50152
2020-02-21 12:10:51 -07:00
Jay Modi f3f6ff97ee
Single instance of the IndexNameExpressionResolver (#52604)
This commit modifies the codebase so that our production code uses a
single instance of the IndexNameExpressionResolver class. This change
is being made in preparation for allowing name expression resolution
to be augmented by a plugin.

In order to remove some instances of IndexNameExpressionResolver, the
single instance is added as a parameter of Plugin#createComponents and
PersistentTaskPlugin#getPersistentTasksExecutor.

Backport of #52596
2020-02-21 07:50:02 -07:00
markharwood 96d603979b
Upgrade Lucene to 8.5.0-snapshot-b01d7cb (#52584)
Upgrading 7x to same Lucene 8.5 version used in master
2020-02-21 10:25:03 +00:00
Armin Braun 0a09e15959
Add Caching for RepositoryData in BlobStoreRepository (#52341) (#52566)
Cache latest `RepositoryData` on heap when it's absolutely safe to do so (i.e. when the repository is in strictly consistent mode).

`RepositoryData` can safely be assumed to not grow to a size that would cause trouble because we often have at least two copies of it loaded at the same time when doing repository operations. Also, concurrent snapshot API status requests currently load it independently of each other and so on, making it safe to cache on heap and assume as "small" IMO.

The benefits of this move are:
* Much faster repository status API calls
   * listing all snapshot names becomes instant
   * Other operations are sped up massively too because they mostly operate in two steps: load repository data then load multiple other blobs to get the additional data
* Additional cloud cost savings
* Better resiliency, saving another spot where an IO issue could break the snapshot
* We can simplify a number of spots in the current code that currently pass around the repository data in tricky ways to avoid loading it multiple times in follow ups.
2020-02-21 10:20:07 +01:00
Armin Braun 4bb780bc37
Refactor Inflexible Snapshot Repository BwC (#52365) (#52557)
* Refactor Inflexible Snapshot Repository BwC (#52365)

Transport the version to use for  a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere.
Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.
2020-02-21 09:14:34 +01:00
Ignacio Vera 107f00a4ec
Add support for multipoint geoshape queries (#52133) (#52553)
Currently multi-point queries are not supported when indexing your data using BKD-backed geoshape strategy. This commit removes this limitation.
2020-02-21 07:45:53 +01:00
Yannick Welsch d76358c875
Deprecate fixed_auto_queue_size thread pool type (#52399)
Relates #52280
2020-02-20 11:11:06 +01:00
Yannick Welsch 3afb5ca133 Fix synchronization in ByteSizeCachingDirectory (#52512)
One particular code place was synchronizing on the wrong object.
2020-02-19 16:10:39 +01:00
Przemysław Witek 7cd997df84
[ML] Make ml internal indices hidden (#52423) (#52509) 2020-02-19 14:02:32 +01:00
Ignacio Vera 8d2261fe47
Refactor GeoShapeIndexer by extracting polygon / line decomposers (#52422) (#52506)
Refactor GeoShapeIndexer. We extract Polygon and Line decomposers which are in charge of breaking a shape around the dateline if needed.
2020-02-19 12:04:29 +01:00
Henning Andersen 9d40277d4c Deciders should not by default collect yes'es (#52438)
AllocationDeciders would collect Yes decisions when not asking for debug
info. Changed to only include Yes decisions when debug is requested
(explain).
2020-02-19 11:18:03 +01:00
Henning Andersen d4bc3b75dc Reindex: allow comma separated source indices (#52044)
Added ability to specify comma separated list of source indices without
array. Also fixed so that empty string results in validation error
rather than index does not exist.

Closes #51949
2020-02-19 09:23:15 +01:00
David Turner baf184c93f Avoid using WindowsFS in ClusterRerouteIT (#52488)
Issue #52000 looks like a case of cluster state updates being slower than
expected, but it seems that these slowdowns are relatively rare: most
invocations of `testDelayWithALargeAmountOfShards` take well under a minute in
CI, but there are occasional failures that take 6+ minutes instead.  When it
fails like this, cluster state persistence seems generally slow: most are
slower than expected, with some small updates even taking over 2 seconds to
complete.

The failures all have in common that they use `WindowsFS` to emulate Windows'
behaviour of refusing to delete files that are still open, by tracking all
files (really, inodes) and validating that deleted files are really closed
first. There is a suggestion that this is a little slow in the Lucene test
framework [1]. To see if we can attribute the slowdown to that common factor,
this commit suppresses the use of `WindowsFS` for this test suite.

[1] 4a513fa99f/lucene/test-framework/src/java/org/apache/lucene/util/TestRuleTemporaryFilesCleanup.java (L166)
2020-02-19 07:52:49 +00:00
Tim Brooks 8038f9bba6
Do not lock when generating time based uuid (#52436)
Currently we lock when generating time based uuids. The lock is
implemented to prevent concurrent writes to the last timestamp. The uuid
generation is an area of contention when indexing. This commit modifies
the code to use atomic compare and set operations to update the last
timestamp.
2020-02-18 09:55:51 -07:00
Tim Brooks 7fcd997b39
Do not lock on settings keyset if keys initialized (#52435)
Every time a setting#exist call is made we lock on the keyset to ensure
that it has been initialized. This a heavyweight operation that only
should be done once. This commit moves to a volatile read instead to
prevent unnecessary locking.
2020-02-18 09:36:07 -07:00
Tim Brooks a742c58d45
Extract a ConnectionManager interface (#51722)
Currently we have three different implementations representing a
`ConnectionManager`. There is the basic `ConnectionManager` which
holds all connections for a cluster. And a remote connection manager
which support proxy behavior. And a stubbable connection manager for
tests. The remote and stubbable instances use the delegate pattern,
so this commit extracts an interface for them all to implement.
2020-02-18 09:19:24 -07:00
Benedict Jin 0c4f7dc193
Minor code improvements (#51921)
Fix some whitespaces, comments and usage of `this.`.

(cherry picked from commit 9f59900bf6389172811eb2279c17a2dc7cd9dfdf)
2020-02-18 16:00:05 +01:00
David Turner 3d57a78deb Add extra logging for investigation into #52000 (#52472)
It looks like #52000 is caused by a slowdown in cluster state application
(maybe due to #50907) but I would like to understand the details to ensure that
there's nothing else going on here too before simply increasing the timeout.
This commit enables some relevant `DEBUG` loggers and also captures stack
traces from all threads rather than just the three hottest ones.
2020-02-18 13:02:33 +00:00
Armin Braun 57d6dd7e31
Fix Non-Verbose Snapshot List Missing Empty Snapshots (#52433) (#52456)
We were not including snapshots without indices in the non-verbose
listing because we used the snapshot -> indices mapping to get the
snapshots.
2020-02-18 11:37:53 +01:00
Armin Braun cc628748e1
Optimize FilterStreamInput for Network Reads (#52395) (#52403)
When `FilterStreamInput` wraps a Netty `ByteBuf` based stream it
did not forward the bulk primitive reads to the delegate.
These are optimized on the delegate but if they're not forwarded
then the delegate will be called e.g. 4 times to read an `int`.
This happens for essentially all network reads prior to this
change because they all run from a `NamedWritableAwareStreamInput`.

This also required optimising `BufferedChecksumStreamInput` individually to use bulk reads from the buffer because it implicitly assumed that the filter stream input  wouldn't override any of the bulk operations.
2020-02-17 13:07:19 +01:00
Nik Everett 146def8caa
Implement top_metrics agg (#51155) (#52366)
The `top_metrics` agg is kind of like `top_hits` but it only works on
doc values so it *should* be faster.

At this point it is fairly limited in that it only supports a single,
numeric sort and a single, numeric metric. And it only fetches the "very
topest" document worth of metric. We plan to support returning a
configurable number of top metrics, requesting more than one metric and
more than one sort. And, eventually, non-numeric sorts and metrics. The
trick is doing those things fairly efficiently.

Co-Authored by: Zachary Tong <zach@elastic.co>
2020-02-14 11:19:11 -05:00
Nik Everett 53b6583fed
Decode max and min optimization more carefully (#52336) (#52358)
Fixes the the no-query optimization for `min` and `max` aggregations
for `date_nanos` fields by delegating decoding dates "through" their
`resolution` member.

Closes #52220
2020-02-14 07:07:56 -05:00
Julie Tibshirani 0d7165a40b Standardize naming of fetch subphases. (#52171)
This commit makes the names of fetch subphases more consistent:
* Now the names end in just 'Phase', whereas before some ended in
  'FetchSubPhase'. This matches the query subphases like AggregationPhase.
* Some names include 'fetch' like FetchScorePhase to avoid ambiguity about what
  they do.
2020-02-13 13:00:46 -08:00
Nik Everett 2dac36de4d
HLRC support for string_stats (#52163) (#52297)
This adds a builder and parsed results for the `string_stats`
aggregation directly to the high level rest client. Without this the
HLRC can't access the `string_stats` API without the elastic licensed
`analytics` module.

While I'm in there this adds a few of our usual unit tests and
modernizes the parsing.
2020-02-12 19:25:05 -05:00
Nik Everett 7efce22f19
Fix a DST error in date_histogram (backport #52016) (#52237)
When `date_histogram` attempts to optimize itself it for a particular
time zone it checks to see if the entire shard is within the same
"transition". Most time zone transition once every size months or
thereabouts so the optimization can usually kicks in.

*But* it crashes when you attempt feed it a time zone who's last DST
transition was before epoch. The reason for this is a little twisted:
before this patch it'd find the next and previous transitions in
milliseconds since epoch. Then it'd cast them to `Long`s and pass them
into the `DateFieldType` to check if the shard's contents were within
the range. The trouble is they are then converted to `String`s which are
*then* parsed back to `Instant`s which are then convertd to `long`s. And
the parser doesn't like most negative numbers. And everything before
epoch is negative.

This change removes the
`long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of
passing the `long` -> `Instant` -> `long` which avoids the fairly complex
parsing code and handles a bunch of interesting edge cases around
epoch. And other edge cases around `date_nanos`.

Closes #50265
2020-02-12 17:57:04 -05:00
Nhat Nguyen 12cb6dcefe Fix testFlushOnInactive (#52275)
We need to reduce the translog sync interval for indices with translog
async setting so that we can have the safe commit in the assertBusy
interval. This is needed since #51905, where we use the local checkpoint
of the safe commit to calculate the number of uncommitted operations of
a translog stats.

Closes #52251
Relates #51905
2020-02-12 17:19:02 -05:00
Jay Modi 5bcc6fce5c
Remove DeprecationLogger from route objects (#52285)
This commit removes the need for DeprecatedRoute and ReplacedRoute to
have an instance of a DeprecationLogger. Instead the RestController now
has a DeprecationLogger that will be used for all deprecated and
replaced route messages.

Relates #51950
Backport of #52278
2020-02-12 15:05:41 -07:00
Marios Trivyzas dac720d7a1
Add a cluster setting to disallow expensive queries (#51385) (#52279)
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have
usually slow performance cannot be executed and an error message
is returned.

- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries

Closes: #29050
(cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)
2020-02-12 22:56:14 +01:00
Ryan Ernst c07f46409c Fix single newline in logging output stream buffer (#52253)
The buffer in LoggingOutputStream skips flushing when only a newline
appears. However, if a windows newline appeared, the buffer length was
not reset. This commit resets the length so the \r does not appear in
the next logging message.

closes #51838
2020-02-12 10:48:55 -08:00
Nhat Nguyen e098e837f7 Fix testShouldPeriodicallyFlushAfterMerge (#52243)
MockRandomMergePolicy randomly determines if a segment should use a 
compound format. This can cause a force merge performing two merges: (1)
merging to a single segment, (2) rewriting the new segment using the
compound format. If the second merge completes after we have flushed,
then it can flip the flag shouldPeriodicallyFlushAfterBigMerge to true.

Closes #52205
2020-02-12 11:25:39 -05:00
Gordon Brown d48ce12920
Convert ILM and SLM histories into hidden indices (#51456)
Modifies SLM's and ILM's history indices to be hidden indices for added
protection against accidental querying and deletion, and improves
IndexTemplateRegistry to handle upgrading index templates.

Also modifies the REST test cleanup to delete hidden indices.
2020-02-11 14:18:55 -07:00
Nik Everett 86d5211c05
Make sorting by an agg results a real abstraction (#52007) (#52212)
This removes a bunch of `instanceof`s in favor of two new methods on
`InernalAggregation`. The default implementations of these methods just
throw exceptions explaining that you can't sort on this aggregation.
They are overridden by all of the classes that used to have `instanceof`
checks against them.

I doubt this is really any faster in practice. The real benefit here is
that it is a little more obvious *that* you can sort by the results of
an aggregation and it should be *much* more obvious where to look at
*how* aggregations sort themselves.

There are still a bunch more `instanceof`s in left in `AggregationPath`
but those will wait for a followup change.
2020-02-11 12:58:40 -05:00
Hendrik Muhs 098380e483 Percentiles aggregation validation checks for range (#51871)
disallow to specify percentile out of range [0,100]. This also fixes a problem in transform by failing
validation if an invalid percentile configuration is used.
2020-02-11 17:25:39 +01:00
David Roberts 473468d763 [ML] Better error when persistent task assignment disabled (#52014)
Changes the misleading error message when attempting to open
a job while the "cluster.persistent_tasks.allocation.enable"
setting is set to "none" to a clearer message that names the
setting.

Closes #51956
2020-02-11 15:23:21 +00:00
Zachary Tong 87854573e4 Add version constant for 7.6.1 2020-02-11 09:44:43 -05:00
Igor Motov 667e1a5225
Add Boxplot Aggregation (#52174)
Adds a `boxplot` aggregation that calculates min, max, medium and the first
and the third quartiles of the given data set.

Closes #33112
2020-02-11 09:38:17 -05:00
David Turner 00b9098250 Ignore timeouts with single-node discovery (#52159)
Today we use `cluster.join.timeout` to prevent nodes from waiting indefinitely
if joining a faulty master that is too slow to respond, and
`cluster.publish.timeout` to allow a faulty master to detect that it is unable
to publish its cluster state updates in a timely fashion. If these timeouts
occur then the node restarts the discovery process in an attempt to find a
healthier master.

In the special case of `discovery.type: single-node` there is no point in
looking for another healthier master since the single node in the cluster is
all we've got. This commit suppresses these timeouts and instead lets the node
wait for joins and publications to succeed no matter how long this might take.
2020-02-11 14:15:01 +00:00
David Kyle 343ced42be Mute LoggingOutputStreamTests.testMaxBuffer (#52193)
Relates to https://github.com/elastic/elasticsearch/issues/51838
2020-02-11 11:46:17 +00:00
Gordon Brown 350288ddf8
Check dot-index rules after template application (#52087)
Previously, the dot-index rules (namely, that indices with dot-prefixed
names should be either hidden indices or system indices) was done
before* template application, and so only checked for the `index.hidden`
setting in the request, ignoring if that setting was set via a template.

This commit moves that check to a different method, which is applied
after templates have been resolved and applied to the index settings.
2020-02-10 17:01:59 -07:00
Ryan Ernst 88cf8ac0a8 Fix windows empty line in logging capture (#52162)
This commit fixes another edge case in handling windows newlines in our
capture of stdout/stderr to log4j. The case is that the \r appears at
the beginning of the buffer when flushing, which would unintentionally
be emitted as an empty string. This commit skips the flush if only a \r
was found.

closes #51838
2020-02-10 13:29:50 -08:00
Julie Tibshirani 28a8db730f In FieldTypeLookup, factor out flat object field logic. (#52091)
Currently, the logic for looking up `flattened` field types lives in the
top-level `FieldTypeLookup`. This PR moves it into a dedicated class
`DynamicKeyFieldTypeLookup`.
2020-02-10 10:44:02 -08:00
Armin Braun d8169e5fdc
Don't Upload Redundant Shard Files (#51729) (#52147)
Segment(s) info blobs are already stored with their full content
in the "hash" field in the shard snapshot metadata as long as they are
smaller than 1MB. We can make use of this fact and never upload them
physically to the repo.
This saves a non-trivial number of uploads and downloads when restoring
and might also lower the latency of searchable snapshots since they can save
phyiscally loading this information as well.
2020-02-10 16:50:09 +01:00
Ignacio Vera 80e3c97210 Upgrade to lucene-8.5.0-snapshot-d62f6307658 (#52039) (#52130) 2020-02-10 10:13:22 +01:00
Alan Woodward 9b7e688f5b Don't use a static QueryShardResult for a null instance (#52063)
Fixes #52042
2020-02-10 09:03:43 +00:00
Ioannis Kakavas 343fb36c7f Test modifications for FIPS 140 mode (#51832) (#52128)
- Enable SunJGSS provider for Kerberos tests
- Handle the fact that in the decrypt method in KeyStoreWrapper might
not throw immediately when the GCM cipher is from BouncyCastle FIPS
and we end up with a DataInputStream that has reached it's end.
- Disable tests, jarHell, testingConventions for ingest attachment
plugin. We don't support this plugin (and document this) in FIPS
mode.
- Don't attempt to install ingest-attachment in smoke-test-plugins
2020-02-10 10:57:03 +02:00
Jay Modi 3edadfefd0 RestHandlers declare handled routes (#52123)
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.

This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.

Closes #51622

Co-authored-by: Jason Tedor <jason@tedor.me>

Backport of #51950
2020-02-09 22:48:32 -07:00
Nhat Nguyen 80a9a08b05 Fix leaking searcher when shards are removed or relocated (#52099)
We might leak a searcher if the target shard is removed (i.e., its index
is deleted) or relocated while we are creating a SearchContext from a
SearchRewriteContext.

Relates #51708
Closes #52021

I labelled this non-issue for an unreleased bug introduced in #51708.
2020-02-09 22:13:35 -05:00
Armin Braun 90eb6a020d Remove Redundant Loading of RepositoryData during Restore (#51977) (#52108)
We can just put the `IndexId` instead of just the index name into the recovery soruce and
save one load of `RepositoryData` on each shard restore that way.
2020-02-09 21:44:18 +01:00
Nhat Nguyen 9f541d909d Always create search context for scroll queries (#52078)
We need to either exclude null responses from the scroll search response
or always create a search context for every target shards, although that
scroll query can be written to match_no_docs. Otherwise, we won't find
search_context for subsequent scroll requests.

This commit implements the latter option as it's less error-prone.

Relates #51708
2020-02-08 13:01:01 -05:00
Armin Braun b77ef1f61b
Cleanup some Dead Code in o.e.index.store (#52045) (#52084)
One obviously unused method and an incorrect Javadoc that
referenced an otherwise unused class.
2020-02-08 12:14:51 +01:00
Mark Vieira e5a9e44ca4
Mute IndicesRequestCacheIT.testQueryRewriteDatesWithNow()
Signed-off-by: Mark Vieira <portugee@gmail.com>
2020-02-07 13:14:32 -08:00
Julie Tibshirani 337d73a7c6 Rename MapperService#fullName to fieldType.
The new name more accurately describes what the method returns.
2020-02-07 10:35:53 -08:00
Tanguy Leroux 7c6264b28c Mute IndicesRequestCacheIT.testQueryRewrite()
Relates #32827
2020-02-07 19:44:32 +02:00
Armin Braun 91e938ead8
Add Trace Logging of REST Requests (#51684) (#52015)
Being able to trace log all REST requests to a node would make debugging
a number of issues a lot easier.
2020-02-07 09:03:20 +01:00
Jim Ferenczi 0f333c89b9
Always rewrite search shard request outside of the search thread pool (#51708) (#51979)
This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the other can return null ones. This is needed since creating true empty responses on shards require to create concrete aggregators which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change.
This change is also important for #49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the top worst document in the global queue can be pass to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.
2020-02-06 10:53:11 +01:00
Jim Ferenczi fb710cc62b Remove the query builder serialization from QueryShardException message (#51885)
QueryBuilders that throw exceptions on shards when building the Lucene query
returns the full serialization of the query builder in the exception message.
For large queries that fails to execute due to the max boolean clause, this means
that we keep a reference of these big messages for every shard that participate
in the request. In order to limit the memory needed to hold these query shard
exceptions in the coordinating node, this change removes the query builder
serialization from the shard exception. The query is known by the user so
there should be no need to repeat it on every shard exception. We could also
omit the entire stack trace for known bad request exception but it would deserve
a separate issue/pr.

Closes #51843
Closes #48910
2020-02-06 08:26:15 +01:00
Nik Everett 80e29a47d8
Fix a sneaky bug in rare_terms (#51868) (#51959)
When the `rare_terms` aggregation contained another aggregation it'd
break them. Most of the time. This happened because the process that it
uses to remove buckets that turn out not to be rare was incorrectly
merging results from multiple leaves. This'd cause array index out of
bounds issues. We didn't catch it in the test because the issue doesn't
happen on the very first bucket. And the tests generated data in such a
way that the first bucket always contained the rare terms. Randomizing
the order of the generated data fixed the test so it caught the issue.

Closes #51020
2020-02-05 16:32:55 -05:00
Adrien Grand ad9d2f1922
Move analysis/mappings stats to cluster-stats. (#51875)
Closes #51138
2020-02-05 11:02:25 +01:00
Yannick Welsch b4480bb8a4 Mute LoggingOutputStreamTests (#51917)
Relates #51838
2020-02-05 10:46:45 +01:00
Julie Tibshirani 38ce428831
Create a class to hold field capabilities for one index. (#51844)
Currently, the same class `FieldCapabilities` is used both to represent the
capabilities for one index, and also the merged capabilities across indices. To
help clarify the logic, this PR proposes to create a separate class
`IndexFieldCapabilities` for the capabilities in one index. The refactor will
also help when adding `source_path` information in #49264, since the merged
source path field will have a different structure from the field for a single index.

Individual changes:
* Add a new class IndexFieldCapabilities.
* Remove extra constructor from FieldCapabilities.
* Combine the add and merge methods in FieldCapabilities.Builder.
2020-02-04 11:24:57 -08:00
Maria Ralli 8d3e73b3a0 Add host address to BindTransportException message (#51269)
When bind fails, show the host address in addition to the port. This
helps debugging cases with wrong "network.host" values.

Closes #48001
2020-02-04 17:13:19 +00:00
feifeiiiiiiiiii 337153b29f Throw better exception on wrong `dynamic_templates` syntax (#51783)
Currently, a mappings update request, where dynamic_mappings is an object
instead of an array, results in a http response with a 500 code. This PR checks
for this condition and throws a MapperParsingException like we do for other
malformed mapping cases.

Closes #51486
2020-02-04 17:01:55 +01:00
Henning Andersen 41552359a2 Increase master disruption test assert timeouts (#51810)
After #51803, the timeouts waiting for assertions around master change
were too short.
2020-02-03 15:51:33 +01:00
Henning Andersen 1800b2730f Fix completeWith exception handling (#51734)
ActionListener.completeWith would catch exceptions from
listener.onResponse and deliver them to lister.onFailure, essentially
double notifying the listener. Instead we now assert that listeners do
not throw when using ActionListener.completeWith.

Relates #50886
2020-02-03 14:22:55 +01:00
Przemyslaw Gomulka a6d24d6a46
Fix ingest timezone logic backport(#51215) (#51802)
when a timezone is not provided Ingest logic should consider a time to be in a timezone provided as a parameter.
When a timezone is provided Ingest should recalculate a time to the timezone provided as a parameter

closes #51108
backport(#51215)
2020-02-03 14:17:43 +01:00
Henning Andersen 918dfaff1f Increase disruption test publish timeout to 5s (#51803)
With the new mechanism for storing cluster state in lucene, we store
index metadata in multiple data paths too. This causes cluster state
publish to timeout too frequently with a 1s timeout, so increasing it to
5s. Also increasing follower check timeout to 5s since it also sometimes
has fsync in its timeout path and leader check for symmetry.

Closes #51329
2020-02-03 13:57:57 +01:00
Ryan Ernst 21224caeaf Remove comparison to true for booleans (#51723)
While we use `== false` as a more visible form of boolean negation
(instead of `!`), the true case is implied and the true value does not
need to explicitly checked. This commit converts cases that have slipped
into the code checking for `== true`.
2020-01-31 16:35:43 -08:00
Ryan Ernst 61622c4f0c Fix LoggingOutputStream to work on windows (#51779)
LoggingOutputStream reads a stream and breaks on newlines. This commit
fixes the behavior to account for windows newlines also containing `\r`.

closes #51532
2020-01-31 16:30:10 -08:00
Noor 70bb7c862d Fixed typo in comment (#51745)
Comment said the supporter highlighter type was fvj, there's no such highlighter. It is supposed to say fvh for fast vector highlighting.
2020-01-31 10:30:21 -08:00
Mayya Sharipova 42b885f050
Upgrade to lucene-8.5.0-snapshot-3333ce7da6d (#51749)
Backport for #51327
2020-01-31 11:20:15 -05:00
David Turner 39a3a950de Simplify rebalancer's weight function (#51632)
This commit inlines the `weightShardAdded` and `weightShardRemoved` methods
from the `BalancedShardsAllocator#WeightFunction` that respectively add and
subtract 1 (±ε) from the result of `weight`. It then follows up with a number
of simplifications that this inlining enables.

As a side-effect it also somewhat reduces the number of calls to canRebalance
and canAllocate during rebalancing when there are multiple shards of the same
index on a node that is heavier than average.
2020-01-31 14:40:23 +00:00
Christoph Büscher 86f3b47299
Make `date_range` query rounding consistent with `date` (#50237) (#51741)
Currently the rounding used in range queries can behave differently for `date`
and `date_range` as explained in #50009. The behaviour on `date` fields is
the one we document in https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-range-query.html#range-query-date-math-rounding.
This change adapts the rounding behaviour for RangeType.DATE so it uses the
same logic as the `date` for the `date_range` type.

Backport of #50237
2020-01-31 15:35:05 +01:00
Dominic Page d7e1215e42
Backport of #50737 to 7.x (#51662)
* Refactor GeoShape tests to GeoShape and GeoPoint (#50737)

Backport to 7.x
2020-01-31 11:55:10 +01:00
Adrien Grand 915a931e93
Bucket aggregation circuit breaker optimization. (#46751) (#51730)
Co-authored-by: Howard <danielhuang@tencent.com>
2020-01-31 11:30:51 +01:00
Henning Andersen 282ae8fd8c Increase log level for failing AbstractDisruptionIT tests (#51462)
Increase log level for two failing tests to include trace logging for
PersistedClusterStateService.

Relates #51329
2020-01-31 08:30:09 +01:00
Ioannis Kakavas 46ffc57abe Fix test compilation error 2020-01-31 09:23:02 +02:00
David Turner 72ae0ca73f Log exceptions in TcpTransport at DEBUG level (#51612)
When running Elasticsearch on a flaky network, we may see nodes leaving the
cluster with reason `disconnected`. It may be useful to the cluster
administrator to see the full exception that caused the disconnection, but this
is only available with `TRACE` level logging which commingles the details of
the problem with other messages that are not useful to end users.

This commit promotes logging of exceptions in `TcpTransport` from `TRACE` to
`DEBUG` to separate them from the truly `TRACE`-level messages.
2020-01-31 01:36:58 +00:00
Gordon Brown 10c8179351
Use exclusions list instead of fake system indices (#51586)
This commit switches the strategy for managing dot-prefixed indices that
should be hidden indices from using "fake" system indices to an explicit
exclusions list that must be updated when those indices are converted to
hidden indices.
2020-01-30 16:31:27 -07:00
Armin Braun 9c7a63214c
Fix InternalEngineTests.testSeqNoAndCheckpoints (#51630) (#51671)
* Fix InternalEngineTests.testSeqNoAndCheckpoints

If we force flush while possibly triggering a merge the local checkpoint may change
from the expectation from the loop that just increments on every operation.

Closes #51604
2020-01-30 15:42:48 +01:00
Nhat Nguyen f0fad5b622
Deprecate translog retention settings (#51588) (#51638)
This change deprecates the translog retention settings as they are
effectively ignored since 7.4.

Relates #50775
Relates #45473
2020-01-30 09:03:10 -05:00
Armin Braun 1064009e9d
Allow Parallel Snapshot Restore And Delete (#51608) (#51666)
There is no reason not to allow deletes in parallel to restores
if they're dealing with different snapshots.
A delete will not remove any files related to the snapshot that
is being restored if it is different from the deleted snapshot
because those files will still be referenced by the restoring
snapshot.
Loading RepositoryData concurrently to modifying it is concurrency
safe nowadays as well since the repo generation is tracked in the
cluster state.

Closes #41463
2020-01-30 14:27:05 +01:00
Henning Andersen 2e8a2c4baf
Fix ActionListener.map exception handling (#50886) (#51642)
ActionListener.map would call listener.onFailure for exceptions from
listener.onResponse, but this means we could double trigger some
listeners which is generally unexpected. Instead, we should assume that
a listener's onResponse (and onFailure) implementation is responsible
for its own exception handling.
2020-01-30 12:54:55 +01:00
Ryan Ernst cf5a2269a5 Fix stderr to also be captured by log4j (#51569)
In #50259 we redirected stdout and stderr to log4j, to capture jdk
and external library messages. However, a typo in the method name used
to redirect the stream in java means stdout is currently being
duplicated twice, and stderr not captured. This commit corrects that
mistake. Unfortunately this is at a level that cannot really be tested,
thus we are still missing tests for this behavior.
2020-01-29 16:37:56 -08:00
Julie Tibshirani 40f4f2d267 Avoid processing search profile results twice. (#51575)
Just a small clean-up, not motivated by performance.
2020-01-29 14:37:39 -08:00
Nhat Nguyen 316fba0c67 Ensure warm up engine in testTranslogReplayWithFailure
We need to warm up the engine (i.e., perform an external refresh) before
accessing the external refresh. Note that we refresh externally before
allowing reading from a shard.

Relates #48605
Closes #51548
2020-01-28 21:49:21 -05:00
Jason Tedor b080237837
Ignore virtual ethernet devices that disappear (#51581)
When checking if a device is up, today we can run into virtual ethernet
devices that disappear while we are in the middle of checking. This
leads to "no such device". This commit addresses such devices by
treating them as not being up, if they are virtual ethernet devices that
disappeared while we were checking.
2020-01-28 18:44:36 -05:00
Jim Ferenczi 77f4aafaa2 Expose the logic to cancel task when the rest channel is closed (#51423)
This commit moves the logic that cancels search requests when the rest channel is closed
to a generic client that can be used by other APIs. This will be useful for any rest action
that wants to cancel the execution of a task if the underlying rest channel is closed by the
client before completion.

Relates #49931
Relates #50990
Relates #50990
2020-01-28 22:55:42 +01:00
Armin Braun aae93a7578
Allow Repository Plugins to Filter Metadata on Create (#51472) (#51542)
* Allow Repository Plugins to Filter Metadata on Create

Add a hook that allows repository plugins to filter the repository metadata
before it gets written to the cluster state.
2020-01-28 18:33:26 +01:00
David Roberts b8adb59e4a [TEST] Mute ReloadSecureSettingsIT.testReloadAllNodesWithPasswordWithoutTLSFails
Due to https://github.com/elastic/elasticsearch/issues/51546
2020-01-28 17:14:36 +00:00
Gordon Brown 89c2834b24
Deprecate creation of dot-prefixed index names except for hidden and system indices (#49959)
This commit deprecates the creation of dot-prefixed index names (e.g.
.watches) unless they are either 1) a hidden index, or 2) registered by
a plugin that extends SystemIndexPlugin. This is the first step
towards more thorough protections for system indices.

This commit also modifies several plugins which use dot-prefixed indices
to register indices they own as system indices, and adds a plugin to
register .tasks as a system index.
2020-01-28 10:01:16 -07:00
William Brafford 9efa5be60e
Password-protected Keystore Feature Branch PR (#51123) (#51510)
* Reload secure settings with password (#43197)

If a password is not set, we assume an empty string to be
compatible with previous behavior.
Only allow the reload to be broadcast to other nodes if TLS is
enabled for the transport layer.

* Add passphrase support to elasticsearch-keystore (#38498)

This change adds support for keystore passphrases to all subcommands
of the elasticsearch-keystore cli tool and adds a subcommand for
changing the passphrase of an existing keystore.
The work to read the passphrase in Elasticsearch when
loading, which will be addressed in a different PR.

Subcommands of elasticsearch-keystore can handle (open and create)
passphrase protected keystores

When reading a keystore, a user is only prompted for a passphrase
only if the keystore is passphrase protected.

When creating a keystore, a user is allowed (default behavior) to create one with an
empty passphrase

Passphrase can be set to be empty when changing/setting it for an
existing keystore

Relates to: #32691
Supersedes: #37472

* Restore behavior for force parameter (#44847)

Turns out that the behavior of `-f` for the add and add-file sub
commands where it would also forcibly create the keystore if it
didn't exist, was by design - although undocumented.
This change restores that behavior auto-creating a keystore that
is not password protected if the force flag is used. The force
OptionSpec is moved to the BaseKeyStoreCommand as we will presumably
want to maintain the same behavior in any other command that takes
a force option.

*  Handle pwd protected keystores in all CLI tools  (#45289)

This change ensures that `elasticsearch-setup-passwords` and
`elasticsearch-saml-metadata` can handle a password protected
elasticsearch.keystore.
For setup passwords the user would be prompted to add the
elasticsearch keystore password upon running the tool. There is no
option to pass the password as a parameter as we assume the user is
present in order to enter the desired passwords for the built-in
users.
For saml-metadata, we prompt for the keystore password at all times
even though we'd only need to read something from the keystore when
there is a signing or encryption configuration.

* Modify docs for setup passwords and saml metadata cli (#45797)

Adds a sentence in the documentation of `elasticsearch-setup-passwords`
and `elasticsearch-saml-metadata` to describe that users would be
prompted for the keystore's password when running these CLI tools,
when the keystore is password protected.

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* Elasticsearch keystore passphrase for startup scripts (#44775)

This commit allows a user to provide a keystore password on Elasticsearch
startup, but only prompts when the keystore exists and is encrypted.

The entrypoint in Java code is standard input. When the Bootstrap class is
checking for secure keystore settings, it checks whether or not the keystore
is encrypted. If so, we read one line from standard input and use this as the
password. For simplicity's sake, we allow a maximum passphrase length of 128
characters. (This is an arbitrary limit and could be increased or eliminated.
It is also enforced in the keystore tools, so that a user can't create a
password that's too long to enter at startup.)

In order to provide a password on standard input, we have to account for four
different ways of starting Elasticsearch: the bash startup script, the Windows
batch startup script, systemd startup, and docker startup. We use wrapper
scripts to reduce systemd and docker to the bash case: in both cases, a
wrapper script can read a passphrase from the filesystem and pass it to the
bash script.

In order to simplify testing the need for a passphrase, I have added a
has-passwd command to the keystore tool. This command can run silently, and
exit with status 0 when the keystore has a password. It exits with status 1 if
the keystore doesn't exist or exists and is unencrypted.

A good deal of the code-change in this commit has to do with refactoring
packaging tests to cleanly use the same tests for both the "archive" and the
"package" cases. This required not only moving tests around, but also adding
some convenience methods for an abstraction layer over distribution-specific
commands.

* Adjust docs for password protected keystore (#45054)

This commit adds relevant parts in the elasticsearch-keystore
sub-commands reference docs and in the reload secure settings API
doc.

* Fix failing Keystore Passphrase test for feature branch (#50154)

One problem with the passphrase-from-file tests, as written, is that
they would leave a SystemD environment variable set when they failed,
and this setting would cause elasticsearch startup to fail for other
tests as well. By using a try-finally, I hope that these tests will fail
more gracefully.

It appears that our Fedora and Ubuntu environments may be configured to
store journald information under /var rather than under /run, so that it
will persist between boots. Our destructive tests that read from the
journal need to account for this in order to avoid trying to limit the
output we check in tests.

* Run keystore management tests on docker distros (#50610)

* Add Docker handling to PackagingTestCase

Keystore tests need to be able to run in the Docker case. We can do this
by using a DockerShell instead of a plain Shell when Docker is running.

* Improve ES startup check for docker

Previously we were checking truncated output for the packaged JDK as
an indication that Elasticsearch had started. With new preliminary
password checks, we might get a false positive from ES keystore
commands, so we have to check specifically that the Elasticsearch
class from the Bootstrap package is what's running.

* Test password-protected keystore with Docker (#50803)

This commit adds two tests for the case where we mount a
password-protected keystore into a Docker container and provide a
password via a Docker environment variable.

We also fix a logging bug where we were logging the identifier for an
array of strings rather than the contents of that array.

* Add documentation for keystore startup prompting (#50821)

When a keystore is password-protected, Elasticsearch will prompt at
startup. This commit adds documentation for this prompt for the archive,
systemd, and Docker cases.

Co-authored-by: Lisa Cawley <lcawley@elastic.co>

* Warn when unable to upgrade keystore on debian (#51011)

For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This
commit brings the same logic to the code for Debian packages. See the
posttrans file for gets executed for RPMs.

* Restore handling of string input

Adds tests that were mistakenly removed. One of these tests proved
we were not handling the the stdin (-x) option correctly when no
input was added. This commit restores the original approach of
reading stdin one char at a time until there is no more (-1, \r, \n)
instead of using readline() that might return null

* Apply spotless reformatting

* Use '--since' flag to get recent journal messages

When we get Elasticsearch logs from journald, we want to fetch only log
messages from the last run. There are two reasons for this. First, if
there are many logs, we might get a string that's too large for our
utility methods. Second, when we're looking for a specific message or
error, we almost certainly want to look only at messages from the last
execution.

Previously, we've been trying to do this by clearing out the physical
files under the journald process. But there seems to be some contention
over these directories: if journald writes a log file in between when
our deletion command deletes the file and when it deletes the log
directory, the deletion will fail.

It seems to me that we might be able to use journald's "--since" flag to
retrieve only log messages from the last run, and that this might be
less likely to fail due to race conditions in file deletion.

Unfortunately, it looks as if the "--since" flag has a granularity of
one-second. I've added a two-second sleep to make sure that there's a
sufficient gap between the test that will read from journald and the
test before it.

* Use new journald wrapper pattern

* Update version added in secure settings request

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
2020-01-28 05:32:32 -05:00
Tal Levy fc2d875c9f Fix geogrid with bounds test edge cases (#51118)
This commit modifies the bounding box for geogrid unit tests
to only consider bounding boxes that have significant longitudinal
width and whose coordinates are normalized to quantized space

Closes #51103.
2020-01-27 12:16:18 -08:00
Nik Everett 4ff314a9d5
Begin moving date_histogram to offset rounding (take two) (#51271) (#51495)
We added a new rounding in #50609 that handles offsets to the start and
end of the rounding so that we could support `offset` in the `composite`
aggregation. This starts moving `date_histogram` to that new offset.

This is a redo of #50873 with more integration tests.

This reverts commit d114c9db3e1d1a766f9f48f846eed0466125ce83.
2020-01-27 13:40:54 -05:00
Nik Everett faf6bb27be
Support time_zone on composite's date_histogram (#51172) (#51491)
We've been parsing the `time_zone` parameter on `date_hitogram` for a
while but it hasn't *done* anything. This wires it up.

Closes #45199
Inspired by #45200
2020-01-27 12:44:48 -05:00
Armin Braun 2eeea21d84
Use Consistent ClusterState throughout Snapshot API Calls (#51464) (#51471)
We shouldn't be using potentially changing versions of the cluster state
when answering a snapshot status API call by calling `SnapshotService#currentSnapshots` multiple times (each time using `ClusterService#state` under the hood) but instead pass down the state from the transport action.
Having these API behave more in a more deterministic way will make it easier to use them once parallel repository operations
are introduced.
2020-01-27 13:28:17 +01:00
Ryan Ernst fbfc5a327c Fix checkstyle and test for logging output stream
Commit a156629b didn't quite fix the botched backport. This commit does.
2020-01-25 14:13:47 -08:00
Ryan Ernst a156629b4a Fix backport compilation
Java 8 doesn't have a PrintStream ctor which takes a Charset object,
only a charset name. This fixes that.
2020-01-25 11:13:33 -08:00
Ryan Ernst a564cac7ba Capture stdout and stderr to log4j log (#50259)
This commit overrides the stdout and stderr print streams to be
redirected to the main elasticsearch.log file. While the Elasticsearch
project ensures stdout and stderr are not written to, the jdk or 3rd
party libs may do this, which can be unexepected for users used to
looking the elasticsearch log.

closes #50156
2020-01-25 11:00:01 -08:00
Armin Braun ef94a5863a
Fix RareClusterStateIT Cancelling Publication too Early (#51429) (#51434)
Wait for the cluster to have settled down and have the same accepted version on all nodes before
executing and cancelling request so that a slow CS accept on one node doesn't make it fall behind
and then get sent the full CS because of the diff-version mismatch, breaking the mechanics of this test.

Closes #51308
2020-01-24 20:33:45 +01:00
Armin Braun af1ff52e70
Fix TransportMasterNodeAction not Retrying NodeClosedException (#51325) (#51437)
Added node closed exception to the retryable remote exceptions as it's possible to run into this exception instead of a connect exception when the master node is just shutting down but still responding to requests.
2020-01-24 20:33:13 +01:00
Armin Braun f0d8c785e3
Fix Inconsistent Shard Failure Count in Failed Snapshots (#51416) (#51426)
* Fix Inconsistent Shard Failure Count in Failed Snapshots

This fix was necessary to allow for the below test enhancement:
We were not adding shard failure entries to a failed snapshot for those
snapshot entries that were never attempted because the snapshot failed during
the init stage and wasn't partial. This caused the never attempted snapshots
to be counted towards the successful shard count which seems wrong and
broke repository consistency tests.

Also, this change adjusts snapshot resiliency tests to run another snapshot
at the end of each test run to guarantee a correct `index.latest` blob exists
after each run.

Closes #47550
2020-01-24 18:20:47 +01:00
Ryan Ernst 144e037941 Ignore order of indices in hidden indices test (#51383)
The order indices are returned in in the metadata is not guaranteed.
This commit accounts for any possible ordering in assertions about
hidden indices.

closes #51340
2020-01-23 15:49:32 -08:00
Nhat Nguyen 1ca5dd13de
Flush instead of synced-flush inactive shards (#51365)
If all nodes are on 7.6, we prefer to perform a normal flush 
instead of synced flush when a shard becomes inactive.

Backport of #49126
2020-01-23 17:20:57 -05:00
David Turner 0152c40724
Log when probe succeeds but full connection fails (#51304) (#51357)
It is permitted for nodes to accept transport connections at addresses other
than their publish address, which allows a good deal of flexibility when
configuring discovery. However, it is not unusual for users to misconfigure
nodes to pick a publish address which is inaccessible to other nodes. We see
this happen a lot if the nodes are on different networks separated by a proxy,
or if the nodes are running in Docker with the wrong kind of network config.

In this case we offer no useful feedback to the user unless they enable
TRACE-level logs. It's particularly tricky to diagnose because if we test
connectivity between the nodes (using their discovery addresses) then all will
appear well.

This commit adds a WARN-level log if this kind of misconfiguration is detected:
the probe connection has succeeded (to indicate that we are really talking to a
healthy Elasticsearch node) but the followup connection attempt fails.

It also tidies up some loose ends in `HandshakingTransportAddressConnector`,
removing some TODOs that need not be completed, and registering its
accidentally-unregistered timeout settings.
2020-01-23 17:27:59 +00:00
Nhat Nguyen acf84b68cb Do not wrap soft-deletes reader for segment stats (#51331)
IndexWriter might not filter out fully deleted segments if retention
leases exist or the number of the retaining operations is non-zero.
SoftDeletesDirectoryReaderWrapper, however, always filters out fully
deleted segments.

This change uses the original directory reader when calculating segment
stats instead.

Relates #51192
Closes #51303
2020-01-23 08:43:06 -05:00
Armin Braun 4e8ab43a3e
Simplify Snapshot Initialization (#51256) (#51344)
We were loading `RepositoryData` twice during snapshot initialization,
redundantly checking if a snapshot existed already.
The first snapshot existence check is somewhat redundant because a snapshot could be
created between loading `RepositoryData` and updating the cluster state with the `INIT`
state snapshot entry.
Also, it is much safer to do the subsequent checks for index existence in the repo and
and the presence of old version snapshots once the `INIT` state entry prevents further
snapshots from being created concurrently.
While the current state of things will never lead to corruption on a concurrent snapshot
creation, it could result in a situation (though unlikely) where all the snapshot's work
is done on the data nodes, only to find out that the repository generation was off during
snapshot finalization, failing there and leaving a bunch of dead data in the repository
that won't be used in a subsequent snapshot (because the shard generation was never referenced
due to the failed snapshot finalization).

Note: This is a step on the way to parallel repository operations by making snapshot related CS
and repo related CS more tightly correlated.
2020-01-23 14:29:35 +01:00
Henning Andersen 8c8d0dbacc Revert "Workaround for JDK 14 EA FileChannel.map issue (#50523)" (#51323)
This reverts commit c7fd24ca1569a809b499caf34077599e463bb8d6.

Now that JDK-8236582 is fixed in JDK 14 EA, we can revert the workaround.

Relates #50523 and #50512
2020-01-23 11:27:02 +01:00
Nhat Nguyen 157b352b47 Exclude nested documents in LuceneChangesSnapshot (#51279)
LuceneChangesSnapshot can be slow if nested documents are heavily used.
Also, it estimates the number of operations to be recovered in peer
recoveries inaccurately. With this change, we prefer excluding the
nested non-root documents in a Lucene query instead.
2020-01-22 17:33:28 -05:00
Nirmal Chidambaram f193e527a1 Short circuited to MatchNone for non-participating slice (#51207)
In case of numSlices = numShards, use a MatchNone query
instead of boolean with a MatchNone - MUST clause

Backport for #51207
2020-01-22 17:04:53 -05:00
Igor Motov 08e9c673e5 Fix leftover mentions of method parameter in Percentile Aggs (#51272)
The method parameter is not used in the percentile aggs, instead
the method is determined by the presence of `hdr` or `tdigest`
objects.

Relates to #8324
2020-01-22 10:03:35 -05:00
Andrei Dan 123266714b
ILM wait for active shards on rolled index in a separate step (#50718) (#51296)
After we rollover the index we wait for the configured number of shards for the
rolled index to become active (based on the index.write.wait_for_active_shards
setting which might be present in a template, or otherwise in the default case,
for the primaries to become active).
This wait might be long due to disk watermarks being tripped, replicas not
being able to spring to life due to cluster nodes reconfiguration and others
and, the RolloverStep might not complete successfully due to this inherent
transient situation, albeit the rolled index having been created.

(cherry picked from commit 457a92fb4c68c55976cc3c3e2f00a053dd2eac70)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-22 11:01:52 +00:00
Armin Braun 7b4c2bfdc4
Fix Overly Optimistic Request Deduplication (#51270) (#51291)
On master failover we have to resent all the shard failed messages,
but the transport requests remain the same in the eyes of `equals`.
If the master failover is registered and the requests to the new master
are sent before all the callbacks have executed and the request to the
old master removed from the deduplicator then the requuests to the new
master will incorrectly fail and the snapshot get stuck.

Closes #51253
2020-01-22 11:38:40 +01:00
Stuart Tettemer 41c15b438d
Scripting: Add char position of script errors (#51069) (#51266)
Add the character position of a scripting error to error responses.

The contents of the `position` field are experimental and subject to
change.  Currently, `offset` refers to the character location where the
error was encountered, `start` and `end` define a range of characters
that contain the error.

eg.
```
{
  "error": {
    "root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "y = x;",
          "     ^---- HERE"
        ],
        "script": "def x = new ArrayList(); Map y = x;",
        "lang": "painless",
        "position": {
          "offset": 33,
          "start": 29,
          "end": 35
        }
      }
```

Refs: #50993
2020-01-21 13:45:59 -07:00
Nik Everett ca15a3f5a8
Add "did you mean" to unknown queries (#51177) (#51254)
This replaces the message we return for unknown queries with the standard
one that we use for unknown fields from `ObjectParser`. This is nice
because it includes "did you mean". One day we might convert parsing
queries to using object parser, but that looks complex. This change is
much smaller and seems useful.
2020-01-21 12:45:52 -05:00
Paul Sanwald d186e470ff version bump for 7.5.3 2020-01-21 10:59:55 -05:00
Nik Everett 788836ea3f
Revert "Begin moving date_histogram to offset rounding (backport of #50873) (#50978)" (#51239)
This reverts commit 9a3d4db840. It was
subtly broken in ways we didn't have tests for.
2020-01-21 08:50:02 -05:00
Przemyslaw Gomulka 0513d8dca3
Add SPI jvm option to SystemJvmOptions (#50916)
Adding back accidentally removed jvm option that is required to enforce
start of the week = Monday in IsoCalendarDataProvider.
Adding a `feature` to yml test in order to skip running it in JDK8

commit that removed it 398c802
commit that backports SystemJvmOptions c4fbda3
relates 7.x backport of code that enforces CalendarDataProvider use #48349
2020-01-21 09:02:21 +01:00
Nhat Nguyen 43ed244a04
Account soft-deletes in FrozenEngine (#51192) (#51229)
Currently, we do not exclude soft-deleted documents when opening index
reader in the FrozenEngine.

Backport of #51192
2020-01-20 17:07:29 -05:00
zacharymorn dc02458dd6 Exclude unmapped fields from query max_clause limit (#49523)
Take into account of number of unmapped fields when calculating against limit.
Closes #49002
2020-01-20 13:44:33 +01:00
Armin Braun 694b8ab95d
Fix CorruptedBlobStoreRepository Test (#51128) (#51186)
The tests, when creating broken serialized blobs could randomly create
a sequence of bytes that is partially readable by the deserializer and then
not throw `IOException` but instead `ElasticsearchParseException`.
We should just handle these unexpected exceptions downstream properly and pass them
wrapped as `RepositoryException` to the listener to fix the test and keep the API consistent.
2020-01-18 14:12:55 +01:00
Jay Modi 107989df3e
Introduce hidden indices (#51164)
This change introduces a new feature for indices so that they can be
hidden from wildcard expansion. The feature is referred to as hidden
indices. An index can be marked hidden through the use of an index
setting, `index.hidden`, at creation time. One primary use case for
this feature is to have a construct that fits indices that are created
by the stack that contain data used for display to the user and/or
intended for querying by the user. The desire to keep them hidden is
to avoid confusing users when searching all of the data they have
indexed and getting results returned from indices created by the
system.

Hidden indices have the following properties:
* API calls for all indices (empty indices array, _all, or *) will not
  return hidden indices by default.
* Wildcard expansion will not return hidden indices by default unless
  the wildcard pattern begins with a `.`. This behavior is similar to
  shell expansion of wildcards.
* REST API calls can enable the expansion of wildcards to hidden
  indices with the `expand_wildcards` parameter. To expand wildcards
  to hidden indices, use the value `hidden` in conjunction with `open`
  and/or `closed`.
* Creation of a hidden index will ignore global index templates. A
  global index template is one with a match-all pattern.
* Index templates can make an index hidden, with the exception of a
  global index template.
* Accessing a hidden index directly requires no additional parameters.

Backport of #50452
2020-01-17 10:09:01 -07:00
Nik Everett 5299664ae3
"did you mean" for ObjectParser with top named (#51018) (#51165)
When you declare an ObjectParser with top level named objects like we do
with `significant_terms` we didn't support "did you mean". This fixes
that.

Relates #50938
2020-01-17 12:00:03 -05:00
Armin Braun e51b209dd3
Fix Infinite Retry Loop in loading RepositoryData (#50987) (#51093)
* Fix Infinite Retry Loop in loading RepositoryData

We were running into an infinite loop when trying to load corrupted (or otherwise un-loadable)
repository data for a repo that uses best effort consistency (e.g. that was just freshly mounted
as done in the test) because we kepy resetting to `-1` on `IOException`, listing and finding the broken
generation `N` and then interpreted the subsequent reset to `-1` as a concurrent change to the repository.
2020-01-16 21:08:35 +01:00
Nik Everett f6c89b4599
Move test of custom sig heuristic to plugin (#50891) (#51067)
This moves the testing of custom significance heuristic plugins from an
`ESIntegTestCase` to an example plugin. This is *much* more "real" and
can be used as an example for anyone that needs to actually build such a
plugin. The old test had testing concerns and the example all jumbled
together.
2020-01-16 14:49:12 -05:00
Zachary Tong e2ca93bad3 Mute GeoGridAggregatorTestCase#testBounds()
Tracking issue: https://github.com/elastic/elasticsearch/issues/51103
2020-01-16 10:28:10 -05:00
Marios Trivyzas fda25ed04a
Fix caching for PreConfiguredTokenFilter (#50912) (#51091)
The PreConfiguredTokenFilter#singletonWithVersion uses the version
internally for the token filter factories but it registers only one
instance in the cache and not one instance per version. This can lead
to exceptions like the one described in #50734 since the singleton is
created and cached using the version created of the first index
that is processed.

Remove the singletonWithVersion() methods and use the
elasticsearchVersion() methods instead.

Fixes: #50734
(cherry picked from commit 24e1858)
2020-01-16 13:58:02 +01:00
Martijn van Groningen 02dfd71efa
Backport: Add pipeline name to ingest metadata (#51050)
Backport: #50467

This commit adds the name of the current pipeline to ingest metadata.
This pipeline name is accessible under the following key: '_ingest.pipeline'.

Example usage in pipeline:
PUT /_ingest/pipeline/2
{
    "processors": [
        {
            "set": {
                "field": "pipeline_name",
                "value": "{{_ingest.pipeline}}"
            }
        }
    ]
}

Closes #42106
2020-01-16 10:50:47 +01:00
Zachary Tong 9eab87062c Bump version to 7.7.0 2020-01-15 10:12:14 -05:00
Yannick Welsch dc47b380c8 Block too many concurrent mapping updates (#51038)
Ensures that there are not too many concurrent dynamic mapping updates going out from the
data nodes to the master.

Closes #50670
2020-01-15 16:02:11 +01:00
Alan Woodward 0257de8c26
Emit warnings when index templates have multiple mappings (#50982)
Index templates created in the 5x line can still be present in the cluster state
through multiple upgrades, and may have more than one mapping defined. 8x
will stop supporting templates with multiple mappings, and we should emit
deprecation warnings in 7x clusters to give users a chance to update their
templates before upgrading.
2020-01-15 11:59:27 +00:00
Nik Everett fc5fde7950
Add "did you mean" to ObjectParser (#50938) (#50985)
Check it out:
```
$ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{
  "dac": {}
}'

{
  "error" : {
    "root_cause" : [
      {
        "type" : "x_content_parse_exception",
        "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
  },
  "status" : 400
}
```

The tricky thing about implementing this is that x-content doesn't
depend on Lucene. So this works by creating an extension point for the
error message using SPI. Elasticsearch's server module provides the
"spell checking" implementation.
s
2020-01-14 17:53:41 -05:00
Yannick Welsch 4b0581f182 Remove custom metadata tool (#50813)
Adds a command-line tool to remove broken custom metadata from the cluster state.

Relates to #48701
2020-01-14 23:08:33 +01:00
Nik Everett a8aca6b2a0
Switch AggregationSpec to ContextParser (#50871) (#50980)
We seem to have settled on the `ContextParser` interface for parsing
stuff, mostly because `ObjectParser` implements it. We don't really
*need* the old `Aggregator.Parser` interface any more because it
duplicates `ContextParser` but with the arguments reversed.

This adds support to `AggregationSpec` to declare aggregation parsers
using `ContextParser`. This should integrate cleanly with
`ObjectParser`. It doesn't drop support for `Aggregator.Parser` or
change the plugin intrface at all so it *should* be safe to backport to
7.x. And we can remove `Aggregator.Parser` in a follow up which is only
targeted to 8.0.
2020-01-14 16:50:52 -05:00
Nik Everett 9a3d4db840
Begin moving date_histogram to offset rounding (backport of #50873) (#50978)
We added a new rounding in #50609 that handles offsets to the start and
end of the rounding so that we could support `offset` in the `composite`
aggregation. This starts moving `date_histogram` to that new offset.
2020-01-14 16:50:27 -05:00
Tal Levy 9ee2e11181
[7.x] Adds support for geo-bounds filtering in geogrid aggregations (#50996)
* Adds support for geo-bounds filtering in geogrid aggregations (#50002)

It is fairly common to filter the geo point candidates in
geohash_grid and geotile_grid aggregations according to some
viewable bounding box. This change introduces the option of
specifying this filter directly in the tiling aggregation.

This is even more relevant to `geo_shape` where the bounds will restrict
the shape to be within the bounds

this optional `bounds` parameter is parsed in an equivalent fashion to
the bounds specified in the geo_bounding_box query.
2020-01-14 11:18:46 -08:00
Armin Braun 16c07472e5
Track Snapshot Version in RepositoryData (#50930) (#50989)
* Track Snapshot Version in RepositoryData (#50930)

Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient.
Follow up to #50853
2020-01-14 18:15:07 +01:00
Tim Brooks 6e7478b846
Allow proxy mode server name to be configured (#50951)
Currently, proxy mode allows a remote cluster connection to be setup by
expecting all open connections to be routed through an intermediate
proxy. The proxy must use some logic to ensure that the connections end
up on the correct remote cluster. One mechanism provided is that the
default distribution TLS implementations will forward the host component
of the configured address to the remote connection using the SNI
extension. This is limiting as it requires that the proxy be configured
in a way that always uses a valid hostname as the proxy address.

Instead, this commit adds an additional setting to allow the server_name
to be configured independently. This allows the proxy address to be
specified as a IP literal, but the server_name specified as an arbitrary
string which still must be a valid hostname. It also decouples the
server_name from the requirement of being a DNS resolvable domain.
2020-01-14 10:57:44 -06:00
Tim Brooks d8510be3d9
Revert "Send cluster name and discovery node in handshake (#48916)" (#50944)
This reverts commit 0645ee88e2.
2020-01-14 09:53:13 -06:00
Yannick Welsch f1c5031766 Fix queuing in AsyncLucenePersistedState (#50958)
The logic in AsyncLucenePersistedState was flawed, unexpectedly queuing up two update tasks
in parallel.
2020-01-14 15:04:28 +01:00
Yannick Welsch 91d7b446a0 Warn on slow metadata performance (#50956)
Has the new cluster state storage layer emit warnings in case metadata performance is very
slow.

Relates #48701
2020-01-14 15:04:28 +01:00
Alan Woodward 8c16725a0d Check for deprecations when analyzers are built (#50908)
Generally speaking, deprecated analysis components in elasticsearch will issue deprecation
warnings when they are first used. However, this means that no warnings are emitted when
indexes are created with deprecated components, and users have to actually index a document
to see warnings. This makes it much harder to see these warnings and act on them at
appropriate times.

This is worse in the case where components throw exceptions on upgrade. In this case, users
will not be aware of a problem until a document is indexed, instead of at index creation time.

This commit adds a new check that pushes an empty string through all user-defined analyzers
and normalizers when an IndexAnalyzers object is built for each index; deprecation warnings
and exceptions are now emitted when indexes are created or opened.

Fixes #42349
2020-01-14 13:52:02 +00:00
Yannick Welsch 22ba759e1f
Move metadata storage to Lucene (#50928)
* Move metadata storage to Lucene (#50907)

Today we split the on-disk cluster metadata across many files: one file for the metadata of each
index, plus one file for the global metadata and another for the manifest. Most metadata updates
only touch a few of these files, but some must write them all. If a node holds a large number of
indices then it's possible its disks are not fast enough to process a complete metadata update before timing out. In severe cases affecting master-eligible nodes this can prevent an election
from succeeding.

This commit uses Lucene as a metadata storage for the cluster state, and is a squashed version
of the following PRs that were targeting a feature branch:

* Introduce Lucene-based metadata persistence (#48733)

This commit introduces `LucenePersistedState` which master-eligible nodes
can use to persist the cluster metadata in a Lucene index rather than in
many separate files.

Relates #48701

* Remove per-index metadata without assigned shards (#49234)

Today on master-eligible nodes we maintain per-index metadata files for every
index. However, we also keep this metadata in the `LucenePersistedState`, and
only use the per-index metadata files for importing dangling indices. However
there is no point in importing a dangling index without any shard data, so we
do not need to maintain these extra files any more.

This commit removes per-index metadata files from nodes which do not hold any
shards of those indices.

Relates #48701

* Use Lucene exclusively for metadata storage (#50144)

This moves metadata persistence to Lucene for all node types. It also reenables BWC and adds
an interoperability layer for upgrades from prior versions.

This commit disables a number of tests related to dangling indices and command-line tools.
Those will be addressed in follow-ups.

Relates #48701

* Add command-line tool support for Lucene-based metadata storage (#50179)

Adds command-line tool support (unsafe-bootstrap, detach-cluster, repurpose, & shard
commands) for the Lucene-based metadata storage.

Relates #48701

* Use single directory for metadata (#50639)

Earlier PRs for #48701 introduced a separate directory for the cluster state. This is not needed
though, and introduces an additional unnecessary cognitive burden to the users.

Co-Authored-By: David Turner <david.turner@elastic.co>

* Add async dangling indices support (#50642)

Adds support for writing out dangling indices in an asynchronous way. Also provides an option to
avoid writing out dangling indices at all.

Relates #48701

* Fold node metadata into new node storage (#50741)

Moves node metadata to uses the new storage mechanism (see #48701) as the authoritative source.

* Write CS asynchronously on data-only nodes (#50782)

Writes cluster states out asynchronously on data-only nodes. The main reason for writing out
the cluster state at all is so that the data-only nodes can snap into a cluster, that they can do a
bit of bootstrap validation and so that the shard recovery tools work.
Cluster states that are written asynchronously have their voting configuration adapted to a non
existing configuration so that these nodes cannot mistakenly become master even if their node
role is changed back and forth.

Relates #48701

* Remove persistent cluster settings tool (#50694)

Adds the elasticsearch-node remove-settings tool to remove persistent settings from the on
disk cluster state in case where it contains incompatible settings that prevent the cluster from
forming.

Relates #48701

* Make cluster state writer resilient to disk issues (#50805)

Adds handling to make the cluster state writer resilient to disk issues. Relates to #48701

* Omit writing global metadata if no change (#50901)

Uses the same optimization for the new cluster state storage layer as the old one, writing global
metadata only when changed. Avoids writing out the global metadata if none of the persistent
fields changed. Speeds up server:integTest by ~10%.

Relates #48701

* DanglingIndicesIT should ensure node removed first (#50896)

These tests occasionally failed because the deletion was submitted before the
restarting node was removed from the cluster, causing the deletion not to be
fully acked. This commit fixes this by checking the restarting node has been
removed from the cluster.

Co-authored-by: David Turner <david.turner@elastic.co>

* fix tests

Co-authored-by: David Turner <david.turner@elastic.co>
2020-01-14 09:35:43 +01:00
Tim Brooks 50cb770315
Use default profile for remote connections (#50947)
Currently, the connection manager is configured with a default profile
for both the sniff and proxy connection stratgies. This profile
correctly reflects the expected number of connection (6 for sniff, 18
for proxy). This commit removes the proxy strategy usages of the per
connection attempt profile configuration.

Additionally, it refactors other unnecessary code around the connection
manager. The connection manager now can always be built inside the
remote connection.
2020-01-13 21:46:23 -06:00
Tim Brooks 27c2eb744e
Fix open/close race in ConnectionManagerTests (#50621)
Currently we reuse the same test connection for all connection attempts
in the testConcurrentConnectsAndDisconnects test. This means that if the
connection fails due to a pre-existing connection, the connection will
be closed impacting the state of all connection attempts. This commit
fixes the test, by returning a unique connection for each attempt.

Fixes #49903.
2020-01-13 18:43:18 -07:00
Nhat Nguyen fb32a55dd5 Deprecate synced flush (#50835)
A normal flush has the same effect as a synced flush on Elasticsearch 
7.6 or later. It's deprecated in 7.6 and will be removed in 8.0.

Relates #50776
2020-01-13 19:54:38 -05:00
Nhat Nguyen 05f97d5e1b Revert "Deprecate synced flush (#50835)"
This reverts commit 1a32d7142a.
2020-01-13 11:41:03 -05:00
Nhat Nguyen 1a32d7142a
Deprecate synced flush (#50835)
A normal flush has the same effect as a synced flush on Elasticsearch 
7.6 or later. It's deprecated in 7.6 and will be removed in 8.0.

Relates #50776
2020-01-13 10:58:29 -05:00
Armin Braun 609b015e3c
Prevent Old Version Clusters From Corrupting Snapshot Repositories (#50853) (#50913)
Follow up to #50692 that starts writing a `min_version` field to
the `RepositoryData` so that pre-7.6 ES versions can not read it
(and potentially corrupt it if they attempt to modify the repo contents)
after the repository moved to the new metadata format.
2020-01-13 15:02:53 +01:00
Christoph Büscher c31a21c3d8
Fix time zone issue in Rounding serialization (#50845)
When deserializing time zones in the Rounding classes we used to include a tiny
normalization step via `DateUtils.of(in.readString())` that was lost in #50609.
Its at least necessary for some tests, e.g. the cause of #50827 is that when
sending the default time zone ZoneOffset.UTC on a stream pre 7.0 we convert it
to a "UTC" string id via `DateUtils.zoneIdToDateTimeZone`. This gets then read
back as a UTC ZoneRegion, which should behave the same but fails the equality
tests in our serialization tests. Reverting to the previous behaviour with an
additional normalization step on 7.x.

Co-authored-by: Nik Everett <nik9000@gmail.com>

Closes #50827
2020-01-13 10:10:15 +01:00
David Turner 456de59698 Fix non-corruption in testCurrentHeaderVersion (#50883)
Today we make multiple attempts to corrupt the translog header in
`TranslogHeaderTests#testCurrentHeaderVersion`, but if we are extraordinarily
unlucky then this sequence of corruptions may restore the file to its original
state. This change adjusts the test to only corrupt the file once, which is
certain not to leave the file in its original state.
2020-01-12 12:38:37 +00:00
Henning Andersen 2e5e5fd483 Fix testSkipRefreshIfShardIsRefreshingAlready (#50856)
The test checked queue size and active count, however,
ThreadPoolExecutor pulls out the request from the queue before marking
the worker active, risking that we think all tasks are done when they
are not. Now check on completed-tasks metric instead, which is
guaranteed to be monotonic.

Relates #50769
2020-01-11 11:21:05 -05:00
Nhat Nguyen f4aabdcd89 Do not force refresh when write indexing buffer (#50769)
Today we periodically check the indexing buffer memory every 5 seconds
or after we have used 1/30 of the configured memory. If the total used
memory is over the threshold, then we refresh the "largest" shards. If
refreshing takes longer these intervals (i.e., 5s or 1/30 buffer), then
we continue to enqueue refreshes to these shards. This leads to two
issues:

- The refresh thread pool can be exhausted and other shards can't refresh
- Execute too many refreshes for the "largest" shards

With this change, we only refresh the largest shards if they are not refreshing.
Here we rely on the periodic check to trigger another refresh if needed. We can
harden this by making the ongoing refresh triggers the memory check when
it's completed. I opted out this option in this PR for simplicity.

See: https://discuss.elastic.co/t/write-queue-continue-to-rise/213652/
2020-01-11 11:21:05 -05:00
Nik Everett e6d0f7df01
Fix format problem in composite of unmapped (#50869) (#50875)
When a composite aggregation is reduced using the results from an
index that has one of the fields unmapped we were throwing away the
formatter. This is mildly annoying, except in the case of IP addresses
which were coming out as non-utf-8-characters. And tripping assertions.

This carefully preserves the formatter from the working bucket.

Closes #50600
2020-01-10 16:18:11 -05:00
Jim Ferenczi 60308cf0b3 Fix upgrade of custom similarity (#50851)
This change fixes the upgrade of index metadata that contain
a custom similarity with options that are not compatible with BM25.
The upgrade doesn't need a real similarity service so we fake one that
resolves all custom similarity to BM25 but this logic fails because the
BM25 provider checks that all options are compatible. This commit removes
the verification step as it is not needed during the upgrade (the verification
is done when the index is restored/opened).

Closes #50763
2020-01-10 18:43:13 +01:00
Armin Braun 7e68989dae
Fix Snapshot Shard Status Request Deduplication (#50788) (#50840)
* Fix Snapshot Shard Status Request Deduplication

The request deduplication didn't actually work for these requests
since they had no `equals` and `hashCode` so the deduplicator wouldn't
actually recognize equal requests.
2020-01-10 11:49:52 +01:00
Christoph Büscher 75cb4e0b69 Muting InternalAggregationsTests.testSerialization 2020-01-10 09:24:09 +01:00
Nik Everett d021071ab9
Move scripted metric to ObjectParser (#50708) (#50811)
This replaces the hand rolled parsing code for scripted metric with
`ObjectParser` which is simpler to work with because it is declarative.
2020-01-09 16:09:21 -05:00
Nik Everett ae40e22452
Drop "funny" functions building parsers (#50715) (#50814)
Replaces the "funny"
`Function<String, ConstructingObjectParser<T, Void>>` with a much
simpler `ConstructingObjectParser<T, String>`. This makes pretty much
all of our object parsers static.
2020-01-09 15:53:03 -05:00
Armin Braun f70e8f6ab5
Fix Snapshot Repository Corruption in Downgrade Scenarios (#50692) (#50797)
* Fix Snapshot Repository Corruption in Downgrade Scenarios (#50692)

This PR introduces test infrastructure for downgrading a cluster while interacting with a given repository.
It fixes the fact that repository metadata in the new format could be written while there's still older snapshots in the repository that require the old-format metadata to be restorable.
2020-01-09 21:21:13 +01:00
Jake Landis de6f132887
[7.x] Foreach processor - fork recursive call (#50514) (#50773)
A very large number of recursive calls can cause a stack overflow
exception. This commit forks the recursive calls for non-async
processors. Once forked, each thread will handle at most 10
recursive calls to help keep the stack size and thread count
down to a reasonable size.
2020-01-09 13:21:18 -06:00
Nik Everett 1d8e51f89d
Support offset in composite aggs (#50609) (#50808)
Adds support for the `offset` parameter to the `date_histogram` source
of composite aggs. The `offset` parameter is supported by the normal
`date_histogram` aggregation and is useful for folks that need to
measure things from, say, 6am one day to 6am the next day.

This is implemented by creating a new `Rounding` that knows how to
handle offsets and delegates to other rounding implementations. That
implementation doesn't fully implement the `Rounding` contract, namely
`nextRoundingValue`. That method isn't used by composite aggs so I can't
be sure that any implementation that I add will be correct. I propose to
leave it throwing `UnsupportedOperationException` until I need it.

Closes #48757
2020-01-09 14:11:24 -05:00
Julie Tibshirani a299aba2f8
Ensure that field collapsing works with field aliases. (#50766)
Previously, the following situation would throw an error:
* A search contains a `collapse` on a particular field.
* The search spans multiple indices, and in one index the field is mapped as a
  concrete field, but in another it is a field alias.

The error occurs when we attempt to merge `CollapseTopFieldDocs` across shards.
When merging, we validate that the name of the collapse field is the same across
shards. But the name has already been resolved to the concrete field name, so it
will be different on shards where the field was mapped as an alias vs. shards
where it was a concrete field.

This PR updates the collapse field name in `CollapseTopFieldDocs` to the
original requested field, so that it will always be consistent across shards.

Note that in #32648, we already made a fix around collapsing on field aliases.
However, we didn't test this specific scenario where the field was mapped as an
alias in only one of the indices being searched.
2020-01-08 14:51:15 -08:00
Christoph Büscher b1b4282273 Make Multiplexer inherit filter chains analysis mode (#50662)
Currently, if an updateable synonym filter is included in a multiplexer filter,
it is not reloaded via the _reload_search_analyzers because the multiplexer
itself doesn't pass on the analysis mode of the filters it contains, so its not
recognized as "updateable" in itself. Instead we can check and merge the
AnalysisMode settings of all filters in the multiplexer and use the resulting
mode (e.g. search-time only) for the multiplexer itself, thus making any synonym
filters contained in it reloadable.  This, of course, will also make the
analyzers using the multiplexer be usable at search-time only.

Closes #50554
2020-01-08 22:12:01 +01:00
Przemyslaw Gomulka e95b0c447f
Allow parsing timezone without fully provided time backport(#50178) (#50740)
strict_date_optional_time changes to have optional minute part.
It already allowed optional second and fraction of second part.
This allows parsing 2018-01-01T00+01 , 2018-01-01T00:00+01 , 2018-01-01T00:00:00+01 , 2018-01-01T00:00:00.000+01
It won't allow parsing a timezone without an hour part as this is not allowed by iso8601 spec

closes #49351
2020-01-08 20:04:57 +01:00
Henning Andersen 125feecabc
Guess root cause support unwrap (#50525) (#50742)
ElasticsearchException.guessRootCauses would return wrapper exception if
inner exception was not an ElasticsearchException. Fixed to never return
wrapper exceptions.

At least following APIs change root_cause.0.type as a result:

_update with bad script
_index with bad pipeline

Relates #50417
2020-01-08 19:09:14 +01:00
Adrien Grand 4f2299c714
Upgrade to Lucene 8.4.0. (#50518) (#50750) 2020-01-08 18:53:59 +01:00
Adrien Grand 31158ab3d5
Add per-field metadata. (#50333)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices, the union of
all field metadatas is returned.

Here is how the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```

Closes #33267
2020-01-08 16:21:18 +01:00
Yannick Welsch f203c2b39d Import replicated closed dangling indices (#50649)
Dangling replicated closed indices are not imported properly (they miss their routing table when imported).
2020-01-08 13:39:20 +01:00
Rory Hunter b1ff74f652 New setting to prevent automatically importing dangling indices (#49174)
Introduce a new static setting, `gateway.auto_import_dangling_indices`, which prevents dangling indices from being automatically imported. Part of #48366.
2020-01-08 13:39:20 +01:00
Tim Vernum 293661d62c
Security should not reload files that haven't changed (#50724)
In security we currently monitor a set of files for changes:

- config/role_mapping.yml (or alternative configured path)
- config/roles.yml
- config/users
- config/users_roles

This commit prevents unnecessary reloading when the file change actually doesn't change the internal structure.

Backport of: #50207

Co-authored-by: Anton Shuvaev <anton.shuvaev91@gmail.com>
2020-01-08 15:13:47 +11:00
Nik Everett deb0991667
Teach ObjectParser a happy pattern (#50691) (#50710)
We *very* commonly have object with ctors like:
```
public Foo(String name)
```

And then declare a bunch of setters on the object. Every aggregation
works like this, for example. This change teaches `ObjectParser` how to
build these aggregations all on its own, without any help. This'll make
it much cleaner to parse aggs, and, probably, a bunch of other things.
It'll let us remove lots of wrapping. I've used this new power for the
`avg` aggregation just to prove that it works outside of a unit test.
2020-01-07 11:57:41 -05:00
Nhat Nguyen c3d207f437 Disable auto refresh in testSegmentsStats (#50689)
If an auto-refresh happens, then version_map_memory is reset to 0. By 
default, the auto-refresh occurs for every second in the first 30
seconds until search becomes idle.

Closes #50362
2020-01-07 10:44:30 -05:00
Hendrik Muhs 98ca9500e8
implement a workaround for remote cluster validation (#50460)
In 7.x an internal API used for validating remote cluster does not throw, see #50420 for the 
details. This change implements a workaround for remote cluster validation, only for 7.x branches.

fixes #50420
2020-01-07 13:51:51 +01:00
Yannick Welsch a2ef0e8830 Check allocation id when failing shard on recovery (#50656)
A failure of a recovering shard can race with a new allocation of the shard, and cause the new
allocation to be failed as well. This can result in a shard being marked as initializing in the cluster
state, but not exist on the node anymore.

Closes #50508
2020-01-07 09:41:28 +01:00
Jay Modi e5191e77e3
Remove unused IndicesOptions#fromByte method (#50683)
This change removes a no longer used method, `fromByte`, in
IndicesOptions. This method was necessary for backwards compatibility
with versions prior to 6.4.0 and was used when talking to those
versions. However, the minimum wire compatibility version has changed
and we no longer use this code.

Backport of #50665
2020-01-06 14:57:10 -07:00
Nik Everett 76bb661023
Replace AggParseContext with a String (backport of #50625) (#50679)
We used to have a *ton* off stuff in the `AggParseContext` but now we
parse aggs entirely with named xcontent. So we don't need the context
any more.
2020-01-06 14:32:03 -05:00
Nhat Nguyen 926c0aa74c Fix testRelocationEstablishedPeerRecoveryRetentionLeases (#50673)
The redNodes are calculated incorrectly.

Closes #50660
2020-01-06 13:32:04 -05:00
Nik Everett f576aefd0f
Replace bespoke parser for significance heuristics (#50623) (#50659)
This replaces the hand written xcontent parsers for significance
heristics with `ObjectParser` and parsing named xcontent.

As a happy accident, this was the last user of `ParseFieldRegistry` so
this PR entirely removes that class.

Closes #25519
2020-01-06 12:57:43 -05:00
Tim Brooks fa57813c6d
Remove races in ProxyConnectionStrategyTests (#50620)
Currently, we use delayed address resolution in the proxy strategy tests
to allow tests to connect to different addresses. Unfortunately, this
has the potential to introduce races as the address is resolved each
connection attempt. The number of connection attempts can vary based on
when connections are opening and closing. This commit modifies the test
be allowing them to specifically control which address is used.

Related to #50618
2020-01-06 10:20:53 -07:00
Martijn van Groningen 7be43e9f6d
Fix ingest stats test bug. (#50653)
This test code fixes a serialization test bug:
https://gradle-enterprise.elastic.co/s/7x2ct6yywkw3o

Rarely stats for the same processor are generated and
the production code then sums up these stats. However
the test code wasn't summing up in that case,
which caused inconsistencies between the actual and expected results.

Closes #50507
2020-01-06 15:37:47 +01:00
Nhat Nguyen b71490b06b
Deprecate indices without soft-deletes (#50502) (#50634)
Soft-deletes will be enabled for all indices in 8.0. Hence, we should
deprecate new indices without soft-deletes in 7.x.

Backport of #50502
2020-01-06 08:44:30 -05:00
Henning Andersen ec0ec61881 Deleted docs disregarded for if_seq_no check (#50526)
Previously, as long as a deleted version value was kept as a tombstone,
another index or delete operation against the same id would leak that
the doc had existed (through seq_no info) or would allow the operation
if the client forged the seq_no. Fixed to disregard info on deleted docs
when doing seq_no based optimistic concurrency check.
2020-01-06 13:54:36 +01:00
Nikita Glashenko 5533e1172c Add tests for remaining IntervalsSourceProvider implementations (#50326)
This PR adds unit tests for wire and xContent serialization of remaining IntervalsSourceProvider
implementations.

Closes #50150
2020-01-06 13:18:53 +01:00
David Turner 66c690922c Collect shard sizes for closed indices (#50645)
Today the `InternalClusterInfoService` collects information on the sizes of
shards of open indices, but does not consider closed indices. This means that
shards of closed indices are treated as having zero size when they are being
allocated. This commit fixes this, obtaining the sizes of all shards.

Relates #33888
2020-01-06 11:44:19 +00:00
Henning Andersen 312bf44601 Workaround for JDK 14 EA FileChannel.map issue (#50523)
FileChannel.map provokes static initialization of ExtendedMapMode in
JDK14 EA, which needs elevated privileges.

Relates #50512
2020-01-06 12:18:49 +01:00
Nik Everett 2362c430cd
Clean up wire test case a bit (#50627) (#50632)
* Adds JavaDoc to `AbstractWireTestCase` and
`AbstractWireSerializingTestCase` so it is more obvious you should prefer
the latter if you have a choice
* Moves the `instanceReader` method out of `AbstractWireTestCase` becaue
it is no longer used.
* Marks a bunch of methods final so it is more obvious which classes are
for what.
* Cleans up the side effects of the above.
2020-01-05 16:20:38 -05:00
Nik Everett 4d58656065
Declare remaining parsers `final` (#50571) (#50615)
We have about 800 `ObjectParsers` in Elasticsearch, about 700 of which
are final. This is *probably* the right way to declare them because in
practice we never mutate them after they are built. And we certainly
don't change the static reference. Anyway, this adds `final` to these
parsers.

I found the non-final parsers with this:
```
diff \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*PARSER\s*=' {} \+ | sort) \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*final.*PARSER\s*=' {} \+ | sort) \
  2>&1 | grep '^<'
```
2020-01-03 11:48:11 -05:00
Andrei Dan 856607b5a6
Guard against null geoBoundingBox (#50506) (#50608)
A geo box with a top value of Double.NEGATIVE_INFINITY will yield an empty
xContent which translates to a null `geoBoundingBox`. This commit marks the
field as `Nullable` and guards against null when retrieving the `topLeft`
and `bottomRight` fields.

Fixes https://github.com/elastic/elasticsearch/issues/50505

(cherry picked from commit 051718f9b1e1ca957229b01e80d7b79d7e727e14)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-03 18:04:26 +02:00
Nik Everett 1abecad21b
Mark some constants in decay functions final (#50569) (#50575)
This marks a couple of constants in the `DecayFunctionBuilder` as final.
They are written in CONSTANT_CASE and used as constants but not final
which is a little confusing and might lead to sneaky bugs.
2020-01-03 10:58:15 -05:00
Henning Andersen 218bd19034
Improve FutureUtils.get exception handling (#50339) (#50417)
FutureUtils.get() would unwrap ElasticsearchWrapperExceptions. This
is trappy, since nearly all usages of FutureUtils.get() expected only to
not have to deal with checked exceptions.

In particular, StepListener builds upon ListenableFuture which uses
FutureUtils.get to be informed about the exception passed to onFailure.
This had the bad consequence of masking away any exception that was an
ElasticsearchWrapperException like RemoteTransportException.
Specifically for recovery, this made CircuitBreakerExceptions happening
on the target node look like they originated from the source node.

The only usage that expected that behaviour was AdapterActionFuture.
The unwrap behaviour has been moved to that class.
2020-01-03 15:28:47 +01:00
kkewwei 5655d6a1c1 Log index name when updating index settings (#49969)
Today we log changes to index settings like this:

    updating [index.setting.blah] from [A] to [B]

The identity of the index whose settings were updated is conspicuously absent
from this message. This commit addresses this by adding the index name to these
messages.

Fixes #49818.
2020-01-03 11:26:29 +00:00
Alan Woodward 8b362c657b Add fuzzy intervals source (#49762)
This intervals source will return terms that are similar to an input term, up to
an edit distance defined by fuzziness, similar to FuzzyQuery.

Closes #49595
2020-01-03 09:59:19 +00:00
Henning Andersen e19585b47f Enhance TransportReplicationAction assertions (#49081)
Include failure into assertion error when replication action discovers
that it has been double triggered.
2020-01-02 19:23:10 +01:00
Oleg 7539fbb30f Deprecate the 'local' parameter of /_cat/nodes (#50499)
The cat nodes API performs a `ClusterStateAction` then a `NodesInfoAction`.
Today it accepts the `?local` parameter and passes this to the
`ClusterStateAction` but this parameter has no effect on the `NodesInfoAction`.
This is surprising, because `GET _cat/nodes?local` looks like it might be a
completely local call but in fact it still depends on every node in the
cluster.

This commit deprecates the `?local` parameter on this API so that it can be
removed in 8.0.

Relates #50088
2020-01-02 14:53:56 +00:00
Nhat Nguyen e7c15a5c6e Ensure relocating shards establish peer recovery retention leases (#50486)
We forgot to establish peer recovery retention leases for relocating primaries
without soft-deletes.

Relates #50351
2019-12-26 13:51:35 -05:00
Nhat Nguyen 7713221733 Fix testCancelRecoveryDuringPhase1 (#50449)
testCancelRecoveryDuringPhase1 uses a mock of IndexShard, which can't
create retention leases. We need to stub method createRetentionLease.

Relates #50351 
Closes #50424
2019-12-26 09:48:58 -05:00
Yannick Welsch f57569bf5c Mute RecoverySourceHandlerTests.testCancelRecoveryDuringPhase1
Relates #50424
2019-12-24 12:13:31 -05:00
Martijn van Groningen 10ed1ae1d2
Add remote info to the HLRC (#50483)
The additional change to the original PR (#49657), is that `org.elasticsearch.client.cluster.RemoteConnectionInfo` now parses the initial_connect_timeout field as a string instead of a TimeValue instance.

The reason that this is needed is because that the initial_connect_timeout field in the remote connection api is serialized for human consumption, but not for parsing purposes.
Therefore the HLRC can't parse it correctly (which caused test failures in CI, but not in the PR CI
:( ). The way this field is serialized needs to be changed in the remote connection api, but that is a breaking change. We should wait making this change until rest api versioning is introduced.

Co-Authored-By: j-bean <anton.shuvaev91@gmail.com>

Co-authored-by: j-bean <anton.shuvaev91@gmail.com>
2019-12-24 15:11:58 +01:00
Nhat Nguyen 33204c2055 Use peer recovery retention leases for indices without soft-deletes (#50351)
Today, the replica allocator uses peer recovery retention leases to
select the best-matched copies when allocating replicas of indices with
soft-deletes. We can employ this mechanism for indices without
soft-deletes because the retaining sequence number of a PRRL is the
persisted global checkpoint (plus one) of that copy. If the primary and
replica have the same retaining sequence number, then we should be able
to perform a noop recovery. The reason is that we must be retaining
translog up to the local checkpoint of the safe commit, which is at most
the global checkpoint of either copy). The only limitation is that we
might not cancel ongoing file-based recoveries with PRRLs for noop
recoveries. We can't make the translog retention policy comply with
PRRLs. We also have this problem with soft-deletes if a PRRL is about to
expire.

Relates #45136
Relates #46959
2019-12-23 22:04:07 -05:00
Tal Levy bed121efaf
[7.x-backport] Centralize BoundingBox logic to a dedicated class (#50469)
Both geo_bounding_box query and geo_bounds aggregation have
a very similar definition of a "bounding box". A lot of this
logic (serialization, xcontent-parsing, etc) can be centralized
instead of having separated efforts to do the same things
2019-12-23 11:21:39 -08:00
Aleksandr Maus d5cec7faa1
Improve SearchHit "equals" implementation for null fields cases (#50327) (#50448)
* Improve SearchHit "equals" implementation for null fields cases
2019-12-23 09:59:07 -05:00
Igor Motov 339d10c16f Geo: Switch generated GeoJson type names to camel case (#50400)
Switches generated GeoJson type names to camel case
to conform to the standard.

Closes #49568
2019-12-20 15:37:22 -05:00
Andrei Dan a3cdbda7c6
Make the TransportRolloverAction execute in one cluster state update (#50388) (#50442)
This commit makes the TransportRolloverAction more resilient, by having it execute
only one cluster state update that creates the new (rollover index), rolls over
the alias from the source to the target index and set the RolloverInfo on the
source index. Before these 3 steps were represented as 3 chained cluster state
updates, which would've seen the user manually intervene if, say, the alias
rollover cluster state update (second in the chain) failed but the creation of
the rollover index (first in the chain) update succeeded

* Rename innerExecute to applyAliasActions

(cherry picked from commit 1ba4339a0c73ef3354b8c8b44b628fc55f1dbc78)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-12-20 18:01:03 +00:00
Nhat Nguyen 975cc99516 Close engine before reset log appender (#50390)
Merge threads can run and access the mock appender after we have stopped it.

Closes #50315
2019-12-20 12:19:03 -05:00
Jim Ferenczi 2acafd4b15
Optimize composite aggregation based on index sorting (#48399) (#50272)
Co-authored-by: Daniel Huang <danielhuang@tencent.com>

This is a spinoff of #48130 that generalizes the proposal to allow early termination with the composite aggregation when leading sources match a prefix or the entire index sort specification.
In such case the composite aggregation can use the index sort natural order to early terminate the collection when it reaches a composite key that is greater than the bottom of the queue.
The optimization is also applicable when a query other than match_all is provided. However the optimization is deactivated for sources that match the index sort in the following cases:
  * Multi-valued source, in such case early termination is not possible.
  * missing_bucket is set to true
2019-12-20 12:32:37 +01:00
Yannick Welsch 4f805deb0c Only auto-expand replicas with allocation filtering when all nodes upgraded (#50361)
Follow-up to #48974 that ensures that replicas are only auto-expanded according to allocation
filtering rules once all nodes are upgraded to a version that supports this. Helps with
orchestrating cluster upgrades.
2019-12-20 11:50:00 +01:00
Yannick Welsch 5f37f1f401 Revert "Only auto-expand replicas with allocation filtering when all nodes upgraded (#50361)"
This reverts commit df4fe73b84.
2019-12-20 11:07:30 +01:00
Hendrik Muhs de14092ad2 [Transform] refactor source and dest validation to support CCS (#50018)
refactors source and dest validation, adds support for CCS, makes resolve work like reindex/search, allow aliased dest index with a single write index.

fixes #49988
fixes #49851
relates #43201
2019-12-20 10:49:53 +01:00
Alan Woodward 3cdc23ec9c
Fix meta version of task index mapping (#50363)
The built-in task index mapping has a version field in its metadata, so that
the TaskResultsService can check to see if it needs to update mappings when
a new task result is stored. #48393 updated this version in TaskResultsService
but omitted to change the version in the mapping itself, so a mapping update
is applied every time a new task result is stored.

This commit updates the mapping version so that it corresponds to the version
in TaskResultsService.
2019-12-20 09:44:47 +00:00
Yannick Welsch df4fe73b84 Only auto-expand replicas with allocation filtering when all nodes upgraded (#50361)
Follow-up to #48974 that ensures that replicas are only auto-expanded according to allocation
filtering rules once all nodes are upgraded to a version that supports this. Helps with
orchestrating cluster upgrades.
2019-12-20 10:22:44 +01:00
Tim Brooks cb73fb0f9b
Backport remote proxy mode stats and naming (#50402)
* Update remote cluster stats to support simple mode (#49961)

Remote cluster stats API currently only returns useful information if
the strategy in use is the SNIFF mode. This PR modifies the API to
provide relevant information if the user is in the SIMPLE mode. This
information is the configured addresses, max socket connections, and
open socket connections.

* Send hostname in SNI header in simple remote mode (#50247)

Currently an intermediate proxy must route conncctions to the
appropriate remote cluster when using simple mode. This commit offers
a additional mechanism for the proxy to route the connections by
including the hostname in the TLS SNI header.

* Rename the remote connection mode simple to proxy (#50291)

This commit renames the simple connection mode to the proxy connection
mode for remote cluster connections. In order to do this, the mode specific
settings which we namespaced by their mode (ex: sniff.seed and
proxy.addresses) have been reverted.

* Modify proxy mode to support a single address (#50391)

Currently, the remote proxy connection mode uses a list setting for the
proxy address. This commit modifies this so that the setting is
proxy_address and only supports a single remote proxy address.
2019-12-19 18:02:48 -07:00
Stuart Tettemer 689df1f28f
Scripting: ScriptFactory not required by compile (#50344) (#50392)
Avoid backwards incompatible changes for 8.x and 7.6 by removing type
restriction on compile and Factory.  Factories may optionally implement
ScriptFactory.  If so, then they can indicate determinism and thus
cacheability.

**Backport**

Relates: #49466
2019-12-19 12:50:25 -07:00
Alan Woodward 1a2e931d6e Reduce the max depth of randomly generated interval queries (#50317)
We randomly generate intervals sources to test serialization and query generation
in IntervalQueryBuilderTests. However, rarely we can generate a query that has
too many nested disjunctions, resulting in query rewrites running afoul of the maximum
boolean clause limit.

This commit reduces the maximum depth of the randomly generated intervals source
to make running into this limit much more unlikely.
2019-12-19 15:12:12 +00:00
Andrei Dan 1e11d23051
Extract a create index method that only manipulates the ClusterState (#50240) (#50328)
* Extract IndexCreationTask execute into applyCreateIndexRequest

This is the first step in preparation for separating the index creation into a few
steps that only deal with the cluster state mutation and removing the IndexCreationTask
altogether.

* Split applyCreateIndexRequest

This breaks down the logic in applyCreateIndexRequest into multiple
steps that will hopefully make the service more readable and unit testable.
The service creation process now goes through a few well defined steps,
namely find the templates that possibly match the new index, parse the
requested and template matching mappings, process the index and template
matching settings, validate the wait for active shards request and
create the `IndexService`, update the mappings in the `MapperService` (which
is grouped together with creating the sort order for validation purposes),
validate the requested and templated matching aliases and finally update
the `ClusterState` to reflect the requested changes.

This also removes the `IndexCreationTask` as it was a shallow
indirection and migrates the tests from `IndexCreationTaskTests` to
`MetaDataCreateIndexServiceTests` (making them "real" unit tests
operating on the `ClusterState` rather than mocks).

* Add more unit tests.

* Add IT to verify we cleanup in case of failure

(cherry picked from commit 57e6269f750471f05a1a79539ca45361b9e3c2b5)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>

# Conflicts:
    #       server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataCreateIndexService.java
    #       server/src/test/java/org/elasticsearch/action/admin/indices/create/CreateIndexIT.java
    #       server/src/test/java/org/elasticsearch/cluster/metadata/IndexCreationTaskTests.java
2019-12-19 12:37:07 +00:00
Igor Motov c77ca98928 Geo: Switch generated WKT to upper case (#50285)
Switches generated WKT to upper case to
conform to the standard recommendation.

Relates #49568
2019-12-18 17:29:08 -05:00
Stuart Tettemer 9cdbcbd121
[TEST] Exclude name on ScriptContextInfo mutate (#50332) (#50337)
ScriptContextInfoSerializingTests:testEqualsAndHashcode was failing
because the mutation was generating the same name.

**Backport**

Fixes: #50331
2019-12-18 14:23:21 -07:00
Stuart Tettemer 06a24f09cf
Scripting: Cache script results if deterministic (#50106) (#50329)
Cache results from queries that use scripts if they use only
deterministic API calls.  Nondeterministic API calls are marked in the
whitelist with the `@nondeterministic` annotation.  Examples are
`Math.random()` and `new Date()`.

Refs: #49466
2019-12-18 13:00:42 -07:00
Adrien Grand 35a88a5dbb Add 7.5.2 version. 2019-12-18 19:50:00 +01:00
Ryan Ernst 8439b2779b Add version 6.8.7 constant 2019-12-18 09:38:07 -08:00
Nikita Glashenko ef54a9c23c Add tests for IntervalsSourceProvider.Wildcard and Prefix (#50306)
This PR adds unit tests for wire and xContent serialization of `IntervalsSourceProvider.Wildcard`
and `IntervalsSourceProvider.Prefix`.

Relates #50150
2019-12-18 17:41:48 +01:00
Yannick Welsch 37b8c139b3 Omit loading IndexMetaData when inspecting shards (#50214)
Loading shard state information during shard allocation sometimes runs into a situation where a
data node does not know yet how to look up the shard on disk if custom data paths are used.
The current implementation loads the index metadata from disk to determine what the custom
data path looks like. This PR removes this dependency, simplifying the lookup.

Relates #48701
2019-12-17 14:33:02 +01:00
Martijn van Groningen 2079f1cbeb
Backport: Fix ingest simulate response document order if processor executes async (#50269)
Backport #50244 to 7.x branch.

If a processor executes asynchronously and the ingest simulate api simulates with
multiple documents then the order of the documents in the response may not match
the order of the documents in the request.

Alexander Reelsen discovered this issue with the enrich processor with the following reproduction:

```
PUT cities/_doc/munich
{"zip":"80331","city":"Munich"}

PUT cities/_doc/berlin
{"zip":"10965","city":"Berlin"}

PUT /_enrich/policy/zip-policy
{
  "match": {
    "indices": "cities",
    "match_field": "zip",
    "enrich_fields": [ "city" ]
  }
}

POST /_enrich/policy/zip-policy/_execute

GET _cat/indices/.enrich-*

POST /_ingest/pipeline/_simulate
{
  "pipeline": {
  "processors" : [
    {
      "enrich" : {
        "policy_name": "zip-policy",
        "field" : "zip",
        "target_field": "city",
        "max_matches": "1"
      }
    }
  ]
  },
  "docs": [
    { "_id": "first", "_source" : { "zip" : "80331" } } ,
    { "_id": "second", "_source" : { "zip" : "50667" } }
  ]
}
```

* fixed test compile error
2019-12-17 12:27:07 +01:00
Armin Braun 4f24739fbe
Fix Index Deletion During Partial Snapshot Create (#50234) (#50266)
We can simply filter out shard generation updates for indices
that were removed from the cluster state concurrently to fix
index deletes during partial snapshots as that completely removes
any reference to those shards from the snapshot.

Follow up to #50202
Closes #50200
2019-12-17 10:58:15 +01:00
Armin Braun 2e7b1ab375
Use ClusterState as Consistency Source for Snapshot Repositories (#49060) (#50267)
Follow up to #49729

This change removes falling back to listing out the repository contents to find the latest `index-N` in write-mounted blob store repositories.
This saves 2-3 list operations on each snapshot create and delete operation. Also it makes all the snapshot status APIs cheaper (and faster) by saving one list operation there as well in many cases.
This removes the resiliency to concurrent modifications of the repository as a result and puts a repository in a `corrupted` state in case loading `RepositoryData` failed from the assumed generation.
2019-12-17 10:55:13 +01:00
Henning Andersen 8391b974c5 Recovery buffer size 16B smaller (#50100)
G1GC will use humongous allocations when an allocation exceeds half the
chosen region size, which is minimum 1MB. By reducing the recovery
buffer size by 16 bytes we ensure that the recovery buffer is never
allocated as a humongous allocation.
2019-12-16 22:00:22 +01:00
Nhat Nguyen 731bfa6614 Account trimAboveSeqNo in committed translog generation (#50205)
Today we do not consider trimAboveSeqNo when calculating the translog 
generation of an index commit. If there is no new indexing after the
primary promotion, then we won't be able to clean up the translog.
2019-12-16 11:40:16 -05:00
Zachary Tong be78d5cc74 Migrate MinAggregator integration tests to AggregatorTestCase (#50053)
Also renames MinTests to MinAggregationBuilderTests
2019-12-16 11:15:50 -05:00
Rory Hunter 2bd3a05892
Refactor environment variable processing for Docker (#50221)
Backport of #49612.

The current Docker entrypoint script picks up environment variables and
translates them into -E command line arguments. However, since any tool
executes via `docker exec` doesn't run the entrypoint, it results in
a poorer user experience.

Therefore, refactor the env var handling so that the -E options are
generated in `elasticsearch-env`. These have to be appended to any
existing command arguments, since some CLI tools have subcommands and
-E arguments must come after the subcommand.

Also extract the support for `_FILE` env vars into a separate script, so
that it can be called from more than once place (the behaviour is
idempotent).

Finally, add noop -E handling to CronEvalTool for parity, and support
`-E` in MultiCommand before subcommands.
2019-12-16 15:39:28 +00:00
Armin Braun afcdc27c02
Fix Index Deletion during Snapshot Finalization (#50202) (#50227)
With #45689 making it so that index metadata is written
after all shards have been snapshotted we can't delete indices
that are part of the upcoming snapshot finalization any longer
and it is not sufficient to check if all shards of an index have been
snapshotted before deciding that it is safe to delete it.
This change forbids deleting any index that is in the process of being
snapshot to avoid issues during snapshot finalization.

Relates #50200 (doesn't fully fix yet because we're not fixing the `partial=true`
snapshot case here
2019-12-16 13:30:05 +01:00
Henning Andersen 4ced237a7f Disk threshold decider is enabled by default (#50222)
An old comment had survived after the default was flipped.

Relates #6204
2019-12-16 12:43:34 +01:00
Armin Braun 761d6e8e4b
Remove BlobContainer Tests against Mocks (#50194) (#50220)
* Remove BlobContainer Tests against Mocks

Removing all these weird mocks as asked for by #30424.
All these tests are now part of real repository ITs and otherwise left unchanged if they had
independent tests that didn't call the `createBlobStore` method previously.
The HDFS tests also get added coverage as a side-effect because they did not have an implementation
of the abstract repository ITs.

Closes #30424
2019-12-16 11:37:09 +01:00
Ignacio Vera 3717c733ff
"CONTAINS" support for BKD-backed geo_shape and shape fields (#50141) (#50213)
Lucene 8.4 added support for "CONTAINS", therefore in this commit those
changes are integrated in Elasticsearch. This commit contains as well a
bug fix when querying with a geometry collection with "DISJOINT" relation.
2019-12-16 09:17:51 +01:00
Nhat Nguyen 6f1098cceb Fix version in testTurnOffTranslogRetentionAfterAllShardStarted
Soft-deletes requires 6.5 or later.
2019-12-15 12:58:28 -05:00
Nhat Nguyen df46848fb0 Migrate peer recovery from translog to retention lease (#49448)
Since 7.4, we switch from translog to Lucene as the source of history
for peer recoveries. However, we reduce the likelihood of
operation-based recoveries when performing a full cluster restart from
pre-7.4 because existing copies do not have PPRL.

To remedy this issue, we fallback using translog in peer recoveries if
the recovering replica does not have a peer recovery retention lease,
and the replication group hasn't fully migrated to PRRL.

Relates #45136
2019-12-15 10:24:39 -05:00
Nhat Nguyen c151a75dfe Use retention lease in peer recovery of closed indices (#48430)
Today we do not use retention leases in peer recovery for closed indices
because we can't sync retention leases on closed indices. This change
allows that ability and adjusts peer recovery to use retention leases
for all indices with soft-deletes enabled.

Relates #45136

Co-authored-by: David Turner <david.turner@elastic.co>
2019-12-15 10:24:34 -05:00
Christoph Büscher c0216f9a06 Improve DateFieldMapper `ignore_malformed` handling (#50090)
A recent change around date parsing (#46675) made it stricter, so we should now also
catch DateTimeExceptions in DateFieldMapper and ignore those when the
`ignore_malformed` option is set.

Closes #50081
2019-12-13 10:00:10 +01:00
Zachary Tong 521933aa11 SingleBucket aggs need to reduce their bucket's pipelines first (#50103)
When decoupling the pipeline reduction from regular agg reduction,
MultiBucket aggs were modified to reduce their bucket's pipeline
aggs first before reducing the sibling aggs.  This modification
was missed on SingleBucket aggs, meaning any SingleBucket would
fail to reduce any pipeline sub-aggs
2019-12-12 09:07:33 -05:00
Ignacio Vera b5ec227de8
upgrade to lucene 8.4.0-snapshot-08b8d116f8f (#50129) (#50132) 2019-12-12 13:13:37 +01:00
Adrien Grand 0bba7ccedd
Remove information about the latest PostingsFormat/DocValuesFormat. (#50118) (#50127)
This information is outdated and unused.
2019-12-12 11:46:37 +01:00
Armin Braun 6eee41e253
Remove Unused Single Delete in BlobStoreRepository (#50024) (#50123)
* Remove Unused Single Delete in BlobStoreRepository

There are no more production uses of the non-bulk delete or the delete that throws
on missing so this commit removes both these methods.
Only the bulk delete logic remains. Where the bulk delete was derived from single deletes,
the single delete code was inlined into the bulk delete method.
Where single delete was used in tests it was replaced by bulk deleting.
2019-12-12 11:17:46 +01:00
Armin Braun 0fae4065ef
Better Logging GCS Blobstore Mock (#50102) (#50124)
* Better Logging GCS Blobstore Mock

Two things:
1. We should just throw a descriptive assertion error and figure out why we're not reading a multi-part instead of
returning a `400` and failing the tests that way here since we can't reproduce these 400s locally.
2. We were missing logging the exception on a cleanup delete failure that coincides with the `400` issue in tests.

Relates #49429
2019-12-12 11:17:22 +01:00
Ryan Ernst cbff63685a Ensure meta and document field maps are never null in GetResult (#50112)
This commit ensures deseriable a GetResult from StreamInput does not
leave metaFields and documentFields null. This could cause an NPE in
situations where upsert response for a document that did not exist is
passed back to a node that forwarded the upsert request.

closes #48215
2019-12-11 22:21:55 -08:00