Commit Graph

43947 Commits

Author SHA1 Message Date
Julie Tibshirani f0fc6e8003
Make sure PutMappingRequest accepts content types other than JSON. (#37720) 2019-01-23 08:51:05 -08:00
Lee Hinman 647e225698
Retry ILM steps that fail due to SnapshotInProgressException (#37624)
Some steps, such as steps that delete, close, or freeze an index, may fail due to a currently running snapshot of the index. In those cases, rather than move to the ERROR step, we should retry the step when the snapshot has completed.

This change adds an abstract step (`AsyncRetryDuringSnapshotActionStep`) that certain steps (like the ones I mentioned above) can extend that will automatically handle a situation where a snapshot is taking place. When a `SnapshotInProgressException` is received by the listener wrapper, a `ClusterStateObserver` listener is registered to wait until the snapshot has completed, re-running the ILM action when no snapshot is occurring.

This also adds integration tests for these scenarios (thanks to @talevy in #37552).

Resolves #37541
2019-01-23 09:46:31 -07:00
David Kyle d193ca8aae
Use disassociate in preference to deassociate (#37704) 2019-01-23 16:06:25 +00:00
Armin Braun 2439f68745
Delete Redundant RoutingServiceTests (#37750)
* This test compleletly overrode the `reroute` method and hence did nothing put test the override itself
   * Removed the test since it tests nothing and simplified `reroute` accordingly
2019-01-23 16:39:02 +01:00
Nhat Nguyen 6a9838359c
Always return metadata version if metadata is requested (#37674)
If the indices of a ClusterStateRequest are specified, we fail to
include the cluster state metadata version in the response.

Relates  #37633
2019-01-23 10:24:51 -05:00
David Roberts 6a5d9d942a [TEST] Mute MlMappingsUpgradeIT testMappingsUpgrade
Due to https://github.com/elastic/elasticsearch/issues/37763
2019-01-23 13:50:31 +00:00
Luca Cavanna 12f5b02fd0
Streamline skip_unavailable handling (#37672)
This commit moves the collectSearchShards method out of RemoteClusterService into TransportSearchAction that currently calls it. RemoteClusterService used to be used only for cross-cluster search but is now also used in cross-cluster replication where different API are called through the RemoteClusterAwareClient. There is no reason for the collectSearchShards and fetchShards methods to be respectively in RemoteClusterService and RemoteClusterConnection. The search shards API can be called through the RemoteClusterAwareClient too, the only missing bit is a way to handle failures based on the skip_unavailable setting for each cluster (currently only supported in RemoteClusterConnection#fetchShards) which is achieved by adding a isSkipUnavailable(String clusterAlias) method to RemoteClusterService.
This change is useful for #32125 as we will very soon need to also call the search API against remote clusters, which will be done through RemoteClusterAwareClient. In that case we will also need to support skip_unavailable when calling the search API so we need some way to handle the skip_unavailable setting like we currently do for the search_shards call.

Relates to #32125
2019-01-23 13:53:37 +01:00
Yannick Welsch d5139e0590
Only bootstrap and elect node in current voting configuration (#37712)
Adapts bootstrapping and leader election to only trigger on nodes that are actually part of the voting
configuration.
2019-01-23 13:10:11 +01:00
Simon Willnauer 4ec3a6d922
Ensure either success or failure path for SearchOperationListener is called (#37467)
Today we have several implementations of executing SearchOperationListener
in SearchService. While all of them seem to be safe at least on, the one that
executes scroll searches can cause illegal execution of SearchOperationListener
that can then in-turn trigger assertions in ShardSearchStats. This change
adds a SearchOperationListenerExecutor that uses try-with blocks to ensure
listeners are called in a safe way.

Relates to #37185
2019-01-23 12:38:44 +01:00
Daniel Mitterdorfer 100537fbc3
Target only specific index in update settings test
With this commit we limit the update settings request to the index that
is used in `IndicesClientIT#testIndexPutSettings()`. This avoids
spurious exceptions involving other indices that might also be present
in the test cluster.

Closes #36931
Relates #37338
2019-01-23 12:19:54 +01:00
Daniel Mitterdorfer 1de286bfb3
Add a note how to benchmark Elasticsearch
With this commit we add a note that explains when to use benchmarks and
point the reader to the macrobenchmarking tool Rally for further
information.

Relates #37694
2019-01-23 12:15:00 +01:00
Alpar Torok f54a3b5f67
Don't use Groovy's `withDefault` (#37726)
Closes #37061
2019-01-23 12:09:40 +02:00
Tanguy Leroux 6130d15172
Adapt SyncedFlushService (#37691) 2019-01-23 11:08:54 +01:00
Alexander Reelsen 701d89caa2 Mute FilterAggregatorTests#testRandom
Relates #37743
2019-01-23 11:00:37 +01:00
Alexander Reelsen daa2ec8a60
Switch mapping/aggregations over to java time (#36363)
This commit moves the aggregation and mapping code from joda time to
java time. This includes field mappers, root object mappers, aggregations with date
histograms, query builders and a lot of changes within tests.

The cut-over to java time is a requirement so that we can support nanoseconds
properly in a future field mapper.

Relates #27330
2019-01-23 10:40:05 +01:00
David Roberts 7b3dd3022d
[ML] Update ML results mappings on process start (#37706)
This change moves the update to the results index mappings
from the open job action to the code that starts the
autodetect process.

When a rolling upgrade is performed we need to update the
mappings for already-open jobs that are reassigned from an
old version node to a new version node, but the open job
action is not called in this case.

Closes #37607
2019-01-23 09:37:37 +00:00
Christoph Büscher b3f9becf5f
Modify removal_of_types.asciidoc (#37648)
After switching the default behaviour of "include_type_name" to "false" in 7.0,
some parts of the types removal documentation can be adapted as well.
2019-01-23 09:48:00 +01:00
Christoph Büscher 6926a73d61
Fix edge case in PutMappingRequestTests (#37665)
The recently introduced client side PutMappingRequestTests has an edge case that
leads to failing tests. When the "source" or the original request is `null` it
gets rendered to xContent as an empty object, which by the test class itself is
parsed back as an empty map. For this reason the original and the parsed request
differ (one has an empty source, one an empty map). This fixes this edge case by
assuming an empty map means a null source in the request. In practice the
distinction doesn't matter because all the client side request does is write
itself to xContent, which gives the same result regardless of whether `source`
is null or empty.

Closes #37654
2019-01-23 09:46:49 +01:00
Christoph Büscher 95a6951f78
Use new bulk API endpoint in the docs (#37698)
This change switches to using the typeless bulk API endpoint in the
documentation snippets where possible
2019-01-23 09:46:28 +01:00
Boaz Leskes 52ba407931
Expose sequence number and primary terms in search responses (#37639)
Users may require the sequence number and primary terms to perform optimistic concurrency control operations. Currently, you can get the sequence number via the `docvalues_fields` API but the primary term is not accessible because it is maintained by the `SeqNoFieldMapper` and the infrastructure can't find it. 

This commit adds a dedicated sub fetch phase to return both numbers that is connected to a new `seq_no_primary_term` parameter.
2019-01-23 09:01:58 +01:00
Andrey Ershov 534ba1dd34
Remove LicenseServiceClusterNotRecoveredTests (#37528)
While tests migration from Zen1 to Zen2, we've encountered this test.
This test is organized as follows:

Starts the first cluster node.
Starts the second cluster node.
Checks that license is active.
Interesting fact that adding assertLicenseActive(true) between 1
and 2 also makes the test pass.
assertLicenseActive retrieves XPackLicenseState from the nodes
and checks that active flag is set. It's set to true even before
the cluster is initialized.

So this test does not make sense.
2019-01-23 07:23:06 +01:00
Andrey Ershov 7c6566e14c
Migrate SpecificMasterNodesIT to Zen2 (#37532)
1. testSimpleOnlyMasterNodeElection - requires cluster bootstrap when
the first master node is started.
2. testElectOnlyBetweenMasterNodes - requires cluster bootstrap when
the first master node is started and requires adding voting exclusion
before shutting down the first master node.
3. testAliasFilterValidation - requires cluster bootstrap when the
first master node is started.
2019-01-23 07:22:41 +01:00
Andrey Ershov e2e00cd245
Fix MetaStateFormat tests
It's not safe to continue writing state using MetaDataStateFormat
after dirty WriteStateException occurred if it's not recovered by
successful subsequent state write.

We've encountered test failure of testFailRandomlyAndReadAnyState.
The test breaks in the following way. There are 3 state paths. And what
happens next

Successful write at the beginning of the test yields 0 0 0 state
files in the directories.
1st write in the loop is unsuccessful, but not dirty - 0 0 0.
2nd write in the loop is not successful and dirty (failure during
fsync), however before removing new files we have 1 1 1. But now during
deletion, the first deletion fails and we get - 1 0 0.
3rd write in the loop is unsuccessful, but not dirty - so we want to
keep old generation, which happens to be the 1st generation, so now we
have 1 x x in state folders. Now we assert that we either load 0 or 1
state from the state folders and select only 2rd and 3th folder to
emulate disk failures - this results in NPE because there is nothing in
these folders.
Fortunately, this won’t be a problem in real life, because if there is
a dirty exception, we shut down the node and make sure we perform a
successful write on the node startup.
2019-01-23 07:21:26 +01:00
Mayya Sharipova 942fc13af5
Use plain text instead of latexmath
As latexmath is not rendered, using plain text instead

Closes #37718
2019-01-22 16:49:03 -05:00
olcbean c28479819e Fix a typo in a warning message in TestFixturesPlugin (#37631) 2019-01-22 22:36:42 +02:00
Julie Tibshirani 992bfd2064
Remove additional references to 'type' from the _bulk documentation. (#37722) 2019-01-22 12:13:01 -08:00
Brandon Kobel 940f6ba4c1
Remove kibana_user and kibana_dashboard_only_user index privileges (#37441)
* Remove kibana_user and kibana_dashboard_only_user .kibana* index privileges

* Removing unused imports
2019-01-22 12:09:08 -08:00
Tim Brooks eb43ab6d60
Implement leader rate limiting for file restore (#37677)
This is related to #35975. This commit implements rate limiting on the
leader side using the CombinedRateLimiter.
2019-01-22 10:57:37 -07:00
Zachary Tong 2ba9e361ab
Add helper classes to determine if aggs have a value (#36020)
This adds a set of helper classes to determine if an agg "has a value". 
This is needed because InternalAggs represent "empty" in different 
manners according to convention. Some use `NaN`, `+/- Inf`, `0.0`, etc.

A user can pass the Internal agg type to one of these helper methods
and it will report if the agg contains a value or not, which allows the
user to differentiate "empty" from a real `NaN`.

These helpers are best-effort in some cases.  For example, several
pipeline aggs share a single return class but use different conventions
to mark "empty", so the helper uses the loosest definition that applies
to all the aggs that use the class.

Sums in particular are unreliable.  The InternalSum simply returns 0.0
if the agg is empty (which is correct, no values == sum of zero).  But this
also means the helper cannot differentiate from "empty" and `+1 + -1`.
2019-01-22 12:38:55 -05:00
Jason Tedor 715719ee3b
Remove warn-date from warning headers (#37622)
This commit removes the warn-date from warning headers. Previously we
were stamping every warning header with when the request
occurred. However, this has a severe performance penalty when
deprecation logging is called frequently, as obtaining the current time
and formatting it properly is expensive. A previous change moved to
using the startup time as the time to stamp on every warning header, but
this was only to prove that the timestamping was expensive. Since the
warn-date is optional, we elect to remove it from the warning
header. Prior to this commit, we worked in Kibana to make the warn-date
treated as optional there so that we can follow-up in Elasticsearch and
remove the warn-date. This commit does that.
2019-01-22 12:29:24 -05:00
Christoph Büscher 256e01ca92
Fix potential NPE in UsersTool (#37660)
It looks like the output of FileUserPasswdStore.parseFile shouldn't be wrapped 
into another map since its output can be null. Doing this wrapping after the null
check (which potentially raises an exception) instead.
2019-01-22 17:34:13 +01:00
Ioannis Kakavas 5c1a1f7ac1
Use PEM files for PkiOptionalClientAuthTests (#37683)
Use PEM files for the key/cert for TLS on the http layer of the
node instead of a JKS keystore so that the tests can also run
in a FIPS 140 JVM .

Resolves: #37682
2019-01-22 17:26:36 +02:00
Alpar Torok 3f2723366e Mute failing test
Tracking #37708
2019-01-22 17:16:40 +02:00
Christoph Büscher 34f2d2ec91
Remove remaining occurances of "include_type_name=true" in docs (#37646) 2019-01-22 15:13:52 +01:00
Andrei Stefan 7507af29fa
SQL: Return Intervals in SQL format for CLI (#37602)
* Add separate CLI Mode
* Use the correct Mode for cursor close requests
* Renamed CliFormatter and have different formatting behavior for CLI and "text" format.
2019-01-22 14:55:28 +02:00
Yannick Welsch 23ba900840
Publish to masters first (#37673)
Prefer publishing to master-eligible nodes first, so that cluster state updates are committed more
quickly, and master-eligible nodes also turned more quickly into followers after a leader election.
2019-01-22 13:53:10 +01:00
David Kyle 3fad1eeaed
Un-assign persistent tasks as nodes exit the cluster (#37656)
PersistentTasksClusterService decides if a task should be reassigned by 
checking there is a node in the cluster with the same Id. If a node is 
restarted PersistentTasksClusterService may not observe the change and 
decide the task still has a valid assignment because the node's 
ephemeral Id is not used in that decision. This change un-assigns tasks
as the nodes in the cluster change.
2019-01-22 12:44:45 +00:00
Henning Andersen 228611843c
Fail start of non-data node if node has data (#37347)
* Fail start of non-data node if node has data

Check that nodes started with node.data=false cannot start if they have
shard data to avoid (old) indexes being resurrected into the cluster in red status.

Issue #27073
2019-01-22 13:27:12 +01:00
Yannick Welsch 2a7b7ccf1c
Use cancel instead of timeout for aborting publications (#37670)
When publications were cancelled because a node turned to follower or candidate, it would still
show as time out, which can be confusing in the logs. This change adapts the improper call of
onTimeout by generalizing it to a cancel method.
2019-01-22 12:51:03 +01:00
Martijn van Groningen ef2f5e4a13
Follow stats api should return a 404 when requesting stats for a non existing index (#37220)
Currently it returns an empty response with a 200 response code.

Closes #37021
2019-01-22 12:48:05 +01:00
Christoph Büscher 0a93a0358b
Remove deprecated FieldNamesFieldMapper.Builder#index (#37305)
The method calls "enabled" in addition to what the super.index() does, but this
seems to be done explicitely now in the TypeParsers `parse` method. The removed
method has been deprecated since at least 6.0. Also making some of the Builders
methods and ctos private since they are only used internally in this class.
2019-01-22 12:12:21 +01:00
Daniel Mitterdorfer 757932a975
Document that date math is locale independent
With this commit we add a note to the API conventions documentation that
all date math expressions are resolved independently of any locale. This
behavior might be puzzling to users that try to specify a different
calendar than a Gregorian calendar.

Closes #37330 
Relates #37663
2019-01-22 12:04:29 +01:00
David Turner 5db7ed22a0
Bootstrap a Zen2 cluster once quorum is discovered (#37463)
Today when bootstrapping a Zen2 cluster we wait for every node in the
`initial_master_nodes` setting to be discovered, so that we can map the
node names or addresses in the `initial_master_nodes` list to their IDs for
inclusion in the initial voting configuration. This means that if any of
the expected master-eligible nodes fails to start then bootstrapping will
not occur and the cluster will not form. This is not ideal, and we would
prefer the cluster to bootstrap even if some of the master-eligible nodes
do not start.

Safe bootstrapping requires that all pairs of quorums of all initial
configurations overlap, and this is particularly troublesome to ensure
given that nodes may be concurrently and independently attempting to
bootstrap the cluster. The solution is to bootstrap using an initial
configuration whose size matches the size of the expected set of
master-eligible nodes, but with the unknown IDs replaced by "placeholder"
IDs that can never belong to any node.  Any quorum of received votes in any
of these placeholder-laden initial configurations is also a quorum of the
"true" initial set of master-eligible nodes, giving the guarantee that it
intersects all other quorums as required.

Note that this change means that the initial configuration is not
necessarily robust to any node failures. Normally the cluster will form and
then auto-reconfigure to a more robust configuration in which the
placeholder IDs are replaced by the IDs of genuine nodes as they join the
cluster; however if a node fails between bootstrapping and this
auto-reconfiguration then the cluster may become unavailable. This we feel
to be less likely than a node failing to start at all.

This commit also enormously simplifies the cluster bootstrapping process.
Today, the cluster bootstrapping process involves two (local) transport actions
in order to support a flexible bootstrapping API and to make it easily
accessible to plugins. However this flexibility is not required for the current
design so it is adding a good deal of unnecessary complexity. Here we remove
this complexity in favour of a much simpler ClusterBootstrapService
implementation that does all the work itself.
2019-01-22 11:03:51 +00:00
Adrien Grand e9fcb25a28
Upgrade to lucene-8.0.0-snapshot-83f9835. (#37668)
This snapshot uses a new file format for doc-values which is expected to make
advance/advanceExact perform faster on sparse fields:
https://issues.apache.org/jira/browse/LUCENE-8585
2019-01-22 11:44:29 +01:00
Alpar Torok 74d1cfbf7e Mute failing test
Tracking ##37687
2019-01-22 10:50:27 +02:00
Alexander Reelsen 4fb68ea195
Fix java time formatters that round up (#37604)
In order to be able to parse epoch seconds and epoch milli seconds own
java time fields had been introduced. These fields are however not
compatible with the way that java time allows one to configure default
fields (when a part of a timestamp cannot be read then a default value
is added), which is used for the formatters that are rounding up to the
next value.

This commit allows java date formatters to configure its round up parsing 
by setting default values via a consumer. By default all formats are setting 
JavaDateFormatter.ROUND_UP_BASE_FIELDS for rounding up. The epoch
however parsers both need to set different fields. The merged date
formatters do not set any fields, they just append all the round up formatters.

Also the formatter now properly copies the locale and the timezone, 
fractional parsing has been set to nano seconds with proper width.
2019-01-22 09:42:17 +01:00
Yogesh Gaikwad 3e1e1b0b37
Removes awaits fix as the fix is in. (#37676)
The PR for the fix has been merged.
https://github.com/elastic/elasticsearch/pull/37661
but the awaits fix annotation was not removed.
2019-01-22 19:35:17 +11:00
Alpar Torok 17d704347e Mute failing test
Tracking #37685
2019-01-22 10:31:23 +02:00
Tanguy Leroux 0290547ad7
Ensure that max seq # is equal to the global checkpoint when creating ReadOnlyEngines (#37426)
Since version 6.7.0 the Close Index API guarantees that all translog 
operations have been correctly flushed before the index is closed. If 
the index is reopened as a Frozen index (which uses a ReadOnlyEngine) 
we can verify that the maximum sequence number from the last Lucene 
commit is indeed equal to the last known global checkpoint and refuses 
to open the read only engine if it's not the case. In this PR the check is 
only done for indices created on or after 6.7.0 as they are guaranteed 
to be closed using the new Close Index API.

Related #33888
2019-01-22 09:22:33 +01:00
Alpar Torok a713183cab Mute failing discovery disruption tests
Tracking #37539
2019-01-22 10:16:04 +02:00