Commit Graph

42990 Commits

Author SHA1 Message Date
Yannick Welsch 9026f98aca Remove trace logging from CoordinatorTests 2018-12-04 18:53:32 +01:00
David Turner 7b82c6c4cc Override gateway.recover_after_master_nodes in test
By default gateway.recover_after_master_nodes is set to
discovery.zen.minimum_master_nodes but in this Zen2 test this is set to an
unreasonably large value. This change updates it so the cluster can properly
form.
2018-12-04 16:59:08 +00:00
Andrey Ershov 35e3d77e2c
[Zen2] Implement state recovery (#36013)
This commit implements proper metadata recovery for Zen2.

GatewayService is responsible for the recovery. In Zen1 GatewayService
creates an instance of Gateway, that is used to reach out to other cluster
nodes, get their state and calculate the most up-to-date state based on
versions. After that Gateway performs upgrade and archival of
ClusterSettings and closes bad indices. Then recovered state is passed to
GatewayService.GatewayRecoveryListener that mixes up current state
and restored state, removes state not recovered block, creates the
routing table and performs re-routing.

In Zen2 we should perform this kind of logic on cluster startup, except
mixing state (because there is nothing to mix) and opening routing table.

This commit refactors out all `ClusterUpdate` functions into a separate class
`ClusterStateUpdaters`, which is used by `Gateway` and `GatewayService`
in case of Zen1, and by `GatewayMetaState` and `GatewayService` in case of
Zen2.

This commit also switches all integration tests that are already using Zen2 from
InMemoryPersistedState to GatewayMetaState.
2018-12-04 14:45:45 +01:00
Yannick Welsch 94010d33c9 Reenable BWC tests 2018-12-04 09:38:18 +01:00
Yannick Welsch 80ee7943c9 Merge remote-tracking branch 'elastic/master' into zen2 2018-12-04 09:37:09 +01:00
David Turner 034c7655b7
[Zen2] Reduce cluster scope in NodeDisconnectIT (#36168)
This test suite can stop all the shared master-eligible nodes, which breaks the
cluster since any non-shared master-eligible nodes are stopped first in the
reset process between tests.

Since this test suite can leave the cluster in this somewhat broken state, it
seems best that it uses a new cluster for each test.
2018-12-04 07:48:56 +00:00
David Turner c01aecb4b1
[Zen2] Do not probe non-master nodes back (#36160)
Today if a node `A` sends a peers request to another node `B` then `B` will
react by sending a peers request back to `A`. However if `A` is not
master-eligible then this reaction is pointless and fails with an exception
saying `non-master-eligible node found`, adding noise to the logs. This change
suppresses this response to non-master-eligible nodes.
2018-12-03 17:19:18 +00:00
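
The rule described in this commit can be sketched as follows. This is a minimal illustration in Python, not the Elasticsearch (Java) implementation; `Node`, `handle_peers_request`, and the `probe` callback are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a cluster node."""
    name: str
    master_eligible: bool


def handle_peers_request(sender: Node, probe: Callable[[Node], None]) -> None:
    """React to a peers request from `sender`.

    The sender is probed back only if it is master-eligible; probing a
    non-master-eligible node would just fail and add noise to the logs.
    """
    if sender.master_eligible:
        probe(sender)
```

With this shape, a request from a non-master-eligible node never triggers the `probe` callback, while a request from a master-eligible node triggers it exactly once.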
Jake Landis f8f521bad4
Deprecate /_xpack/monitoring/* in favor of /_monitoring/* (#36130)
This commit is part of our plan to deprecate and ultimately remove the
use of _xpack in the REST APIs.

* Add deprecation for /_xpack/monitoring/_bulk in favor of /_monitoring/bulk
* Removed xpack from the rest-api-spec and tests
* Removed xpack from the Action name
* Removed MonitoringRestHandler as an unnecessary abstraction
* Minor corrections to comments

Relates #35958
2018-12-03 10:26:08 -06:00
Armin Braun 433a506d06
SNAPSHOT: Improve Resilience SnapshotShardService (#36113)
* Resolve the index in the snapshotting thread
* Added test for routing table - snapshot state mismatch
2018-12-03 16:39:29 +01:00
Nhat Nguyen 9c1c46a02f TEST: Adjust min_retained_seq_no expectation
min_retained_seq_no is non-negative, however, if the number of retained
operations is greater than 0, then the expectation may be negative.
2018-12-03 08:59:13 -05:00
Luca Cavanna b5cae0af58
Enforce max_buckets limit only in the final reduction phase (#36152)
Given that we check the max buckets limit on each shard when collecting the buckets, and that non final reduction cannot add buckets (see #35921), there is no point in counting and checking the number of buckets as part of non final reduction phases.

Such check is still needed though in the final reduction phases to make sure that the number of returned buckets is not above the allowed threshold.

Relates somehow to #32125 as we will make use of non final reduction phases in CCS alternate execution mode and that increases the chance that this check trips for nothing when reducing aggs in each remote cluster.
2018-12-03 13:55:18 +01:00
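
A rough sketch of the idea, assuming partial results are merged by bucket key: the bucket count is checked only when the reduction is final. This is illustrative Python, not the Elasticsearch code; `reduce_buckets`, `TooManyBucketsError`, and the `is_final` flag are hypothetical names.

```python
from collections import Counter
from typing import Dict, Iterable


class TooManyBucketsError(Exception):
    """Raised when a final reduction yields more buckets than allowed."""


def reduce_buckets(partials: Iterable[Dict[str, int]],
                   max_buckets: int,
                   is_final: bool) -> Dict[str, int]:
    """Merge per-shard bucket counts; enforce the limit only when final."""
    merged: Counter = Counter()
    for partial in partials:
        merged.update(partial)  # sums the doc counts of buckets with equal keys
    # Non-final reductions skip the check; the final one guards the response.
    if is_final and len(merged) > max_buckets:
        raise TooManyBucketsError(
            f"{len(merged)} buckets exceeds the limit of {max_buckets}")
    return dict(merged)
```

In this sketch an intermediate reduction can pass through more than `max_buckets` buckets without failing, and only the last reduction decides whether the response is rejected.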
Boaz Leskes 36ddca7d0c Disable merges in testReuseInFileBasedPeerRecovery
The test assumes lucene files don't change.

Closes #35772
2018-12-03 13:45:19 +01:00
Jim Ferenczi 74aca756b8
Remove the distinction between query and filter context in QueryBuilders (#35354)
When building a query Lucene distinguishes two cases: queries that need to produce a score and queries that only need to match. We cloned this mechanism in the QueryBuilders in order to be able to produce different queries based on whether they need to produce a score or not. However the only case in ES that requires this distinction is the BoolQueryBuilder, which sets a different minimum_should_match when a `bool` query is built in a filter context.
This behavior doesn't seem right because it makes the matching of `should` clauses different when the score is not required.

Closes #35293
2018-12-03 11:49:11 +01:00
Armin Braun 328d022ddd
MINOR: Some Cleanups around Store (#36139)
* Method `canOpenIndex` is only used in tests -> moved to the test classpath
* Simplify `org.elasticsearch.index.store.Store#renameTempFilesSafe`
* Delete some dead methods
2018-12-03 11:21:42 +01:00
Armin Braun f763037b03
MINOR: BlobstoreRepository Cleanups (#36140)
* Removed redundant private getter
* Removed unused `version` field
2018-12-03 11:11:10 +01:00
Armin Braun 9c49aacbcf
MINOR: Remove Dead Code in QueryCache (#36147) 2018-12-03 10:02:35 +01:00
Alpar Torok fa4d5f844d
Fix test fixtures on aufs (#36105)
Closes #36073

The problem showed up on debian 8 which uses aufs docker storage
driver by default as opposed to overlay2 used on other distros.
aufs does not support acls and thus the failure.
The --use-ntvfs option instructs samba not to rely on acls.
From what I can tell this is an implementation detail that should not
affect the tests ( which continue to pass )
2018-12-03 11:01:05 +02:00
Dimitrios Liappis 6a773d7d51
Fix error message when package install fails due to missing Java (#36077)
Currently if `java` is not in $PATH the preinst script fails
prematurely and prevents an appropriate message from getting displayed
to the user.

Make package installation more user friendly when java is not in
$PATH and add a test for it.

Also use a she-bang in the preinst script, as, at least in Debian,
maintainer scripts must start with the #! convention [1].

Relates #31845

[1] https://www.debian.org/doc/debian-policy/ch-maintainerscripts.html
2018-12-03 10:43:36 +02:00
Martijn van Groningen 43773a32a4
Replace Streamable w/ Writeable in BaseTasksRequest and subclasses (#35854)
This commit replaces usages of Streamable with Writeable for the
BaseTasksRequest / TransportTasksAction classes and subclasses of
these classes.

Relates to #34389
2018-12-03 08:04:29 +01:00
Armin Braun 9c0a429709
TESTS: Fix IndexStatsIT#testFilterCacheStats (#36143)
* Test randomly failed because of background merges
   * Fixed by force merging down to a single segment
* Closes #32506
2018-12-03 06:16:12 +01:00
Tim Vernum d20bb3789d
Add DEBUG/TRACE logs for LDAP bind (#36028)
Introduces a debug log message when a bind fails and a trace message
when a bind succeeds.

It may seem strange to only debug a bind failure, but failures of this
nature are relatively common in some realm configurations (e.g. LDAP
realm with multiple user templates, or additional realms configured
after an LDAP realm).
2018-12-03 10:05:57 +11:00
David Turner 8011438ea8 Use correct source of randomness
This fixes a failure of InternalTestClusterTests#testBeforeTest which checks
that the cluster is set up the same when starting from the same seed. Trappily,
using ESTestCase#randomIntBetween() is no good, we have to use
InternalTestCluster#random via RandomNumbers#randomIntBetween() instead.
2018-12-02 09:39:43 +00:00
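
The underlying point, that a cluster built from a given seed must draw every random choice from its own seeded generator, can be illustrated with a small Python sketch. `SeededCluster` and `choose_node_count` are hypothetical names and do not mirror the `InternalTestCluster` API.

```python
import random


class SeededCluster:
    """Hypothetical test cluster whose setup is derived from a single seed."""

    def __init__(self, seed: int) -> None:
        # The cluster owns its generator; nothing else advances it.
        self.random = random.Random(seed)

    def choose_node_count(self, low: int, high: int) -> int:
        # Draw from the cluster's own generator rather than a shared or
        # global source, so the same seed always gives the same setup.
        return self.random.randint(low, high)
```

Two `SeededCluster` instances created with the same seed return the same sequence of values from `choose_node_count`, which is the property a reproducibility check of this kind relies on.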
David Turner 8bb1952975
Fix NodeJoinTests again (#36133)
In #36033 we removed a catch block because we thought we were preventing
exceptions by avoiding concurrent elections, missing the obvious fact that some
joins are supposed to be failing.

As a quick fix the catch was reinstated in 3a5dab6d8e
but this change adds finesse by only catching exceptions from the joins that we
expect to fail. It also inlines an always-false parameter to `initialState()`.
2018-12-01 09:54:01 +00:00
David Turner 9cc416bc46 Weaken assertion in PeerFinder
It can be inactive with no leader if it's handling an incoming PeersRequest
before being activated for the first time.
2018-12-01 07:20:19 +00:00
David Turner 3a5dab6d8e Reinstate catch removed in error in #36033 2018-12-01 07:10:19 +00:00
David Turner 8191348d6b
[Zen2] Only bootstrap a single node (#36119)
Today, we allow all nodes in an integration test to bootstrap. However this
seems to lead to test failures due to post-election instability. The change
avoids this instability by only bootstrapping a single node in the cluster.
2018-12-01 06:43:11 +00:00
Lisa Cawley 46962308aa
[DOCS] Replace deprecated ldap setting (#36022) 2018-11-30 16:58:19 -08:00
Julie Tibshirani 0e1ddfd825
Deprecate types in document delete requests. (#36087)
* Make sure to use _doc as a type name in the CRUD HLRC tests.
* Deprecate types in document delete requests.
2018-11-30 15:11:29 -08:00
Jay Modi 7b999bdc88
Fix logic in dockerComposeSupported (#36125)
The logic in the dockerComposeSupported method currently returns false
even when docker and docker compose are available on the build machine.
This change updates the check to see if docker compose is available in
one of the two paths and allows the `tests.fixture.enabled` property to
disable the tests even if docker compose is available.
2018-11-30 14:38:10 -07:00
Julie Tibshirani 98b290637d
Deprecate the _termvector endpoint. (#36098) 2018-11-30 13:11:58 -08:00
Nik Everett df56f0734e
Tasks: Retry if task can't be written (#35054)
Adds about a minute worth of backoffs and retries to saving task
results so it is *much* more likely that a busy cluster won't lose task
results. This isn't an ideal solution to losing task results, but it is
an incremental improvement. If all of the retries fail we still log
the task result, but that is far from ideal.

Closes #33764
2018-11-30 16:06:58 -05:00
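
The retry behaviour described here might look roughly like the following Python sketch. The delay schedule, the function names (`backoff_delays`, `store_with_retries`), and the callbacks are assumptions made for illustration; the commit message does not specify the actual backoff parameters.

```python
import time
from typing import Callable, Iterable, Iterator


def backoff_delays(initial: float, factor: float, retries: int) -> Iterator[float]:
    """Yield `retries` delays that grow geometrically (hypothetical schedule)."""
    delay = initial
    for _ in range(retries):
        yield delay
        delay *= factor


def store_with_retries(store: Callable[[], None],
                       delays: Iterable[float],
                       log_result: Callable[[], None],
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try to store a task result, backing off between failed attempts.

    Returns True once `store` succeeds. When the delays run out, the result
    is logged via `log_result` as a last resort and False is returned.
    """
    remaining = iter(delays)
    while True:
        try:
            store()
            return True
        except Exception:
            delay = next(remaining, None)
            if delay is None:
                log_result()  # all retries failed: fall back to logging
                return False
            sleep(delay)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting; a caller could use, for example, `backoff_delays(1.0, 2.0, 5)` to obtain delays of 1, 2, 4, 8, and 16 seconds.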
Gordon Brown 3c4953f4d1
State default shard limit is not a recommendation (#36093)
The new limit on the number of open shards in a cluster may be
interpreted by users as a sizing recommendation, but it is not. This
clarifies in the documentation that this is a safety limit, not a
recommendation.
2018-11-30 13:05:14 -07:00
Luca Cavanna 0ebc17743a
Histogram aggs: add empty buckets only in the final reduce step (#35921)
Empty buckets don't need to be added when performing an incremental reduction step; they can be added later in the final reduction step. This will allow us to later remove the max buckets limit when performing non final reduction.
2018-11-30 20:33:09 +01:00
Gordon Brown d7652963b1
Add note about ILM and Snapshots (#36023)
This commit documents how Index Lifecycle Management
interacts with snapshot/restore, and documents a workaround
for situations in which ILM should not immediately resume
managing an index after it is restored.
2018-11-30 12:06:48 -07:00
Tim Brooks ea7ea51050
Make `TcpTransport#openConnection` fully async (#36095)
This is a follow-up to #35144. That commit made the underlying
connection opening process in TcpTransport asynchronous. However the
method still blocked on the process being complete before returning.
This commit moves the blocking to the ConnectionManager level. This is
another step towards the top-level TransportService api being async.
2018-11-30 11:30:42 -07:00
Chris Koehnke 465a65aa57
Docs: Fix release-state check for oss repositories (#36120)
To get the newly added oss apt/yum sections rendered for
`released` and `prerelease` versions the condition needs to be modified.
2018-11-30 13:17:39 -05:00
Armin Braun 986bf52d1f
[Zen2] Allow Setting a List of Bootstrap Nodes to Wait for (#35847) 2018-11-30 18:53:08 +01:00
Jim Ferenczi e179fd1274
Add support for rest_total_hits_as_int in watcher (#36035)
This change adds the support for rest_total_hits_as_int
in the watcher search inputs. Setting this parameter in the request
will transform the search response to contain the total hits as
a number (instead of an object).
Note that this parameter is currently a noop since #35849 is not
merged.

Closes #36008
2018-11-30 18:02:37 +01:00
Lisa Cawley c24be278e4
[DOCS] Refreshes population job examples (#36101) 2018-11-30 08:55:29 -08:00
Jim Ferenczi 54facbe325 [TEST] fix typo in get-watch documentation (bis) 2018-11-30 17:46:51 +01:00
Tim Brooks 26dcbcc8cc
Remove `MockTcpTransport` for ESIntegTestCase (#36089)
This commit removes the `MockTcpTransport` as a transport option for
`ESIntegTestCase`. It is the first step in replacing the usages of
`MockTcpTransport` with `MockNioTransport`.
2018-11-30 09:04:51 -07:00
Tim Brooks da100c5479
Remove `Lifecycle` from `ConnectionManager` (#36092)
Prior to #35441 `ConnectionManager` had a `Lifecycle` object to support
the ping runnable. After that commit, the connection manager only needs
the existing `AtomicBoolean` to indicate if it is running.
2018-11-30 09:04:32 -07:00
Tim Brooks 370472b6d1
Upgrade to Netty 4.1.32.Final (#36102)
This commit upgrades netty. This will close #35360. Netty started
throwing an IllegalArgumentException if a CompositeByteBuf is
created with < 2 components. Netty4Utils was updated to reflect this
change.
2018-11-30 09:02:10 -07:00
Jim Ferenczi 11fa5c626b [TEST] Fix random test failure in GetWatchResponseTests 2018-11-30 16:12:46 +01:00
Christophe Bismuth acdf9666d5 Add `minimum_should_match` section to the query_string docs
Closes #34142
2018-11-30 16:10:13 +01:00
Jim Ferenczi 08b9e31373 [TEST] fix link in get-watch documentation 2018-11-30 14:40:09 +01:00
patrykk21 bb2cf7e6be [Docs] Clarify search_after behavior
Closes #34232
2018-11-30 14:30:23 +01:00
Luca Cavanna 43ea498f2f [TEST] Reduce number of buckets created in InternalDateHistogramTests
Now that we test with min_doc_count set to 0 as well, we may end up generating a lot more buckets. This commit adjusts the min bound and max bound, as well as the offset for each randomly generated agg instance so that we don't end up hitting the 10.000 max buckets limit.

Relates to #36064
2018-11-30 14:16:32 +01:00
Alpar Torok 6d4dfef64e
Conditional conffiles for packages (#36046)
Relates to #35810
2018-11-30 15:16:23 +02:00
Jim Ferenczi 5e6460acb3 [TEST] fix typo in get-watch documentation 2018-11-30 12:53:04 +01:00