42760 Commits

Author SHA1 Message Date
David Turner
034c7655b7
[Zen2] Reduce cluster scope in NodeDisconnectIT (#36168)
This test suite can stop all the shared master-eligible nodes, which breaks the
cluster since any non-shared master-eligible nodes are stopped first in the
reset process between tests.

Since this test suite can leave the cluster in this somewhat broken state, it
seems best that it uses a new cluster for each test.
2018-12-04 07:48:56 +00:00
David Turner
c01aecb4b1
[Zen2] Do not probe non-master nodes back (#36160)
Today if a node `A` sends a peers request to another node `B` then `B` will
react by sending a peers request back to `A`. However if `A` is not
master-eligible then this reaction is pointless and fails with an exception
saying `non-master-eligible node found`, adding noise to the logs. This change
suppresses this response to non-master-eligible nodes.
2018-12-03 17:19:18 +00:00
David Turner
8011438ea8 Use correct source of randomness
This fixes a failure of InternalTestClusterTests#testBeforeTest which checks
that the cluster is set up the same when starting from the same seed. Trappily,
using ESTestCase#randomIntBetween() is no good, we have to use
InternalTestCluster#random via RandomNumbers#randomIntBetween() instead.
2018-12-02 09:39:43 +00:00
David Turner
8bb1952975
Fix NodeJoinTests again (#36133)
In #36033 we removed a catch block because we thought we were preventing
exceptions by avoiding concurrent elections, missing the obvious fact that some
joins are supposed to be failing.

As a quick fix the catch was reinstated in 3a5dab6d8ef23ba3cb60162e9e2b7e002145
but this change adds finesse by only catching exceptions from the joins that we
expect to fail. It also inlines an always-false parameter to `initialState()`.
2018-12-01 09:54:01 +00:00
David Turner
9cc416bc46 Weaken assertion in PeerFinder
It can be inactive with no leader if it's handling an incoming PeersRequest
before being activated for the first time.
2018-12-01 07:20:19 +00:00
David Turner
3a5dab6d8e Reinstate catch removed in error in #36033 2018-12-01 07:10:19 +00:00
David Turner
8191348d6b
[Zen2] Only bootstrap a single node (#36119)
Today, we allow all nodes in an integration test to bootstrap. However this
seems to lead to test failures due to post-election instability. The change
avoids this instability by only bootstrapping a single node in the cluster.
2018-12-01 06:43:11 +00:00
Armin Braun
986bf52d1f
[Zen2] Allow Setting a List of Bootstrap Nodes to Wait for (#35847) 2018-11-30 18:53:08 +01:00
Armin Braun
48dc6c3442
[Zen2] Implement Tombstone REST APIs (#36007)
* [Zen2] Implement Tombstone REST APIs

* Adds REST API for withdrawing votes and clearing vote withdrawls
* Tests added to Netty4 module since we need a real Network impl. for Http endpoints
2018-11-29 14:34:10 +01:00
David Turner
7f257187af
[Zen2] Update default for USE_ZEN2 to true (#35998)
Today the default for USE_ZEN2 is false and it is overridden in many places. By
defaulting it to true we can be sure that the only places in which Zen2 does
not work are those in which it is explicitly set to false.
2018-11-29 12:18:35 +00:00
David Turner
277ccba3bd
[Zen2] fix NodeJoinTests#testConcurrentJoining() (#36033)
Today we sometimes create a setup in which the node is a quorum on its own,
which allows it to win a pre-voting round and schedule an election essentially
at will, causing it to discard all the joins it just received and fail the
test. This change excludes this case, preventing stray elections from ruining
things.
2018-11-29 12:11:08 +00:00
David Turner
87408b04d4
[Zen2] Only elect master-eligible nodes (#35996)
Today any node can win an election. However, the whole point of
master-eligibility is that master-ineligible nodes should not be elected as the
leader; furthermore master-ineligible nodes do not have any outgoing STATE
channels so cannot publish cluster states, so their leadership is ineffective
and disruptive.

This change ensures that the elected leader is master-eligible by preventing
master-ineligible nodes from scheduling an election.
2018-11-29 12:10:43 +00:00
Andrey Ershov
0b45fb98b9
[Zen2] Generate coordinationMetaData with different configs (#35991)
This PR fixes test failure, which is caused by equal randomly generated lastAcceptedConfiguration and lastCommittedConfguration.
2018-11-28 14:49:07 +01:00
Yannick Welsch
5f0c036183 Disable testDeleteCreateInOneBulk on Zen2
This test needs adaptation to run with Zen2
2018-11-28 13:34:48 +01:00
Andrey Ershov
0e283f9670
[Zen2] PersistedState interface implementation (#35819)
Today GatewayMetaState is capable of atomically storing MetaData to
disk. We've also moved fields that are needed to be persisted in Zen2
from ClusterState to ClusterState.MetaData.CoordinationMetaData.

This commit implements PersistedState interface.

version and currentTerm are persisted as a part of Manifest.
GatewayMetaState now implements both ClusterStateApplier and
PersistedState interfaces. We started with two descendants
Zen1GatewayMetaState and Zen2GatewayMetaState, but it turned
out to be not easy to glue it.
GatewayMetaState now constructs previousClusterState (including
MetaData) and previousManifest inside the constructor so that all
PersistedState methods are usable as soon as GatewayMetaState
instance is constructed. Also, loadMetaData is renamed to
getMetaData, because it just returns
previousClusterState.metaData().
Sadly, we don't have access to localNode (obtained from 
TransportService in the constructor, so getLastAcceptedState
should be called, after setLocalNode method is invoked.
Currently, when deciding whether to write IndexMetaData to disk,
we're comparing current IndexMetaData version and received
IndexMetaData version. This is not safe in Zen2 if the term has changed.
So updateClusterState now accepts incremental write
method parameter. When it's set to false, we always write
IndexMetaData to disk.
Things that are not covered by GatewayMetaStateTests are covered
by GatewayMetaStatePersistedStateTests.
This commit also adds an option to use GatewayMetaState instead of
InMemoryPersistedState in TestZenDiscovery. However, by default
InMemoryPersistedState is used and only one test in PersistedStateIT
used GatewayMetaState. In order to use it for other tests, proper
state recovery should be implemented.
2018-11-27 15:04:52 +01:00
David Turner
a68a46450b
[Zen2] Add lag detector (#35685)
A publication can succeed and complete before all nodes have applied the
published state and acknowledged it, thanks to the publication timeout; however
we need every node eventually either to apply the published state (or a later
state) or be removed from the cluster. This change introduces the LagDetector
which achieves this liveness property by removing any lagging nodes from the
cluster.
2018-11-26 10:52:49 +00:00
Andrey Ershov
f47636b254
[Zen2] Introduce VotingTombstone class (#35832)
Today voting tombstones are stored in CoordinationMetaData as
Set<DiscoveryNode>.
DiscoveryNode is not a lightweight object and have a lot of fields.
It also has toXContent method, but no fromXContent method and the
output of toXContent is not enough to re-create DiscoveryNode
object.
And votingTombstone set should be persisted as a part of MetaData.
On the other hand, the only thing required from the tombstone is the
nodeId.
This PR adds VotingTombstone class for voting tombstones, which
consists of two fields for now - nodeId and nodeName. It could be
extended/shrank in the future if needed.
This PR also resolves TODO's related to the voting tombstones xcontent
story.
Example of CoordinationMetaData.toXContent with voting tombstones:

{
  "term": 1,
  "last_committed_config": [
    "fkwLdOBvXSlgRTBfgNAL",
    "tmQiPGHvUxXzPkkCDSJo",
    "HhOmtQBZAThpHIGWhxpz",
    "qZHWGpoDNPYRNIiqKsDl"
  ],
  "last_accepted_config": [
    "lhqacKmriwhHGFZcvqbx",
    "MYysmBuROkvJRlDcusyd"
  ],
  "voting_tombstones": [
    {
      "node_id": "McjbZbRkEz",
      "node_name": "pdKIWeNJUO"
    },
    {
      "node_id": "cpXkVibGwo",
      "node_name": "UnCvFgdVsc"
    },
    {
      "node_id": "EylRNOztbc",
      "node_name": "ohOhkbMWZX"
    }
  ]
}
2018-11-23 18:34:06 +01:00
David Turner
cfdf666672
[Zen2] Fix test failures in diff-based publishing (#35684)
`testIncompatibleDiffResendsFullState` sometimes makes a 2-node cluster and
then partitions one of the nodes from the leader, which makes the leader stand
down.  Then when the partition is removed the cluster re-forms but does so by
sending full cluster states, not diffs, causing the test to fail.

Additionally `testDiffBasedPublishing` sometimes fails if a publication is
delivered out-of-order, wiping out a fresher last-received cluster state with a
less-fresh one. This is fixed here by passing the received cluster state to the
coordinator before recording it as the last-received one, relying on the
coordinator's freshness checks.
2018-11-22 09:08:52 +00:00
Yannick Welsch
c816347253 Disable testClusterJoinDespiteOfPublishingIssues for Zen2
This test is failing sometimes with Zen2 due to the lack of lag detection.
Zen1 does not have this problem as it only considers a join as valid if the
corresponding cluster state update is successfully published and committed
on the joining node.
2018-11-21 17:21:44 +01:00
Andrey Ershov
a056bd8c1c
[Zen2] Move ClusterState fields to be persisted to ClusterState.MetaData (#35625)
Today we have a way to atomically persist global MetaData and
IndexMetaData to disk when new ClusterState is received. All other
ClusterState fields are not persisted.
However, there are other parts of ClusterState that should be
persisted, namely:

version
term
lastCommittedConfiguration
lastAcceptedConfiguration
votingTombstones

version is changed frequently, other fields are not. We decided
to group term, lastCommittedConfiguration,
lastAcceptedConfiguration and votingTombstones into
CoordinationMetaData class and make CoordinationMetaData a field
inside MetaData.
MetaData.toXContent and MetaData.fromXContent should take care of
CoordinationMetaData.
version stays as a top level field in ClusterState and will be
persisted as part of Manifest in a follow-up commit.
Also MetaData.isGlobalStateEquals should be extended to include
coordinationMetaData in comparison.

This commit favors exposing getters, such as getTerm directly in
ClusterState to avoid massive code changes.

An example of CoordinationMetaState.toXContent:

{
  "term": 1,
  "last_committed_config": [
    "TiIuBcbBtpuXyDDVHXeD",
    "ZIAoVbkjjLPLUuYLaTkw"
  ],
  "last_accepted_config": [
    "OwkXbXZNOZPJqccdFHdz",
    "LouzsGYwmQzpeQMrboZe",
    "fCKGRZdjLTqzXAqPUtGL",
    "pLoxshjpJXwDhbgjfYJy",
    "SjINLwFIlIEFZCbjrSFo",
    "MDkVncJEVyZLJktopWje"
  ]
}
2018-11-21 17:03:26 +01:00
Andrey Ershov
6ac0cb1842 Merge branch master into zen2
2 types of conflicts during the merge:
1) Line length fix
2) Classes no longer extend AbstractComponent
2018-11-21 15:36:49 +01:00
Yannick Welsch
8939a7894f
Zen2: Move disruption tests to Zen2 (#35724)
- Moves disruption tests to Zen2
- Registers a few missing settings
- Removes .put(TestZenDiscovery.USE_ZEN2.getKey(), true) from tests where Zen2 is now enabled
by default through the parent test class
- Moves QuorumGatewayIT back to Zen1, as it is not stable with Zen2 as it currently relies on
dangling indices due to the lack of proper CS persistence, which triggers secondary failures
2018-11-21 14:43:33 +01:00
Armin Braun
bdf632b6f9
SNAPSHOTING+MINOR: Simplify SnapshortShardService (#35769) 2018-11-21 13:50:17 +01:00
Martijn van Groningen
9b2ab064cf
[HLRC] Added support for CCR Resume Follow API (#35638)
This change also adds documentation for the Resume Follow API

Relates to #33824
2018-11-21 10:52:03 +01:00
Armin Braun
7a210342ab
TESTS: Remove Dead Code in Disruption Tests (#35768)
* Neither this class nor the constructor are used anywhere
2018-11-21 10:33:50 +01:00
Dimitrios Liappis
1cb578b435
[DOCS] Update sysctl instructions for Docker on Mac (#35755)
Recent Docker for Mac releases[1] have a different path to the tty for
accessing the console of the xhyve vm, required for altering the
`vm.max_map_count` sysctl.

Update instructions on how to enter the xhyve vm for altering the
`vm.max_map_count` sysctl setting on Docker for Mac.

Closes #34817 

[1]
https://forums.docker.com/t/is-it-possible-to-ssh-to-the-xhyve-machine/17426/13
2018-11-21 11:32:29 +02:00
Christoph Büscher
6638708b56
Remove deprecated QueryStringQueryBuilder#splitOnWhiteSpace (#35763)
This parameter has been deprecated and was ignored since 6.0, so its Java API
methods can be removed.
2018-11-21 10:29:08 +01:00
Christoph Büscher
5847f8379c
Move ScoreAccessor to test-framework (#35766)
This class is only used by RandomScoreFunctionIT and the MockScriptEngine, so it
shouldn't be part of the server codebase.
2018-11-21 10:28:31 +01:00
Luca Cavanna
65d8fdf3da
Clean up PutLicenseResponse (#35689)
This commit removes the parsing code from the PutLicenseResponse server variant, and the toXContent portion from the corresponding client variant.

Relates to #35547
2018-11-21 10:21:45 +01:00
Luca Cavanna
778550a97d
Clean up StartBasicResponse (#35688)
This commit removes the parsing code from the PostStartBasicResponse server variant. It also makes the server response implement StatusToXContent which allows us to save a couple of lines of code in the corredponding REST action.

Relates to #35547
2018-11-21 10:21:10 +01:00
Ed Savage
4f857c4f8d
[HLRC][ML] Add ML revert model snapshot API (#35750)
Relates to #29827
2018-11-21 09:10:37 +00:00
Alan Woodward
f6a43b5939
Add a prebuilt ICU Analyzer (#34958)
The ICU plugin provides the building blocks of an analysis chain, but doesn't actually have a prebuilt analyzer. It would be a better for users if there was a simple analyzer that they could use out of the box, and also something we can point to from the CJK Analyzer docs as a superior alternative.

Relates to #34285
2018-11-21 09:00:48 +00:00
Ioannis Kakavas
e8ec4fad7b
[DOCS] Adjust Invalidate Token REST API docs (#35622)
- Renames API to Invalidate Token
- Explicitly calls out the possibility to invalidate refresh tokens
via this API
2018-11-21 09:32:56 +02:00
Martijn van Groningen
a6647a20a9
[HLRC] Added support for CCR Unfollow API (#35693)
This change also adds documentation for the Unfollow API

Relates to #33824
2018-11-21 07:48:29 +01:00
Tim Vernum
30c5422561
Move XContent generation to HasPrivilegesResponse (#35616)
The RestHasPrivilegesAction previously handled its own XContent
generation. This change moves that into HasPrivilegesResponse and
makes the response implement ToXContent.

This allows HasPrivilegesResponseTests to be used to test
compatibility between HLRC and X-Pack internals.

A serialization bug (cluster privs) was also fixed here.
2018-11-21 14:33:10 +11:00
Christoph Büscher
ff03443ab9
Fix problem with MatchNoDocsQuery in disjunction queries (#35726)
Queries across multiple fields generate MatchNoDocsQuerys for fields that are
unmapped. In certain situation this can lead to erroneous behaviour,
for example when an umapped field is used in a query_string query across
several fields. If some of the tokens in the query string get eliminated by an
analyzer on the mapped fields, the same token will currently generate
MatchNoDocsQuerys combined into a disjunction, which in turn
leads to no matches in the overall query. Instead we should simply not add
MatchNoDocsQuerys to those disjunctions.

Closes #34708
2018-11-21 03:49:49 +01:00
lcawl
fa6d8a5004 [DOCS] Fixes navtitle for CCR API 2018-11-20 17:31:01 -08:00
Marios Trivyzas
b1818dbdce
SQL: [docs] Add documentation for COALESCE (#35740)
Follows: #35253
2018-11-21 01:43:05 +01:00
Lisa Cawley
04a087fe2f
[DOCS] Adds authorization info for CCR APIs (#35557) 2018-11-20 16:23:01 -08:00
Benjamin Trent
84db1e42c0
HLRC: ML Get Calendar Events (#35747)
* HLRC: ML Get Calendar Events

* Addressing PR comments
2018-11-20 16:40:31 -06:00
Mayya Sharipova
643bb20137
Add a new query type - ScriptScoreQuery (#34533)
* Add a new query type - ScriptScoreQuery

script_score query uses script to calculate document scores.
Added as a substitute for function_score with an intentation
to deprecate function_scoreq query.

```http
GET /_search
{
    "query": {
        "script_score" : {
            "query": {
                "match": { "message": "elasticsearch" }
            },
            "script" : {
              "source": "Math.log(2 + doc['likes'].value)"
            },
            "min_score" : 2
        }
    }
}
```

Add several functions to painless to be used inside script_score:

double rational(double, double)
double sigmoid(double, double, double)
double randomNotReproducible() 
double randomReproducible(String, int) 

double decayGeoLinear(String, String, String, double, GeoPoint)
double decayGeoExp(String, String, String, double, GeoPoint)
double decayGeoGauss(String, String, String, double, GeoPoint)

double decayNumericLinear(String, String, String, double, double)
double decayNumericExp(String, String, String, double, double)
double decayNumericGauss(String, String, String, double, double)

double decayDateLinear(String, String, String, double, JodaCompatibleZonedDateTime)
double decayDateExp(String, String, String, double, JodaCompatibleZonedDateTime)
double decayDateGauss(String, String, String, double, JodaCompatibleZonedDateTime)

Date functions only works on dates in  the default format and default time zone
2018-11-20 16:10:06 -05:00
Ryan Ernst
c07ad67718 Add 6.5.2 version 2018-11-20 12:33:26 -08:00
Gordon Brown
4bda469861
Fix line lengths in misc other files (#35650) 2018-11-20 12:29:48 -07:00
Vladimir Dolzhenko
953c8586df
Fix PrimaryAllocationIT#testForceStaleReplicaToBePromotedToPrimary (#35728)
Closes 35497
2018-11-20 20:05:23 +01:00
Alpar Torok
328448e8cd
Implement a property to allow for splitting tests. (#35473) 2018-11-20 20:35:53 +02:00
Gordon Brown
17780ce07e
Add HLRC docs for Delete Lifecycle Policy (#35664)
Adds documenatation for the Delete Lifecycle Policy API to the HLRC
documentation.
2018-11-20 11:32:41 -07:00
Armin Braun
89cf4a7397
NETWORKING: Fix IpFiltering Test (#35730)
* The port assigned to all loopback interfaces doesn't necessarily have to be the same  for ipv4 and ipv6
=> use actual address from profile instead of just port + loopback in test
* Closes #35584
2018-11-20 17:50:08 +01:00
Christoph Büscher
e91f404d16
Remove remains of 'auto_generate_phrase_queries' (#35735)
This parameter in the `query_string` query was deprecated in 6.0 and ignored
since then. Its API methods and remaining uses can be removed in the upcoming
major version.

Relates to #35734
2018-11-20 16:07:17 +01:00
Tal Levy
d061b3999a
[ILM] HLRC-ILM Retry Lifecycle Policy docs (#35715)
this adds documentation for the retry method in the
high-level-ilm-rest-client.

this PR also renames retryLifecycleStep to retryLifecyclePolicy in the index-lifecycle-client
2018-11-20 07:05:27 -08:00
Armin Braun
33c713ba60
TESTS: More Logging in LongGcDisruptionTests (#35702)
* The existing logging is not helpful enough to track down which threads hang, we need the hanging thread's stacktraces too
* Relates #35686
2018-11-20 15:36:01 +01:00