1745 Commits

Author SHA1 Message Date
lipsill
185c06bb7f Logging: tests: clean up logging (#34606)
Replace internal deprecated calls to `Loggers.getLogger(Class)`
with direct calls to log4j `LogManager.getLogger(Class)`
2018-10-25 09:52:41 -04:00
Alpar Torok
59536966c2
Add a new "contains" feature (#34738)
The contains syntax was added in #30874 but the skips were not properly
put in place.
The java runner has the feature so the tests will run as part of the
build, but language clients will be able to support it at their own
pace.
2018-10-25 08:50:50 +03:00
Ryan Ernst
687dc1eb11
Scripting: Remove SearchScript (#34730)
This commit removes the last non context based script class.
2018-10-24 15:03:38 -07:00
Luca Cavanna
d51bc05dce
[TEST] Improve validation of do sections (#34734)
We throw parsing exception when an unknown array is found, but we don't when an unknown top-level field is found. This commit makes sure that unsupported top-level fields are not ignored in a do section.

Closes #34651
2018-10-24 21:27:07 +02:00
lipsill
d5ad3de42e [test] Introduce strict deprecation mode for REST tests (#34338)
#33708 introduced a strict deprecation mode that makes a REST request
fail if there is a warning header in the response returned by
Elasticsearch (usually a deprecation message signaling that a feature
or a field has been deprecated).

This change adds the strict deprecation mode into the REST integration
tests, and makes the tests fail if a deprecated feature is used. Also
any test using a deprecated feature has been modified to pass the build.

The YAML integration tests already analyzed HTTP warnings so they do
not use this mode, keeping their "expected vs actual" behavior.
2018-10-24 08:21:24 -04:00
Nhat Nguyen
52266d8b11
TEST: Clone replicas list when compute replication targets (#34728)
In #34407, we supposed to clone the list of replicas of ReplicationGroup
when computing replication targets, but somehow we missed it. If we
don't clone the list, a WriteReplicationAction may use an old
ReplicationTargets which consists replicas which are removed from the
current list of replicas

Relates #34407
Closes #33457
2018-10-23 21:08:34 -04:00
Zachary Tong
299d044bfc
Collapse pipeline aggs into single package (#34658)
- Restrict visibility of Aggregators and Factories
- Move PipelineAggregatorBuilders up a level so it is consistent with
AggregatorBuilders
- Checkstyle line length fixes for a few classes
- Minor odds/ends (swapping to method references, formatting, etc)
2018-10-23 16:01:01 -04:00
Tal Levy
62ac2fa5ec Merge remote-tracking branch 'upstream/master' into index-lifecycle 2018-10-23 09:43:46 -07:00
Zachary Tong
4dbf498721
[Rollup] Job deletion should be invoked on the allocated task (#34574)
We should delete a job by directly talking to the allocated 
task and telling it to shutdown. Today we shut down a job 
via the persistent task framework. This is not ideal because, 
while the job has been removed from the persistent task 
CS, the allocated task continues to live until it gets the 
shutdown message.

This means a user can delete a job, immediately delete 
the rollup index, and then see new documents appear in
 the just-deleted index. This happens because the indexer
 in the allocated task is still running and indexes a few 
more documents before getting the shutdown command.

In this PR, the transport action is changed to a TransportTasksAction, 
and we invoke onCancelled() directly on the matching job. 
The race condition still exists after this PR (albeit less likely), 
but this was a precursor to fixing the issue and a self-contained
chunk of code. A second PR will followup to fix the race itself.
2018-10-23 12:23:22 -04:00
Tal Levy
67bfdb16ad Merge remote-tracking branch 'upstream/master' into index-lifecycle 2018-10-22 13:09:37 -07:00
Yannick Welsch
6d6ac74a08
Zen2: Fail fast on disconnects (#34503)
Integrates the failure detectors with the Connection lifecycle, to fail nodes as soon as:
- a leader detects one of his followers disconnecting.
- a follower detects its leader disconnecting.
2018-10-22 17:20:12 +02:00
Jason Tedor
243335e2ba
Allow set section in setup section of REST tests (#34678)
This commit enables using a set section in the setup section of REST
tests.
2018-10-22 11:14:27 -04:00
Jason Tedor
7af19b8f81
Migrate wait for pending tasks helper to server (#34675)
In some of our X-Pack REST tests we have to wait for pending tasks to
complete. We are now needing this functionality in ESRestTestCase for
the docs tests where we run against X-Pack features. This commit moves
the helper method that we have in X-Pack to ESRestTestCase, and removes
duplicate logic from waiting for rollup tasks to complete.
2018-10-22 11:14:02 -04:00
Ryan Ernst
222652dfce
Scripting: Convert script fields to use script context (#34164)
This commit removes the use of SearchScript for script fields and adds
a new FieldScript.
2018-10-20 16:33:49 -07:00
David Turner
bfd24fc030
[Zen2] Reconfigure cluster as its membership changes (#34592)
As master-eligible nodes join or leave the cluster we should give them votes or
take them away, in order to maintain the optimal level of fault-tolerance in
the system. #33924 introduced the `Reconfigurator` to calculate the optimal
configuration of the cluster, and in this change we add the plumbing needed to
actually perform the reconfigurations needed as the cluster grows or shrinks.
2018-10-19 19:24:54 +01:00
Nhat Nguyen
bd92a28cfc
CCR: Replicate existing ops with old term on follower (#34412)
Since #34288, we might hit deadlock if the FollowTask has more fetchers
than writers. This can happen in the following scenario:

Suppose the leader has two operations [seq#0, seq#1]; the FollowTask has
two fetchers and one writer.

1. The FollowTask issues two concurrent fetch requests: {from_seq_no: 0,
num_ops:1} and {from_seq_no: 1, num_ops:1} to read seq#0 and seq#1
respectively.

2. The second request which fetches seq#1 completes before, and then it
triggers a write request containing only seq#1.

3. The primary of a follower fails after it has replicated seq#1 to
replicas.

4. Since the old primary did not respond, the FollowTask issues another
write request containing seq#1 (resend the previous write request).

5. The new primary has seq#1 already; thus it won't replicate seq#1 to
replicas but will wait for the global checkpoint to advance at least
seq#1.

The problem is that the FollowTask has only one writer and that writer
is waiting for seq#0 which won't be delivered until the writer completed.

This PR proposes to replicate existing operations with the old primary
term (instead of the current term) on the follower. In particular, when
the following primary detects that it has processed an process already,
it will look up the term of an existing operation with the same seq_no
in the Lucene index, then rewrite that operation with the old term
before replicating it to the following replicas. This approach is
wait-free but requires soft-deletes on the follower.

Relates #34288
2018-10-19 13:56:00 -04:00
David Turner
3de266e3cf Merge branch 'master' into zen2 2018-10-19 14:30:07 +01:00
Colin Goodheart-Smithe
84ef91529c
Merge branch 'master' into index-lifecycle 2018-10-19 13:24:04 +01:00
Daniel Mitterdorfer
dbb6fe58fa
Remove hand-coded XContent duplicate checks
With this commit we cleanup hand-coded duplicate checks in XContent
parsing. They were necessary previously but since we reconfigured the
underlying parser in #22073 and #22225, these checks are obsolete and
were also ineffective unless an undocumented system property has been
set. As we also remove this escape hatch, we can remove the additional
checks as well.

Closes #22253
Relates #34588
2018-10-19 10:13:13 +02:00
Tal Levy
09067c8942 Merge remote-tracking branch 'upstream/master' into index-lifecycle 2018-10-17 15:37:11 -07:00
Nhat Nguyen
eb36f10394
TEST: Capture replication targets when replication group ready (#34407)
Today, WriteReplicationAction uses a set of replication targets directly
from the primary shard of ReplicationGroup. It should be fine except
when we add/remove or promote a shard while a write action is executing.
We have encountered these two issues:

1. Replicas are not found in the replication targets. This happens
because we remove replicas but the WriteReplicationAction still uses the
old replication targets which include the removed replicas.

2. Access ReplicationGroup from a primary shard which hasn't activated
the primary-mode yet. This is because we won't activate the primary-mode
for a promoting shard after bumping the primary term which is executed
asynchronously.

This commit captures the replication targets when the replication group
is ready and continue using those targets until we re-compute the new
targets after the group is changed.

Closes #33457
2018-10-17 17:37:52 -04:00
Armin Braun
08d4bf6e84
TESTS: Remove Dead Code in Test Infra. (#34548)
* None of this infrastructure is used
* Some redundant throws and resulting catch code removed
2018-10-17 20:08:39 +01:00
Colin Goodheart-Smithe
90f7cec7a5
Merge branch 'master' into index-lifecycle 2018-10-17 18:22:23 +01:00
Nik Everett
139bbc3f03
Rollup: Consolidate rollup cleanup for http tests (#34342)
This moves the rollup cleanup code for http tests from the high level rest
client into the test framework and then entirely removes the rollup cleanup
code for http tests that lived in x-pack. This is nice because it
consolidates the cleanup into one spot, automatically invokes the cleanup
without the test having to know that it is "about rollup", and should allow
us to run the rollup docs tests.

Part of #34530
2018-10-17 09:32:16 -04:00
Andrey Ershov
93bb24e1f8 Merge branch 'master' into zen2 2018-10-17 14:37:53 +02:00
Armin Braun
3954d041a0
SCRIPTING: Move sort Context to its Own Class (#33717)
* SCRIPTING: Move sort Context to its own Class
2018-10-17 10:02:44 +01:00
Tal Levy
fbe8dc014c Merge branch 'master' into index-lifecycle 2018-10-16 13:58:53 -07:00
Armin Braun
ea576a8ca2
Disc: Move AbstractDisruptionTC to filebased D. (#34461)
* Discovery: Move AbstractDisruptionTestCase to file-based discovery.
* Relates #33675
* Simplify away ClusterDiscoveryConfiguration
2018-10-16 15:28:40 +01:00
David Turner
950ca3adda Merge branch 'master' into zen2 2018-10-16 14:41:14 +01:00
Simon Willnauer
d43a1fac33
Lock down Engine.Searcher (#34363)
`Engine.Searcher` is non-final today which makes it error prone
in the case of wrapping the underlying reader or lucene `IndexSearcher`
like we do in `IndexSearcherWrapper`. Yet, there is no subclass of it yet
that would be dramatic to just drop on the floor. With the start of development
of frozen indices this changed since in #34357 functionality was added to
a subclass which would be dropped if a `IndexSearcherWrapper` is installed on an index.
This change locks down the `Engine.Searcher` to prevent such a functionality trap.
2018-10-16 14:53:07 +02:00
Martijn van Groningen
a1ec91395c
Changed CCR internal integration tests to use a leader and follower cluster instead of a single cluster (#34344)
The `AutoFollowTests` needs to restart the clusters between each tests, because
it is using auto follow stats in assertions. Auto follow stats are only reset
by stopping the elected master node.

Extracted the `testGetOperationsBasedOnGlobalSequenceId()` test to its own test, because it just tests the shard changes api.

* Renamed AutoFollowTests to AutoFollowIT, because it is an integration test.
Renamed ShardChangesIT to IndexFollowingIT, because shard changes it the name
of an internal api and isn't a good name for an integration test.

* move creation of NodeConfigurationSource to a seperate method

* Fixes issues after merge, moved assertSeqNos() and assertSameDocIdsOnShards() methods from ESIntegTestCase to InternalTestCluster, so that ccr tests can use these methods too.
2018-10-16 14:45:46 +02:00
Jim Ferenczi
544de13d8e
Disallow negative query boost (#34486)
This change disallows negative query boosts. Negative scores are not allowed in Lucene 8 so
it is easier to just disallow negative boosts entirely. We should also deprecate negative boosts
in 6x in order to ensure that users are aware when they'll upgrade to ES 7.

Relates #33309
2018-10-16 11:31:53 +01:00
Armin Braun
ebca27371c
SCRIPTING: Move Aggregation Script Context to its own class (#33820)
* SCRIPTING: Move Aggregation Script Context to its own class
2018-10-15 17:28:05 +01:00
Colin Goodheart-Smithe
0b42eda0e3
Merge branch 'master' into index-lifecycle 2018-10-15 16:03:37 +01:00
Andrey Ershov
e3a1981a57 Mute testToQuery test 2018-10-15 14:08:04 +02:00
Yannick Welsch
5fbead00a3
Zen2: Add infrastructure for integration tests (#34365)
Adds the infrastructure to run integration tests against Zen2.
2018-10-14 20:55:04 +01:00
Nhat Nguyen
33791ac27c
CCR: Following primary should process operations once (#34288)
Today we rewrite the operations from the leader with the term of the
following primary because the follower should own its history. The
problem is that a newly promoted primary may re-assign its term to
operations which were replicated to replicas before by the previous
primary. If this happens, some operations with the same seq_no may be
assigned different terms. This is not good for the future optimistic
locking using a combination of seqno and term.

This change ensures that the primary of a follower only processes an
operation if that operation was not processed before. The skipped
operations are guaranteed to be delivered to replicas via either
primary-replica resync or peer-recovery. However, the primary must not
acknowledge until the global checkpoint is at least the highest seqno of
all skipped ops (i.e., they all have been processed on every replica).

Relates #31751
Relates #31113
2018-10-10 15:39:57 -04:00
Nik Everett
06993e0c35
Logging: Make ESLoggerFactory package private (#34199)
Since all calls to `ESLoggerFactory` outside of the logging package were
deprecated, it seemed like it'd simplify things to migrate all of the
deprecated calls and declare `ESLoggerFactory` to be package private.
This does that.
2018-10-06 09:54:08 -04:00
David Turner
c6b0f08472
Add safety phase to CoordinatorTests (#34241)
Today's CoordinatorTests have a limited amount of randomisation in how things
are scheduled. However, to be fully confident in Zen2's liveness we require the
system to stabilise after any permitted sequence of events. We can achieve
this by running the system in a much more random fashion for a while, with much
larger variation in when things are scheduled (simulating GC pressure and
network disruption) and then continuing to assert that the system stabilises as
we expect. When running randomly, we do not expect to make significant progress
and merely verify that no safety property is violated.

This change introduces the runRandomly() test method which implements this
idea. It also fixes a handful of liveness bugs that this first version of
runRandomly() exposed.
2018-10-04 07:40:26 +01:00
Kazuhiro Sera
d45fe43a68 Fix a variety of typos and misspelled words (#32792) 2018-10-03 18:11:38 +01:00
David Turner
a9eae1d068 Merge branch 'master' into zen2 2018-10-03 08:36:34 +01:00
Gordon Brown
fb907706ec Merge branch 'master' into index-lifecycle 2018-10-02 13:43:46 -06:00
Nik Everett
f904c41506
HLRC: Add get rollup job (#33921)
Adds support for the get rollup job to the High Level REST Client. I had
to do three interesting and unexpected things:
1. I ported the rollup state wiping code into the high level client
tests. I'll move this into the test framework in a followup and remove
the x-pack version.
2. The `timeout` in the rollup config was serialized using the
`toString` representation of `TimeValue` which produces fractional time
values which are more human readable but aren't supported by parsing. So
I switched it to `getStringRep`.
3. Refactor the xcontent round trip testing utilities so we can test
parsing of classes that don't implements `ToXContent`.
2018-10-02 09:11:29 -04:00
David Turner
a127805b4a
[Zen2] Simulate scheduling delays (#34181)
Today we schedule tasks (both immediate and future ones) exactly when
requested. In fact it is more realistic to allow for a small amount of delay in
the scheduling of tasks, and this helps to exercise more interleavings of
actions and therefore to improve test coverage.

This change adds to the DeterministicTaskQueue the ability to add a random
delay to the scheduling of tasks.

This change also provides more explicit timeouts for stabilisation in the
CoordinatorTests.

Using the randomised scheduling feature in the CoordinatorTests also found a
situation in which we could become a leader, then a candidate, and then a
leader again very quickly, causing a clash of the _BECOME_MASTER_ and
_FINISH_ELECTION_ tasks. We change their behaviour to not consider these
duplicates to be problematic.
2018-10-02 11:22:05 +01:00
Lee Hinman
2d9cb21490 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-10-01 14:10:09 -06:00
Nhat Nguyen
ad61398879
CCR: Optimize indexing ops using seq_no on followers (#34099)
This change introduces the indexing optimization using sequence numbers
in the FollowingEngine. This optimization uses the max_seq_no_updates
which is tracked on the primary of the leader and replicated to replicas
and followers.

Relates #33656
2018-09-28 20:42:26 -04:00
Ryan Ernst
47cbae9b26
Scripting: Remove ExecutableScript (#34154)
This commit removes the legacy ExecutableScript, which was no longer
used except in tests. All uses have previously been converted to script
contexts.
2018-09-28 17:13:08 -07:00
Lee Hinman
6ea396a476 Merge remote-tracking branch 'origin/master' into index-lifecycle 2018-09-28 15:40:12 -06:00
Hendrik Muhs
e2f310b56c
Fix AggregationFactories.Builder equality and hash regarding order (#34005)
Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization
and deserialization. This ensures storing configs with aggregation works properly.

This also addresses a potential issue in caching when the same query contains aggregations but in 
different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in 
the cache.
2018-09-28 13:30:50 +02:00
Alan Woodward
f243d75f59
Remove special-casing of Synonym filters in AnalysisRegistry (#34034)
The synonym filters no longer need access to the AnalysisRegistry in their
constructors, so we can remove the special-case code and move them to the
common analysis module.

This commit means that synonyms are no longer available for `server` integration tests,
so several of these are either rewritten or migrated to the common analysis module
as rest-spec-api tests
2018-09-28 09:02:47 +01:00