Commit Graph

1655 Commits

Author SHA1 Message Date
Andrey Ershov 93bb24e1f8 Merge branch 'master' into zen2 2018-10-17 14:37:53 +02:00
Armin Braun 3954d041a0
SCRIPTING: Move sort Context to its Own Class (#33717)
* SCRIPTING: Move sort Context to its own Class
2018-10-17 10:02:44 +01:00
Armin Braun ea576a8ca2
Disc: Move AbstractDisruptionTC to filebased D. (#34461)
* Discovery: Move AbstractDisruptionTestCase to file-based discovery.
* Relates #33675
* Simplify away ClusterDiscoveryConfiguration
2018-10-16 15:28:40 +01:00
David Turner 950ca3adda Merge branch 'master' into zen2 2018-10-16 14:41:14 +01:00
Simon Willnauer d43a1fac33
Lock down Engine.Searcher (#34363)
`Engine.Searcher` is non-final today which makes it error prone
in the case of wrapping the underlying reader or lucene `IndexSearcher`
like we do in `IndexSearcherWrapper`. Yet, there is no subclass of it yet
that would be dramatic to just drop on the floor. With the start of development
of frozen indices this changed since in #34357 functionality was added to
a subclass which would be dropped if a `IndexSearcherWrapper` is installed on an index.
This change locks down the `Engine.Searcher` to prevent such a functionality trap.
2018-10-16 14:53:07 +02:00
Martijn van Groningen a1ec91395c
Changed CCR internal integration tests to use a leader and follower cluster instead of a single cluster (#34344)
The `AutoFollowTests` needs to restart the clusters between each tests, because
it is using auto follow stats in assertions. Auto follow stats are only reset
by stopping the elected master node.

Extracted the `testGetOperationsBasedOnGlobalSequenceId()` test to its own test, because it just tests the shard changes api.

* Renamed AutoFollowTests to AutoFollowIT, because it is an integration test.
Renamed ShardChangesIT to IndexFollowingIT, because shard changes it the name
of an internal api and isn't a good name for an integration test.

* move creation of NodeConfigurationSource to a seperate method

* Fixes issues after merge, moved assertSeqNos() and assertSameDocIdsOnShards() methods from ESIntegTestCase to InternalTestCluster, so that ccr tests can use these methods too.
2018-10-16 14:45:46 +02:00
Jim Ferenczi 544de13d8e
Disallow negative query boost (#34486)
This change disallows negative query boosts. Negative scores are not allowed in Lucene 8 so
it is easier to just disallow negative boosts entirely. We should also deprecate negative boosts
in 6x in order to ensure that users are aware when they'll upgrade to ES 7.

Relates #33309
2018-10-16 11:31:53 +01:00
Armin Braun ebca27371c
SCRIPTING: Move Aggregation Script Context to its own class (#33820)
* SCRIPTING: Move Aggregation Script Context to its own class
2018-10-15 17:28:05 +01:00
Andrey Ershov e3a1981a57 Mute testToQuery test 2018-10-15 14:08:04 +02:00
Yannick Welsch 5fbead00a3
Zen2: Add infrastructure for integration tests (#34365)
Adds the infrastructure to run integration tests against Zen2.
2018-10-14 20:55:04 +01:00
Nhat Nguyen 33791ac27c
CCR: Following primary should process operations once (#34288)
Today we rewrite the operations from the leader with the term of the
following primary because the follower should own its history. The
problem is that a newly promoted primary may re-assign its term to
operations which were replicated to replicas before by the previous
primary. If this happens, some operations with the same seq_no may be
assigned different terms. This is not good for the future optimistic
locking using a combination of seqno and term.

This change ensures that the primary of a follower only processes an
operation if that operation was not processed before. The skipped
operations are guaranteed to be delivered to replicas via either
primary-replica resync or peer-recovery. However, the primary must not
acknowledge until the global checkpoint is at least the highest seqno of
all skipped ops (i.e., they all have been processed on every replica).

Relates #31751
Relates #31113
2018-10-10 15:39:57 -04:00
Nik Everett 06993e0c35
Logging: Make ESLoggerFactory package private (#34199)
Since all calls to `ESLoggerFactory` outside of the logging package were
deprecated, it seemed like it'd simplify things to migrate all of the
deprecated calls and declare `ESLoggerFactory` to be package private.
This does that.
2018-10-06 09:54:08 -04:00
David Turner c6b0f08472
Add safety phase to CoordinatorTests (#34241)
Today's CoordinatorTests have a limited amount of randomisation in how things
are scheduled. However, to be fully confident in Zen2's liveness we require the
system to stabilise after any permitted sequence of events. We can achieve
this by running the system in a much more random fashion for a while, with much
larger variation in when things are scheduled (simulating GC pressure and
network disruption) and then continuing to assert that the system stabilises as
we expect. When running randomly, we do not expect to make significant progress
and merely verify that no safety property is violated.

This change introduces the runRandomly() test method which implements this
idea. It also fixes a handful of liveness bugs that this first version of
runRandomly() exposed.
2018-10-04 07:40:26 +01:00
Kazuhiro Sera d45fe43a68 Fix a variety of typos and misspelled words (#32792) 2018-10-03 18:11:38 +01:00
David Turner a9eae1d068 Merge branch 'master' into zen2 2018-10-03 08:36:34 +01:00
Nik Everett f904c41506
HLRC: Add get rollup job (#33921)
Adds support for the get rollup job to the High Level REST Client. I had
to do three interesting and unexpected things:
1. I ported the rollup state wiping code into the high level client
tests. I'll move this into the test framework in a followup and remove
the x-pack version.
2. The `timeout` in the rollup config was serialized using the
`toString` representation of `TimeValue` which produces fractional time
values which are more human readable but aren't supported by parsing. So
I switched it to `getStringRep`.
3. Refactor the xcontent round trip testing utilities so we can test
parsing of classes that don't implements `ToXContent`.
2018-10-02 09:11:29 -04:00
David Turner a127805b4a
[Zen2] Simulate scheduling delays (#34181)
Today we schedule tasks (both immediate and future ones) exactly when
requested. In fact it is more realistic to allow for a small amount of delay in
the scheduling of tasks, and this helps to exercise more interleavings of
actions and therefore to improve test coverage.

This change adds to the DeterministicTaskQueue the ability to add a random
delay to the scheduling of tasks.

This change also provides more explicit timeouts for stabilisation in the
CoordinatorTests.

Using the randomised scheduling feature in the CoordinatorTests also found a
situation in which we could become a leader, then a candidate, and then a
leader again very quickly, causing a clash of the _BECOME_MASTER_ and
_FINISH_ELECTION_ tasks. We change their behaviour to not consider these
duplicates to be problematic.
2018-10-02 11:22:05 +01:00
Nhat Nguyen ad61398879
CCR: Optimize indexing ops using seq_no on followers (#34099)
This change introduces the indexing optimization using sequence numbers
in the FollowingEngine. This optimization uses the max_seq_no_updates
which is tracked on the primary of the leader and replicated to replicas
and followers.

Relates #33656
2018-09-28 20:42:26 -04:00
Ryan Ernst 47cbae9b26
Scripting: Remove ExecutableScript (#34154)
This commit removes the legacy ExecutableScript, which was no longer
used except in tests. All uses have previously been converted to script
contexts.
2018-09-28 17:13:08 -07:00
Hendrik Muhs e2f310b56c
Fix AggregationFactories.Builder equality and hash regarding order (#34005)
Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization
and deserialization. This ensures storing configs with aggregation works properly.

This also addresses a potential issue in caching when the same query contains aggregations but in 
different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in 
the cache.
2018-09-28 13:30:50 +02:00
Alan Woodward f243d75f59
Remove special-casing of Synonym filters in AnalysisRegistry (#34034)
The synonym filters no longer need access to the AnalysisRegistry in their
constructors, so we can remove the special-case code and move them to the
common analysis module.

This commit means that synonyms are no longer available for `server` integration tests,
so several of these are either rewritten or migrated to the common analysis module
as rest-spec-api tests
2018-09-28 09:02:47 +01:00
Ryan Ernst a2c941806b
Tests: Add support for custom contexts to mock scripts (#34100)
This commit adds the ability to plug in compilation of custom contexts
in mock script engine. This is needed for testing plugins which add
custom contexts like watcher.
2018-09-27 12:23:59 -07:00
Jim Ferenczi 269ae0bc15
Handle MatchNoDocsQuery in span query wrappers (#34106)
* Handle MatchNoDocsQuery in span query wrappers

This change adds a new SpanMatchNoDocsQuery query that replaces
MatchNoDocsQuery in the span query wrappers.
The `wildcard` query now returns MatchNoDocsQuery if the target field is not
in the mapping (#34093) so we need the equivalent span query in order to
be able to pass it to other span wrappers.

Closes #34105
2018-09-27 14:19:08 +02:00
Simon Willnauer bda7bc145b
Fold EngineSearcher into Engine.Searcher (#34082)
EngineSearcher can be easily folded into Engine.Searcher which removes
a level of inheritance that is necessary for most of it's subclasses.
This change folds it into Engine.Searcher and removes the dependency on
ReferenceManager.
2018-09-27 09:06:04 +02:00
Yogesh Gaikwad 0301062c6e Mute SpanMultiTermQueryBuilderTests#testToQuery 2018-09-27 15:26:06 +10:00
Nik Everett ddce9704d4
Logging: Drop two deprecated methods (#34055)
This drops two deprecated methods from `ESLoggerFactory`, switching all
calls to those methods to calls to methods of the same name on
`LogManager`.
2018-09-26 11:20:52 -04:00
Adrien Grand 3c2841d493
REST test for typeless APIs. (#33934)
This commit duplicates REST tests for the
 - `indices.create`
 - `indices.put_mapping`
 - `indices.get_mapping`
 - `index`
 - `get`
 - `delete`
 - `update`
 - `bulk`
APIs, so that we both test them when used without types (include_type_name=false)
and with types, mostly for mixed-version cluster tests.

Given a suite called `X_test_name.yml`, I first copied it to
`(X+1)_test_name_with_types.yml` and then changed `X_test_name.yml` to set
`include_type_name=false` on every API that supports it.

Relates #15613
2018-09-26 17:11:37 +02:00
Ryan Ernst 7800b4fa91
Core: Abstract DateMathParser in an interface (#33905)
This commits creates a DateMathParser interface, which is already
implemented for both joda and java time. While currently the java time
DateMathParser is not used, this change will allow a followup which will
create a DateMathParser from a DateFormatter, so the caller does not
need to know the internals of the DateFormatter they have.
2018-09-26 07:56:25 -07:00
Zachary Tong 25d74bd0cb
Prefer mapped aggs to lead reductions (#33528)
Previously, unmapped aggs try to delegate reduction to a sibling agg that is
mapped.  That delegated agg will run the reductions, and also
reduce any pipeline aggs.  But because delegation comes before running
pipelines, the unmapped agg _also_ tries to run pipeline aggs.

This causes the pipeline to run twice, and potentially double it's output
in buckets which can create invalid JSON (e.g. same key multiple times)
and break when converting to maps.

This fixes by sorting the list of aggregations ahead of time so that mapped
aggs appear first, meaning they preferentially lead the reduction.  If all aggs
are unmapped, the first unmapped agg simply creates a new unmapped object
and returns that for the reduction.

This means that unmapped aggs no longer defer and there is no chance for 
a secondary execution of pipelines (or other side effects caused by deferring
execution).

Closes #33514
2018-09-26 10:09:31 -04:00
Christoph Büscher ba3ceeaccf
Clean up "unused variable" warnings (#31876)
This change cleans up "unused variable" warnings. There are several cases were we 
most likely want to suppress the warnings (especially in the client documentation test
where the snippets contain many unused variables). In a lot of cases the unused
variables can just be deleted though.
2018-09-26 14:09:32 +02:00
David Turner d995fc85c6
Integrate LeaderChecker with Coordinator (#34049)
This change ensures that follower nodes periodically check that their leader is
healthy, and that they elect a new leader if not.
2018-09-26 12:18:13 +01:00
Ryan Ernst be8475955e
Scripting: Use ParameterMap for deprecated ctx var in update scripts (#34065)
This commit removes the sysprop controlling whether ctx is in params for
update scripts and replaces it with use of the new ParameterMap, which
outputs a deprecation warning whenever params.ctx is used.
2018-09-25 22:08:02 -07:00
Nhat Nguyen 5166dd0a4c
Replicate max seq_no of updates to replicas (#33967)
We start tracking max seq_no_of_updates on the primary in #33842. This
commit replicates that value from a primary to its replicas in replication 
requests or the translog phase of peer-recovery.

With this change, we guarantee that the value of max seq_no_of_updates
on a replica when any index/delete operation is performed at least the
max_seq_no_of_updates on the primary when that operation was executed.

Relates #33656
2018-09-25 08:07:57 -04:00
David Turner 1d47c9582b
Fix CoordinatorTests (#34002)
Today the CoordinatorTests are not very reliable if two elections are scheduled
concurrently. Although we expect occasional failures due to this, in fact the
failures are much more common than expected due to a handful of issues. This PR
fixes these issues.
2018-09-25 08:43:47 +01:00
Tim Brooks 78e483e8d8
Introduce abstract security transport testcase (#33878)
This commit introduces an AbstractSimpleSecurityTransportTestCase for
security transports. This classes provides transport tests that are
specific for security transports. Additionally, it fixes the tests referenced in
#33285.
2018-09-24 09:44:44 -06:00
Nhat Nguyen 7944a0cb25
Track max seq_no of updates or deletes on primary (#33842)
This PR is the first step to use seq_no to optimize indexing operations.
The idea is to track the max seq_no of either update or delete ops on a
primary, and transfer this information to replicas, and replicas use it
to optimize indexing plan for index operations (with assigned seq_no).

The max_seq_no_of_updates on primary is initialized once when a primary
finishes its local recovery or peer recovery in relocation or being
promoted. After that, the max_seq_no_of_updates is only advanced internally
inside an engine when processing update or delete operations.

Relates #33656
2018-09-22 08:02:57 -04:00
Vladimir Dolzhenko 477391d751
Don't test corruption detection within CFS checksum (#33911)
Closes #33881
2018-09-22 10:21:36 +02:00
Yannick Welsch a612dd1272
Zen2: Add node id to log output of CoordinatorTests (#33929)
With recent changes to the logging framework, the node name can no longer be injected into the logging output using the node.name setting, which means that for the CoordinatorTests (which are simulating a cluster in a fully deterministic fashion using a single thread), as all the different nodes are running under the same test thread, we are not able to distinguish which log lines are coming from which node. This commit readds logging for node ids in the CoordinatorTests, making two very small changes to DeterministicTaskQueue and TestThreadInfoPatternConverter.
2018-09-21 18:40:12 +02:00
lipsill b48d5a8942 [TEST] ClientYamlSuiteRestApiParser to parse spec without path parts (#33720)
Previously ClientYamlSuiteRestApiParser threw an exception when an api
spec contained neither path parts nor url parameter sections.

Closes #31649
2018-09-21 17:26:55 +02:00
Alexander Reelsen 1de2a925ce
Watcher: Ensure that execution triggers properly on initial setup (#33360)
This commit reverts most of #33157 as it introduces another race
condition and breaks a common case of watcher, when the first watch is
added to the system and the index does not exist yet.

This means, that the index will be created, which triggers a reload, but
during this time the put watch operation that triggered this is not yet
indexed, so that both processes finish roughly add the same time and
should not overwrite each other but act complementary.

This commit reverts the logic of cleaning out the ticker engine watches
on start up, as this is done already when the execution is paused -
which also gets paused on the cluster state listener again, as we can be
sure here, that the watches index has not yet been created.

This also adds a new test, that starts a one node cluster and emulates
the case of a non existing watches index and a watch being added, which
should result in proper execution.

Closes #33320
2018-09-21 14:22:34 +02:00
Armin Braun 3a5b8a71b4
NETWORKING: Fix Portability of SO_LINGER=0 in Tests (#33895)
* Setting SO_LINGER for open but not connected non-blocking sockets
throws on OSX
  * Fixed by only applying setting to connected sockets which will save
the same number of FDs as doing it on open sockets anyway
* closes #33879
2018-09-21 10:08:16 +02:00
Nhat Nguyen 5f7f793f43
Propagate max_auto_id_timestamp in peer recovery (#33693)
Today we don't store the auto-generated timestamp of append-only
operations in Lucene; and assign -1 to every index operations
constructed from LuceneChangesSnapshot. This looks innocent but it
generates duplicate documents on a replica if a retry append-only
arrives first via peer-recovery; then an original append-only arrives
via replication. Since the retry append-only (delivered via recovery)
does not have timestamp, the replica will happily optimizes the original
request while it should not.

This change transmits the max auto-generated timestamp from the primary
to replicas before translog phase in peer recovery. This timestamp will
prevent replicas from optimizing append-only requests if retry
counterparts have been processed.

Relates #33656 
Relates #33222
2018-09-20 19:53:30 -04:00
David Turner 187f787f52
[Zen2] Introduce LeaderChecker (#33024)
It is important that follower nodes periodically check that their leader is
still healthy and that they remain part of its cluster. If these checks fail
repeatedly then followers should attempt to find and join a new leader,
possibly electing one in the process. The LeaderChecker, introduced in this
commit, performs these periodic checks and deals with retries.
2018-09-20 20:05:55 +01:00
Nhat Nguyen 76a1a863e3
TEST: stop assertSeqNos if shards movement (#33875)
Currently, assertSeqNos assumes that the cluster is stable at the end of
the test (i.e., no more shard movement). However, this assumption does
not always hold. In these cases, we can stop the assertion instead of
failing a test.

Closes #33704
2018-09-20 13:44:26 -04:00
David Turner 0b4a6ae97c Merge commit '3522b9084b611c89ec4f06c1863542883840ed0e' into zen2 2018-09-20 15:17:47 +01:00
Tim Vernum ff934e3dcd Mute broken test on MacOS
Seems to be triggered by 0cf0d73
See: https://github.com/elastic/elasticsearch/issues/33879
2018-09-20 14:06:40 +10:00
Nik Everett 26c4f1fb6c
Core: Default node.name to the hostname (#33677)
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.

Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.

1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.

As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
2018-09-19 15:21:29 -04:00
Nik Everett 3ede13a454
Test framework fall cleaning (#33423)
Wraps all lines in our test framework at 140 characters because that is
our standard line length and removes all of the checkstyle suppressions
for the test framework.

Drops most of `ModuleTestCase` because it isn't used and we're moving
away from using guice in the way that it wants to test anyway. Also
switches a few classes that extend it but don't use it to extend
`ESTestCase` instead.
2018-09-19 14:34:02 -04:00
Yannick Welsch 10009434bf Merge remote-tracking branch 'elastic/master' into zen2 2018-09-19 11:18:01 +02:00
Vladimir Dolzhenko a3e8b831ee
add elasticsearch-shard tool (#32281)
Relates #31389
2018-09-19 10:28:22 +02:00