Commit Graph

44988 Commits

Author SHA1 Message Date
Ioannis Kakavas 401226fc90 Mute rolling upgrade watcher CRUD tests (#39293)
This fails on old_cluster but mixed_cluster and upgraded_cluster
depend on watches set in old_cluster so that can't be muted on its
own

Relates: https://github.com/elastic/elasticsearch/issues/33185
2019-02-22 13:27:45 +02:00
Daniel Mitterdorfer 42e0d63b36 Align generated release notes with doc standards (#39234)
With this commit we change the script that generates the release notes
for Elasticsearch to align with the docs standards in the following
aspects:

* The release notes title changes to include the product name
* The link text to the breaking changes page is explicitly defined

Relates #39155
2019-02-22 07:41:16 +01:00
Lee Hinman 3401afdf35 Add ILM plugin for MonitoringIT tests (#39271)
Without this, when creating the watch history indices they complain about there
being no such setting as `index.lifecycle.name`.

Relates to #38805
2019-02-21 21:45:43 -07:00
Julie Tibshirani 29243f7001 Avoid using TimeWarp in TransformIntegrationTests. (#39277)
This commit makes `TransformIntegrationTests` into a standard integration test, as
opposed to using `TimeWarp`, which registers the mock component
`ScheduleEngineTriggerMock` to trigger watches.

The simplification may help with flakiness we've observed `TimeWarp, as in #37882.
2019-02-21 18:02:44 -08:00
Jay Modi 697911c31d
Fixed missed stopping of SchedulerEngine (#39193)
The SchedulerEngine is used in several places in our code and not all
of these usages properly stopped the SchedulerEngine, which could lead
to test failures due to leaked threads from the SchedulerEngine. This
change adds stopping to these usages in order to avoid the thread leaks
that cause CI failures and noise.

Closes #38875
2019-02-21 14:31:33 -07:00
Tim Brooks 44df76251f
Rebuild remote connections on profile changes (#39146)
Currently remote compression and ping schedule settings are dynamic.
However, we do not listen for changes. This commit adds listeners for
changes to those two settings. Additionally, when those settings change
we now close existing connections and open new ones with the settings
applied.

Fixes #37201.
2019-02-21 14:00:39 -07:00
Benjamin Trent 34d06471c3
[CI] Mute CcrRetentionLeaseIT.testRetentionLeaseIsRenewedDuringRecovery (#39270) 2019-02-21 14:17:03 -06:00
Benjamin Trent 8072543428
Muting AutoFollowIT.testAutoFollowManyIndices (#39265) 2019-02-21 13:43:09 -06:00
Zachary Tong 8af0e7c4b6
Only create final MatrixStatsResults on final reduction (#39205)
MatrixStatsResults is the "final" result object, and runs an additional
computation in it's ctor to calculate covariance, etc.  This means
it should only run on the final reduction instead of on every reduce.
2019-02-21 14:18:45 -05:00
Jason Tedor b9f8be6968
Clarify the use of sleep in CCR test
Sleeps in tests smell funny, and we try to avoid them to the extent
possible. We are using a small one in a CCR test. This commit clarifies
the purpose of that sleep by adding a comment explaining it. We also
removed a hard-coded value from the test, that if we ever modified the
value higher up where it was set, we could end up forgetting to change
the value here. Now we ensure that these would move in lock step if we
ever maintain them later.
2019-02-21 14:05:48 -05:00
Jason Tedor 719c38a36d
Fix CCR tests that manipulate transport requests
We have some CCR tests where we use mock transport send rules to control
the behavior that we desire in these tests. Namely, we want to simulate
an exception being thrown on the leader side, or a variety of other
situations. These send rules were put in place between the data nodes on
each side. However, it might not be the case that these requests are
being sent between data nodes. For example, a request that is handled on
a non-data master node would not be sent from a data node. And it might
not be the case that the request is sent to a data node, as it could be
proxied through a non-data coordinating node. This commit addresses this
by putting these send rules in places between all nodes on each side.

Closes #39011
Closes #39201
2019-02-21 12:26:09 -05:00
Tanguy Leroux fc896e452c
ReadOnlyEngine should update translog recovery state information (#39238) (#39251)
`ReadOnlyEngine` never recovers operations from translog and never 
updates translog information in the index shard's recovery state, even 
though the recovery state goes through the `TRANSLOG` stage during 
the recovery. It means that recovery information for frozen shards indicates 
an unkown number of recovered translog ops in the Recovery APIs 
(translog_ops: `-1` and translog_ops_percent: `-1.0%`) and this is confusing.

This commit changes the `recoverFromTranslog()` method in `ReadOnlyEngine` 
so that it always recover from an empty translog snapshot, allowing the recovery 
state translog information to be correctly updated.

Related to #33888
2019-02-21 18:08:06 +01:00
Mark Vieira 0f68814e04
Ensure global test seed is used for all random testing tasks (#38991) (#39195)
This commit fixes a bug which resulted in every RandomizedTestingTask
generating its own unique test seed rather than using the global test
seed generated by the the RandomizedTestingPlugin.

The issue didn't affect build invocations which explicitly ran with
-Dtest.seed. In those cases, the Ant junit task correctly uses the
global seed.

Since the Ant random testing target looks for an Ant project property
for determining the existing seed we now explicitly set this inside
RandomizedTestingPlugin. It just so happens that, incidentally, Ant will
inherit system properties, thus wy the -Dtest.seed mechanism continued
to work.
2019-02-21 08:52:14 -08:00
Christoph Büscher 4b77d0434a Remove `nGram` and `edgeNGram` token filter names (#39070)
In #30209 we deprecated the camel case `nGram` filter name in favour of `ngram` and
did the same for `edgeNGram` and `edge_ngram` and we are removing those names in
8.0. This change disallows using the deprecated names for new indices created in 7.0 by
throwing an error if these filters are used.

Relates to #38911
2019-02-21 16:55:40 +01:00
Albert Zaharovits 08ad740d48 Mute test (#39248)
Mute test DissectParserTests.testBasicMatchUnicode
2019-02-21 17:36:01 +02:00
Armin Braun 1a21cc0357
Simplify and Fix Synchronization in InternalTestCluster (#39168) (#39241)
* Simplify and Fix Synchronization in InternalTestCluster (#39168)

* Remove unnecessary `synchronized` statements
* Make `Predicate`s constants where possible
* Cleanup some stream usage
* Make unsafe public methods `synchronized`
* Closes #37965
* Closes #37275
* Closes #37345
2019-02-21 16:27:18 +01:00
Lee Hinman d9de899316 Wrap accounting breaker check in assertBusy (#39211)
There may be situations where indices have not yet been closed from a Lucene
perspective, causing the breaker to not immediately be at 0

Relates to #30290
2019-02-21 08:00:31 -07:00
Daniel Mitterdorfer ef921fd157
Migrate Streamable to Writeable for cluster block package (#37391) (#39236) 2019-02-21 15:21:44 +01:00
Marios Trivyzas ecfd48b6d3
[Tests] Make testEngineGCDeletesSetting deterministic (#38942) (#39231)
`InternalEngine.resolveDocVersion()` uses `relativeTimeInMillis()` from
`ThreadPool` so it needs, the cached time to be advanced. Add a check
to ensure that and decrease the `thread_pool.estimated_time_interval`
to 1msec to prevent long running times for the test.

Fixes: #38874

Co-authored-by: Boaz Leskes <b.leskes@gmail.com>
2019-02-21 14:30:59 +02:00
Marios Trivyzas 1316825f52
Replace superfluous usage of Counter with Supplier (#39048) (#39225)
`Counter` was used as a means of a functional argument to pass
the relative cached time before `Supplier` iface was introduced.
2019-02-21 12:42:54 +02:00
Ignacio Vera be8a5315d7 Extend nextDoc to delegate to the wrapped doc-value iterator for date_nanos (#39176)
The type date_nanos does not direct doc-value iterators and it needs to extend `next_doc` in order to delegate the call to the wrapped iterator.
2019-02-21 11:10:51 +01:00
Martijn van Groningen f40139c403
Change ShardFollowTask to reuse common serialization logic (#39094)
Initially in #38910, ShardFollowTask was reusing ImmutableFollowParameters'
serialization logic. After merging, bwc tests failed sometimes and
the binary serialization that ShardFollowTask was originally was using
was added back. ImmutableFollowParameters is using optional fields (optional vint)
while ShardFollowTask was not (vint).
2019-02-21 09:32:33 +01:00
Martijn van Groningen c28e1b2299
Disable bwc tests for #39094 2019-02-21 09:00:38 +01:00
Tal Levy b854c92696
add missing `test2` index in alias example (#39212) (#39214)
* missing 'test2' index example (#39055)

If I got the idea of aliases properly, I think that the index "test2" should have
a reference in the example above of the following sentence:

" ... we associate the alias `alias1` to both `test` and `test2` ... "

* add PUT test2

* Update aliases.asciidoc

swap which is write/read

Co-authored-by: Guilherme Ferreira <guilhermeaferreira_t@yahoo.com.br>
2019-02-20 17:45:24 -08:00
Tal Levy 8150ca40f2 mute test3MasterNodes2Failed 2019-02-20 17:35:37 -08:00
Ryan Ernst 6e7b643775
Deprecate fallback to java on PATH (#37990)
Finding java on the path is sometimes confusing for users and
unexpected, as well as leading to a different java being used than a
user expects.  This commit adds warning messages when starting
elasticsearch (or any tools like the plugin cli) and using java found
on the PATH instead of via JAVA_HOME.
2019-02-20 17:07:11 -08:00
Nhat Nguyen a96df5d209 Reduce refresh when lookup term in FollowingEngine (#39184)
Today we always refresh when looking up the primary term in
FollowingEngine. This is not necessary for we can simply
return none for operations before the global checkpoint.
2019-02-20 19:21:00 -05:00
Nhat Nguyen cdec11c4eb Relax history check in ShardFollowTaskReplicationTests (#39162)
The follower won't always have the same history as the leader for its
soft-deletes retention can be different. However, if some operation
exists on the history of the follower, then the same operation must
exist on the leader. This change relaxes the history check in
ShardFollowTaskReplicationTests.

Closes #39093
2019-02-20 19:21:00 -05:00
Nhat Nguyen 820ba8169e Add retention leases replication tests (#38857)
This commit introduces the retention leases to ESIndexLevelReplicationTestCase,
then adds some tests verifying that the retention leases replication works
correctly in spite of the presence of the primary failover or out of order
delivery of retention leases sync requests.

Relates #37165
2019-02-20 19:21:00 -05:00
Tal Levy 0b676f07f6 mute failing test in 20_mix_typeless_typeful
awaits fix in #39198
2019-02-20 16:12:07 -08:00
Mirko Jotic a6ae146ccc Converting Derivative Pipeline Agg integration test into AggregatorTestsCase. (#38679)
Replicates the majority of existing Derivative pipeline integration tests into
an AggregatorTestCase, with the goal of removing the integration
tests in the near future.
2019-02-20 16:35:32 -05:00
Lisa Cawley bfc9a00d9f [DOCS] Unset machine learning upgrade mode (#39149) 2019-02-20 12:06:27 -08:00
Mark Vieira 24ac9da276
Mute CCR retention test that is consistently failing locally and in CI 2019-02-20 11:57:46 -08:00
Tal Levy cb7e3708bc
Rollup jobs should be cleaned up before indices are deleted (#38930) (#39144)
Rollup jobs should be stopped + deleted before the indices are removed.
It's possible for an active rollup job to issue a bulk request, the test
ends and the cleanup code deletes all indices.  The in-flight bulk
request will then stall + error because the index no-longer exists...
but this process might take longer than the StopRollup timeout.

Which means the test fails, and often fails several other tests since
the job is still active (e.g. other tests cannot create the same-named
job, or fail to stop the job in their cleanup because it's still stalled).

This tends to knock over several tests before the bulk finally times
out and the job shuts down.

Instead, we need to simply stop jobs first.  Inflight bulks will resolve
quickly, and we can carry on with deleting indices after the jobs are
confirmed inactive.

stop-job.asciidoc tended to trigger this issue because it executed
an async stop API and then exited, which setup the above situation. In
can and did happen with other tests though.  As an extra precaution,
the doc test was modified to substitute in wait_for_completion
to help head off these issues too.
2019-02-20 11:12:01 -08:00
Jay Modi af451459a5
Fix failures in SessionFactoryLoadBalancingTests (#39154)
This change aims to fix failures in the session factory load balancing
tests that mock failure scenarios. For these tests, we randomly shut
down ldap servers and bind a client socket to the port they were
listening on. Unfortunately, we would occasionally encounter failures
in these tests where a socket was already in use and/or the port
we expected to connect to was wrong and in fact was to one of the ldap
instances that should have been shut down.

The failures are caused by the behavior of certain operating systems
when it comes to binding ports and wildcard addresses. It is possible
for a separate application to be bound to a wildcard address and still
allow our code to bind to that port on a specific address. So when we
close the server socket and open the client socket, we are still able
to establish a connection since the other application is already
listening on that port on a wildcard address. Another variant is that
the os will allow a wildcard bind of a server socket when there is
already an application listening on that port for a specific address.

In order to do our best to prevent failures in these scenarios, this
change does the following:

1. Binds a client socket to all addresses in an awaitBusy
2. Adds assumption that we could bind all valid addresses
3. In the case that we still establish a connection to an address that
   we should not be able to, try to bind and expect a failure of not
   being connected

Closes #32190
2019-02-20 11:38:26 -07:00
Igor Motov 3d93011e32 Fix median calculation in MedianAbsoluteDeviationAggregatorTests (#38979)
Fixes an error in median calculation in
MedianAbsoluteDeviationAggregatorTests for odd number of sample points,
which causes some rare test failures.

Fixes #38937
2019-02-20 13:24:30 -05:00
Jason Tedor 90b1b36f50
Add cleanup logic to CCR retention lease test
This commit adds some logic to remove the mock transport rules at the
end of a CCR retention lease test.
2019-02-20 13:20:07 -05:00
Jason Tedor 224600f370
Bump jackson-databind version for AWS SDK (#39183)
This commit bumps the jackson-databind version for discovery-ec2 and
repository-s3 to 2.8.11.3.
2019-02-20 13:04:50 -05:00
Jason Tedor cfd7c77b64
Fix broken CCR retention lease unfollow test
This commit fixes a broken CCR retention lease unfollow test. The
problem with the test is that the random subset of shards that we picked
to disrupt would not necessarily overlap with the actual shards in
use. We could take a non-empty subset of [0, 3] (e.g., { 2 }) when the
only shard IDs in use were [0, 1]. This commit fixes this by taking into
account the number of shards in use in the test.

With this change, we also take measure to ensure that a successful
branch is tested more frequently than would otherwise be the case. On
that branch, we want to sometimes pretend that the retention lease is
already removed. The randomness here was also sometimes selecting a
subset of shards that did not overlap with the shards actually in use
during the test. While this does not break the test, it is confusing and
reduces the amount of coverage of that branch.

Relates #39185
2019-02-20 12:09:28 -05:00
Lisa Cawley d74e25a778 [DOCS] Edits the remote clusters documentation (#38996) 2019-02-20 09:01:16 -08:00
Tal Levy f1c8aa816f fix dangling tag in TasksClientDocumentationIT (#39157)
fix dangling tag in TasksClientDocumentationIT
and fix tag mentions from list-tasks to cancel-tasks
2019-02-20 08:48:58 -08:00
Jason Tedor 751c05eff9
Bump jackson-databind version for ingest-geoip (#39182)
This commit bumps the jackson-databind version for ingest-geoip to
2.8.11.3.
2019-02-20 11:40:31 -05:00
James Baiera fe9b89ef97
Add an exception throw if waiting on transport port file fails (#37574)
In the ClusterConfiguration class of the build source, there is an Ant waitfor block 
that runs to ensure that the seed node's transport ports file is created before 
trying to read it. If the wait times out, the file read fails with not much helpful info. 
This just adds a timeout property to the waitfor block and throws a descriptive 
exception instead.

Backports #37574
2019-02-20 11:09:43 -05:00
Albert Zaharovits af8ef1bb98 Do not create the missing index when invoking getRole (#39039)
In most of the places we avoid creating the `.security` index (or updating the mapping)
for read/search operations. This is more of a nit for the case of the getRole call,
that fixes a possible mapping update during a get role, and removes a dead if branch
about creating the `.security` index.
2019-02-20 17:33:10 +02:00
Martijn van Groningen 0594a467f2
Add support for ccr follow info api to HLRC. (#39115)
This API was introduces after #33824 was closed.
2019-02-20 16:10:54 +01:00
Jason Tedor 48984f647d
Mute failing CCR retention lease unfollow test
This commit mutes a CCR retention lease unfollow test that is failing
randomly, but frequently.
2019-02-20 09:47:17 -05:00
Ioannis Kakavas c783069804
Fix NPE on Stale Index in IndicesService(#39173)
This is a backport of  #38891 which closes #38845
2019-02-20 15:35:35 +02:00
Jason Tedor 09ea3ccd16
Remove retention leases when unfollowing (#39088)
This commit attempts to remove the retention leases on the leader shards
when unfollowing an index. This is best effort, since the leader might
not be available.
2019-02-20 07:06:49 -05:00
Darren Meiss eae2c9dd5c Edits to text in Phrase Suggester doc (#38966) 2019-02-20 12:35:21 +01:00
Darren Meiss 27f7ff157b Fix a typo in the cat shards API docs (#38536) 2019-02-20 11:36:46 +01:00