Commit Graph

2896 Commits

Author SHA1 Message Date
Jason Tedor e569cf8324
Address failing CCR retention lease test
This test fails rarely but it is flaky in its current form. The problem
here is that we lack a guarantee on the retention leases having been
synced to all shard copies. We need to sleep long enough to ensure that
that occurs, and then we can sample the retention leases, possibly sleep
again (we usually will not have too since the first sleep will have been
long enough to allow a sync and a renewal to happen, if one was going to
happen), and the sample the retention leases for comparison.

Closes #39331
2019-02-22 18:15:10 -05:00
Jason Tedor e4e96b8181
Fix shard logged in background lease renewal
The shard logged here is the leader shard but it should be the follower
shard since this background retention lease renewal is happening on the
follower side. This commit fixes that.
2019-02-22 17:32:51 -05:00
Jason Tedor feb25c71a0
Simplify mocking in CCR retention lease tests
This commit simplifies the use of transport mocking in the CCR retention
lease integration tests. Instead of adding a send rule between nodes, we
add a default send rule. This greatly simplifies the code here, and
speeds the test up a little bit too.
2019-02-22 17:24:12 -05:00
Tim Brooks 931953a3ee
Ensure index commit released when testing timeouts (#39273)
This fixes #39245. Currently it is possible in this test that the clear
session call times-out. This means that the index commit will not be
released and there will be an assertion triggered in the test teardown.
This commit ensures that we wipe the leader index in the test to avoid
this assertion.

It is okay if the clear session call times-out in normal usage. This
scenario is unavoidable due to potential network issues. We have a local
timeout on the leader to clean it up when this scenario happens.
2019-02-22 11:14:42 -07:00
Benjamin Trent 3262d6c917
[ML-DataFrame] Add _preview endpoint (#38924) (#39319)
* [DATA-FRAME] add preview endpoint

* adjusting preview tests and fixing parser

* adjusing preview transport

* remove unused import

* adjusting test

* Addressing PR comments

* Fixing failing test and adjusting for pr comments

* fixing integration test
2019-02-22 10:55:38 -06:00
David Roberts 4f2bd238d2 [ML] Increase datafeed integration test timeout for slow machines (#39311)
The assertBusy() that waits the default 10 seconds for a
datafeed to complete very occasionally times out on slow
machines.  This commit increases the timeout to 60 seconds.
It will almost never actually take this long, but it's
better to have a timeout that will prevent time being
wasted looking at spurious test failures.
2019-02-22 15:35:32 +00:00
Gordon Brown 2ad1e6aedc
Fix testCannotShrinkLeaderIndex (#38529)
This test should no longer pass when the functionality it is intended to
test is broken, as it now indexes a number of documents and verifies
that the index is staying on the same step until after indexing and
replication of those documents is finished. This prevents the test from
passing if the leader index progresses in its lifecycle during that time.
2019-02-22 08:03:36 -07:00
Dimitris Athanasiou 1c6818fe74
[ML] Improve DeleteExpiredDataIT failure message (#39298) (#39310)
This test failed once in a very long time with the assertion
that there is no document for the `non_existing_job` in the
state index. I could not see how that is possible and I cannot
reproduce. With this commit the failure message will reveal
some examples of the left behind docs which might shed a light
about what could go wrong.
2019-02-22 16:15:11 +02:00
Daniel Mitterdorfer 9fea21aca5
Remove ExceptionsHelper#detailedMessage in tests (#37921) (#39297)
With this commit we remove all usages of the deprecated method
`ExceptionsHelper#detailedMessage` in tests. We do not address
production code here but rather in dedicated follow-up PRs to keep the
individual changes manageable.

Relates #19069
2019-02-22 14:03:29 +01:00
Ioannis Kakavas 401226fc90 Mute rolling upgrade watcher CRUD tests (#39293)
This fails on old_cluster but mixed_cluster and upgraded_cluster
depend on watches set in old_cluster so that can't be muted on its
own

Relates: https://github.com/elastic/elasticsearch/issues/33185
2019-02-22 13:27:45 +02:00
Lee Hinman 3401afdf35 Add ILM plugin for MonitoringIT tests (#39271)
Without this, when creating the watch history indices they complain about there
being no such setting as `index.lifecycle.name`.

Relates to #38805
2019-02-21 21:45:43 -07:00
Julie Tibshirani 29243f7001 Avoid using TimeWarp in TransformIntegrationTests. (#39277)
This commit makes `TransformIntegrationTests` into a standard integration test, as
opposed to using `TimeWarp`, which registers the mock component
`ScheduleEngineTriggerMock` to trigger watches.

The simplification may help with flakiness we've observed `TimeWarp, as in #37882.
2019-02-21 18:02:44 -08:00
Jay Modi 697911c31d
Fixed missed stopping of SchedulerEngine (#39193)
The SchedulerEngine is used in several places in our code and not all
of these usages properly stopped the SchedulerEngine, which could lead
to test failures due to leaked threads from the SchedulerEngine. This
change adds stopping to these usages in order to avoid the thread leaks
that cause CI failures and noise.

Closes #38875
2019-02-21 14:31:33 -07:00
Tim Brooks 44df76251f
Rebuild remote connections on profile changes (#39146)
Currently remote compression and ping schedule settings are dynamic.
However, we do not listen for changes. This commit adds listeners for
changes to those two settings. Additionally, when those settings change
we now close existing connections and open new ones with the settings
applied.

Fixes #37201.
2019-02-21 14:00:39 -07:00
Benjamin Trent 34d06471c3
[CI] Mute CcrRetentionLeaseIT.testRetentionLeaseIsRenewedDuringRecovery (#39270) 2019-02-21 14:17:03 -06:00
Benjamin Trent 8072543428
Muting AutoFollowIT.testAutoFollowManyIndices (#39265) 2019-02-21 13:43:09 -06:00
Jason Tedor b9f8be6968
Clarify the use of sleep in CCR test
Sleeps in tests smell funny, and we try to avoid them to the extent
possible. We are using a small one in a CCR test. This commit clarifies
the purpose of that sleep by adding a comment explaining it. We also
removed a hard-coded value from the test, that if we ever modified the
value higher up where it was set, we could end up forgetting to change
the value here. Now we ensure that these would move in lock step if we
ever maintain them later.
2019-02-21 14:05:48 -05:00
Jason Tedor 719c38a36d
Fix CCR tests that manipulate transport requests
We have some CCR tests where we use mock transport send rules to control
the behavior that we desire in these tests. Namely, we want to simulate
an exception being thrown on the leader side, or a variety of other
situations. These send rules were put in place between the data nodes on
each side. However, it might not be the case that these requests are
being sent between data nodes. For example, a request that is handled on
a non-data master node would not be sent from a data node. And it might
not be the case that the request is sent to a data node, as it could be
proxied through a non-data coordinating node. This commit addresses this
by putting these send rules in places between all nodes on each side.

Closes #39011
Closes #39201
2019-02-21 12:26:09 -05:00
Tanguy Leroux fc896e452c
ReadOnlyEngine should update translog recovery state information (#39238) (#39251)
`ReadOnlyEngine` never recovers operations from translog and never 
updates translog information in the index shard's recovery state, even 
though the recovery state goes through the `TRANSLOG` stage during 
the recovery. It means that recovery information for frozen shards indicates 
an unkown number of recovered translog ops in the Recovery APIs 
(translog_ops: `-1` and translog_ops_percent: `-1.0%`) and this is confusing.

This commit changes the `recoverFromTranslog()` method in `ReadOnlyEngine` 
so that it always recover from an empty translog snapshot, allowing the recovery 
state translog information to be correctly updated.

Related to #33888
2019-02-21 18:08:06 +01:00
Martijn van Groningen f40139c403
Change ShardFollowTask to reuse common serialization logic (#39094)
Initially in #38910, ShardFollowTask was reusing ImmutableFollowParameters'
serialization logic. After merging, bwc tests failed sometimes and
the binary serialization that ShardFollowTask was originally was using
was added back. ImmutableFollowParameters is using optional fields (optional vint)
while ShardFollowTask was not (vint).
2019-02-21 09:32:33 +01:00
Nhat Nguyen a96df5d209 Reduce refresh when lookup term in FollowingEngine (#39184)
Today we always refresh when looking up the primary term in
FollowingEngine. This is not necessary for we can simply
return none for operations before the global checkpoint.
2019-02-20 19:21:00 -05:00
Nhat Nguyen cdec11c4eb Relax history check in ShardFollowTaskReplicationTests (#39162)
The follower won't always have the same history as the leader for its
soft-deletes retention can be different. However, if some operation
exists on the history of the follower, then the same operation must
exist on the leader. This change relaxes the history check in
ShardFollowTaskReplicationTests.

Closes #39093
2019-02-20 19:21:00 -05:00
Nhat Nguyen 820ba8169e Add retention leases replication tests (#38857)
This commit introduces the retention leases to ESIndexLevelReplicationTestCase,
then adds some tests verifying that the retention leases replication works
correctly in spite of the presence of the primary failover or out of order
delivery of retention leases sync requests.

Relates #37165
2019-02-20 19:21:00 -05:00
Mark Vieira 24ac9da276
Mute CCR retention test that is consistently failing locally and in CI 2019-02-20 11:57:46 -08:00
Jay Modi af451459a5
Fix failures in SessionFactoryLoadBalancingTests (#39154)
This change aims to fix failures in the session factory load balancing
tests that mock failure scenarios. For these tests, we randomly shut
down ldap servers and bind a client socket to the port they were
listening on. Unfortunately, we would occasionally encounter failures
in these tests where a socket was already in use and/or the port
we expected to connect to was wrong and in fact was to one of the ldap
instances that should have been shut down.

The failures are caused by the behavior of certain operating systems
when it comes to binding ports and wildcard addresses. It is possible
for a separate application to be bound to a wildcard address and still
allow our code to bind to that port on a specific address. So when we
close the server socket and open the client socket, we are still able
to establish a connection since the other application is already
listening on that port on a wildcard address. Another variant is that
the os will allow a wildcard bind of a server socket when there is
already an application listening on that port for a specific address.

In order to do our best to prevent failures in these scenarios, this
change does the following:

1. Binds a client socket to all addresses in an awaitBusy
2. Adds assumption that we could bind all valid addresses
3. In the case that we still establish a connection to an address that
   we should not be able to, try to bind and expect a failure of not
   being connected

Closes #32190
2019-02-20 11:38:26 -07:00
Jason Tedor 90b1b36f50
Add cleanup logic to CCR retention lease test
This commit adds some logic to remove the mock transport rules at the
end of a CCR retention lease test.
2019-02-20 13:20:07 -05:00
Jason Tedor cfd7c77b64
Fix broken CCR retention lease unfollow test
This commit fixes a broken CCR retention lease unfollow test. The
problem with the test is that the random subset of shards that we picked
to disrupt would not necessarily overlap with the actual shards in
use. We could take a non-empty subset of [0, 3] (e.g., { 2 }) when the
only shard IDs in use were [0, 1]. This commit fixes this by taking into
account the number of shards in use in the test.

With this change, we also take measure to ensure that a successful
branch is tested more frequently than would otherwise be the case. On
that branch, we want to sometimes pretend that the retention lease is
already removed. The randomness here was also sometimes selecting a
subset of shards that did not overlap with the shards actually in use
during the test. While this does not break the test, it is confusing and
reduces the amount of coverage of that branch.

Relates #39185
2019-02-20 12:09:28 -05:00
Albert Zaharovits af8ef1bb98 Do not create the missing index when invoking getRole (#39039)
In most of the places we avoid creating the `.security` index (or updating the mapping)
for read/search operations. This is more of a nit for the case of the getRole call,
that fixes a possible mapping update during a get role, and removes a dead if branch
about creating the `.security` index.
2019-02-20 17:33:10 +02:00
Jason Tedor 48984f647d
Mute failing CCR retention lease unfollow test
This commit mutes a CCR retention lease unfollow test that is failing
randomly, but frequently.
2019-02-20 09:47:17 -05:00
Jason Tedor 09ea3ccd16
Remove retention leases when unfollowing (#39088)
This commit attempts to remove the retention leases on the leader shards
when unfollowing an index. This is best effort, since the leader might
not be available.
2019-02-20 07:06:49 -05:00
Andrei Stefan c1018db404 SQL: enforce JDBC driver - ES server version parity (#38972)
(cherry picked from commit 822a21f29491f295b22dacd04b747781a69ffa61)
2019-02-20 11:29:02 +02:00
Andrei Stefan 92206c8567 Added "validate.properties" property to JDBC's list of allowed properties. (#39050)
This defaults to "true" (current behavior) and will throw an exception
if there is a property that cannot be recognized. If "false", it will
ignore anything unrecognizable.

(cherry picked from commit 38fbf9792bcf4fe66bb3f17589e5fe6d29748d07)
2019-02-20 11:29:01 +02:00
Tim Vernum 4aa50ed348
Resolve concurrency with watcher trigger service (#39164)
The watcher trigger service could attempt to modify the perWatchStats
map simultaneously from multiple threads. This would cause the
internal state to become inconsistent, in particular the count()
method may return an incorrect value for the number of watches.

This changes replaces the implementation of the map with a
ConcurrentHashMap so that its internal state remains consistent even
when accessed from mutiple threads.

Backport of: #39092
2019-02-20 19:18:00 +11:00
Julie Tibshirani f5b28ca69d Enable test logging for TransformIntegrationTests#testSearchTransform.
There is already fairly detailed debug logging in the watcher framework, which
should hopefully help debug the failure.

Relates to #37882.
2019-02-19 18:15:34 -08:00
Tal Levy b5dbd1a027 AwaitsFix XPackUsageIT#testXPackCcrUsage.
relates to #39126.
2019-02-19 13:28:46 -08:00
Benjamin Trent 109b6451fd
ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs (#38822) (#39119)
* ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs

* Addressing pr feedback
2019-02-19 12:45:56 -06:00
Ioannis Kakavas 210f34f8e9 Remove BCryptTests (#39098)
This test was added to verify that we fixed a specific behavior in
Bcrypt and hasn't been running for almost 4 years now.
2019-02-19 18:12:18 +02:00
David Roberts 35e30b34f9 [ML] Stop the ML memory tracker before closing node (#39111)
The ML memory tracker does searches against ML results
and config indices.  These searches can be asynchronous,
and if they are running while the node is closing then
they can cause problems for other components.

This change adds a stop() method to the MlMemoryTracker
that waits for in-flight searches to complete.  Once
stop() has returned the MlMemoryTracker will not kick
off any new searches.

The MlLifeCycleService now calls MlMemoryTracker.stop()
before stopping stopping the node.

Fixes #37117
2019-02-19 15:12:40 +00:00
David Roberts bbcdea43c5 [ML] Allow stop unassigned datafeed and relax unset upgrade mode wait (#39034)
These two changes are interlinked.

Before this change unsetting ML upgrade mode would wait for all
datafeeds to be assigned and not waiting for their corresponding
jobs to initialise.  However, this could be inappropriate, if
there was a reason other that upgrade mode why one job was unable
to be assigned or slow to start up.  Unsetting of upgrade mode
would hang in this case.

This change relaxes the condition for considering upgrade mode to
be unset to simply that an assignment attempt has been made for
each ML persistent task that did not fail because upgrade mode
was enabled.  Thus after unsetting upgrade mode there is no
guarantee that every ML persistent task is assigned, just that
each is not unassigned due to upgrade mode.

In order to make setting upgrade mode work immediately after
unsetting upgrade mode it was then also necessary to make it
possible to stop a datafeed that was not assigned.  There was
no particularly good reason why this was not allowed in the past.
It is trivial to stop an unassigned datafeed because it just
involves removing the persistent task.
2019-02-19 14:07:10 +00:00
Martijn van Groningen c8d59f6f0f
Fix shard follow task startup error handling (#39053)
Prior to this commit, if during fetch leader / follower GCP
a fatal error occurred, then the shard follow task was removed.

This is unexpected, because if such an error occurs during the lifetime of shard follow task then replication is stopped and the fatal error flag is set. This allows the ccr stats api to report the fatal exception that has occurred (instead of the user grepping through the elasticsearch logs).

This issue was found by a rare failure of the  `FollowStatsIT#testFollowStatsApiIncludeShardFollowStatsWithRemovedFollowerIndex` test.

Closes #38779
2019-02-19 08:54:02 +01:00
Ioannis Kakavas 59e9a0f4f4 Disable specific locales for tests in fips mode (#38938)
* Disable specific locales for tests in fips mode

The Bouncy Castle FIPS provider that we use for running our tests
in fips mode has an issue with locale sensitive handling of Dates as
described in https://github.com/bcgit/bc-java/issues/405

This causes certificate validation to fail if any given test that
includes some form of certificate validation happens to run in one
of the locales. This manifested earlier in #33081 which was
handled insufficiently in #33299

This change ensures that the problematic 3 locales

* th-TH
* ja-JP-u-ca-japanese-x-lvariant-JP
* th-TH-u-nu-thai-x-lvariant-TH

will not be used when running our tests in a FIPS 140 JVM. It also
reverts #33299
2019-02-19 08:46:08 +02:00
Jason Tedor 2d8f6b6501
Introduce retention lease state file (#39004)
This commit moves retention leases from being persisted in the Lucene
commit point to being persisted in a dedicated state file.
2019-02-18 16:53:46 -05:00
Martijn van Groningen ce412908ed
also check ccr stats api return empty response in ensureNoCcrTasks()
If this fails then it returns more detailed information, for example
fatal error.
2019-02-18 16:15:22 +01:00
Nhat Nguyen 2947ccf5c3 Add remote recovery to ShardFollowTaskReplicationTests (#39007)
We simulate remote recovery in ShardFollowTaskReplicationTests 
by bootstrapping the follower with the safe commit of the leader.

Relates #35975
2019-02-18 09:57:56 -05:00
Hendrik Muhs 1efb01661c
set minimum supported version (#39043) (#39051)
change the minimum supported version of data frame transform
2019-02-18 15:41:25 +01:00
Martijn van Groningen 4fd1f8048d
Mute test #38949 2019-02-18 15:24:07 +01:00
David Roberts b660d2cac6 [ML] More advanced post-test cleanup of ML indices (#39049)
The .ml-annotations index is created asynchronously when
some other ML index exists.  This can interfere with the
post-test index deletion, as the .ml-annotations index
can be created after all other indices have been deleted.

This change adds an ML specific post-test cleanup step
that runs before the main cleanup and:

1. Checks if any ML indices exist
2. If so, waits for the .ml-annotations index to exist
3. Deletes the other ML indices found in step 1.
4. Calls the super class cleanup

This means that by the time the main post-test index
cleanup code runs:

1. The only ML index it has to delete will be the
   .ml-annotations index
2. No other ML indices will exist that could trigger
   recreation of the .ml-annotations index

Fixes #38952
2019-02-18 14:16:03 +00:00
Martijn van Groningen e8ea85d6e9
wait for shard to be allocated before executing a resume follow api 2019-02-18 14:50:40 +01:00
Martijn Laarman 9b4d96534b
Fix #38623 remove xpack namespace REST API (#38625) (#39036)
* Fix #38623 remove xpack namespace REST API

Except for xpack.usage and xpack.info API's, this moves the last remaining API's out of the xpack namespace

* rename xpack api's inside inside the files as well

* updated yaml tests references to xpack namespaces api's

* update callsApi calls in the IT subclasses

* make sure docs testing does not use xpack namespaced api's

* fix leftover xpack namespaced method names in docs/build.gradle

* found another leftover reference

(cherry picked from commit ccb5d934363c37506b76119ac050a254fa80b5e7)
2019-02-18 12:40:07 +01:00
Martijn van Groningen 9aa542fb1b
Mute test
Relates to #38779
2019-02-18 12:02:52 +01:00
Hendrik Muhs 4f662bd289
Add data frame feature (#38934) (#39029)
The data frame plugin allows users to create feature indexes by pivoting a source index. In a
nutshell this can be understood as reindex supporting aggregations or similar to the so called entity
centric indexing.

Full history is provided in: feature/data-frame-transforms
2019-02-18 11:07:29 +01:00
Martijn van Groningen ed08bc3537
Fix LocalIndexFollowingIT#testRemoveRemoteConnection() test (#38709)
* During fetching remote mapping if remote client is missing then
`NoSuchRemoteClusterException` was not handled.
* When adding remote connection, check that it is really connected
before continue-ing to run the tests.

Relates to #38695
2019-02-18 09:41:44 +01:00
Martijn van Groningen 03b2ec6ee6
Test bi-directional index following during a rolling upgrade. (#38962)
Follow index in follow cluster that follows an index in the leader cluster and another
follow index in the leader index that follows that index in the follow cluster.

During the upgrade index following is paused and after the upgrade
index following is resumed and then verified index following works as expected.

Relates to #38037
2019-02-18 09:06:58 +01:00
Nhat Nguyen 204480d818 Mute testRetentionLeaseIsRenewedDuringRecovery
Tracked at #39011
2019-02-17 15:34:51 -05:00
Jason Tedor a5ce1e0bec
Integrate retention leases to recovery from remote (#38829)
This commit is the first step in integrating shard history retention
leases with CCR. In this commit we integrate shard history retention
leases with recovery from remote. Before we start transferring files, we
take out a retention lease on the primary. Then during the file copy
phase, we repeatedly renew the retention lease. Finally, when recovery
from remote is complete, we disable the background renewing of the
retention lease.
2019-02-16 15:37:52 -05:00
Tim Brooks b1c1daa63f
Add get file chunk timeouts with listener timeouts (#38758)
This commit adds a `ListenerTimeouts` class that will wrap a
`ActionListener` in a listener with a timeout scheduled on the generic
thread pool. If the timeout expires before the listener is completed,
`onFailure` will be called with an `ElasticsearchTimeoutException`.

Timeouts for the get ccr file chunk action are implemented using this
functionality. Additionally, this commit attempts to fix #38027 by also
blocking proxied get ccr file chunk actions. This test being un-muted is
useful to verify the timeout functionality.
2019-02-16 10:56:03 -07:00
Jason Tedor d80325f288
Mark fail over on follower test as awaits fix
This test is failing since the introduction of recovery from
remote. This commit marks this test as awaits fix.
2019-02-16 12:28:16 -05:00
Nhat Nguyen 7e20a92888 Advance max_seq_no before add operation to Lucene (#38879)
Today when processing an operation on a replica engine (or the 
following engine), we first add it to Lucene, then add it to translog, 
then finally marks its seq_no as completed. If a flush occurs after step1,
but before step-3, the max_seq_no in the commit's user_data will be
smaller than the seq_no of some documents in the Lucene commit.
2019-02-15 21:04:28 -05:00
Nhat Nguyen 20755e666c Reduce global checkpoint sync interval in disruption tests (#38931)
We verify seq_no_stats is aligned between copies at the end of some
disruption tests. Sometimes, the assertion `assertSeqNos` is tripped due
to a lagged global checkpoint on replicas. The global checkpoint on
replicas is lagged because we sync the global checkpoint 30 seconds (by
default) after the last replication operation. This change reduces the
global checkpoint sync-internal to 1s in the disruption tests.

Closes #38318
Closes #36789
2019-02-15 21:04:20 -05:00
Jason Tedor 58551198d5
Address some CCR REST test case flakiness (#38975)
The CCR REST tests that rely on these assertions are flaky. They are
flaky since the introduction of recovery from the remote.

The underlying problem is this: these tests are making assertions about
the number of operations read by the shard following task. However, with
recovery from remote, we no longer have guarantees that the assumptions
these tests were relying on hold. Namely, these tests were assuming that
the only way that a document could land in the follower index is via the
shard following task. With recovery from remote, there is another way,
which is via the files that are copied over during the recovery
phase. Most of the time this will not be a problem because with the
small number of documents that we are indexing in these tests, it is
usally not the case that a flush would occur and so there would not be
any documents in the files copied over. However, a flush can occur any
time at which point all of the indexed documents could end up in a safe
commit and copied over during recovery from remote. This commit modifies
these assertions to ones that are not prone to this issue, yet still
validate the health of the follower shard.
2019-02-15 16:01:02 -05:00
Lisa Cawley 339a15bb09 [DOCS] Edits warning in put watch API (#38582) 2019-02-15 09:40:12 -08:00
Martijn van Groningen 03b67b3ee1
Introduced class reuses follow parameter code between ShardFollowTasks (#38910)
and AutoFollowPattern classes.

The ImmutableFollowParameters is like the already existing FollowParameters,
but all of its fields are final.
2019-02-15 18:26:15 +01:00
iverase b19b778cbb [CI] Muting method testFollowIndex in IndexFollowingIT
Relates to #38949
2019-02-15 16:07:45 +01:00
Yogesh Gaikwad 36c274867e
Fix intermittent failure in ApiKeyIntegTests (#38627) (#38935)
Few tests failed intermittently and most of the
times due to invalidated or expired keys that were
deleted were still reported in search results.
This commit removes the test and adds enhancements
to other tests testing different scenario's.

When ExpiredApiKeysRemover is triggered, the tests
did not await its termination thereby sometimes
the results would be wrong for a search operation.

DELETE_INTERVAL setting has been further reduced to
100ms so we can trigger ExpiredApiKeysRemover faster.

Closes #38408
2019-02-15 23:01:35 +11:00
Martijn van Groningen 60cc04ed13
Migrate muted auto follow rolling upgrade test and unmute this test (#38900)
The rest of `CCRIT` is now no longer relevant, because the remaining
test tests the same of the index following test in the rolling upgrade
multi cluster module.

Added `tests.upgrade_from_version` version to test. It is not needed
in this branch, but is in 6.7 branch.

Closes #37231
2019-02-15 11:25:13 +01:00
Yannick Welsch d55e52223f Smarter CCR concurrent file chunk fetching (#38841)
The previous logic for concurrent file chunk fetching did not allow for multiple chunks from the same
file to be fetched in parallel. The parallelism only allowed to fetch chunks from different files in
parallel. This required complex logic on the follower to be aware from which file it was already
fetching information, in order to ensure that chunks for the same file would be fetched in sequential
order. During benchmarking, this exhibited throughput issues when recovery came towards the end,
where it would only be sequentially fetching chunks for the same largest segment file, with
throughput considerably going down in a high-latency network as there was no parallelism anymore.

The new logic here follows the peer recovery model more closely, and sends multiple requests for
the same file in parallel, and then reorders the results as necessary. Benchmarks show that this
leads to better overall throughput and the implementation is also simpler.
2019-02-15 07:51:58 +01:00
Shaunak Kashyap 1f74ba2d33
[Monitoring] Remove `include_type_name` parameter from GET _template request (#38925)
Backport of #38818 to `7.x`. Original description:

The HTTP exporter code in the Monitoring plugin makes `GET _template` requests to check for existence of templates. These requests don't need to pass the `include_type_name` query parameter so this PR removes it from the request. This should remove the following deprecation log entries on the Monitoring cluster in 7.0.0 onwards:

```
[types removal] Specifying include_type_name in get index template requests is deprecated.
```
2019-02-14 16:09:52 -08:00
Jay Modi 5d06226507
Fix writing of SecurityFeatureSetUsage to pre-7.1 (#38922)
This change makes the writing of new usage data conditional based on
the version that is being written to. A test has also been added to
ensure serialization works as expected to an older version.

Relates #38687, #38917
2019-02-14 16:28:52 -07:00
Jay Modi e59b7b696a
Use consistent view of realms for authentication (#38815)
This change updates the authentication service to use a consistent view
of the realms based on the license state at the start of
authentication. Without this, the license can change during
authentication of a request and it will result in a failure if the
realm that extracted the token is no longer in the realm list. This
manifests in some tests as an authentication failure that should never
really happen; one example would be the test framework's transport
client user should always have a succesful authentication but in the
LicensingTests this can fail and will show up as a
NoNodeAvailableException.

Additionally, the licensing tests have been updated to ensure that
there is consistency when changing the license. The license is changed
by modifying the internal xpack license state on each node, which has
no protection against be changed by some pending cluster action. The
methods to disable and enable now ensure we have a green cluster and
that the cluster is consistent before returning.

Closes #30301
2019-02-14 07:49:14 -07:00
Albert Zaharovits 6243a9797f _cat/indices with Security, hide names when wildcard (#38824)
This changes the output of the `_cat/indices` API with `Security` enabled.

It is possible to only display the index name (and possibly the index
health, depending on the request options) but not its stats (doc count, merges,
size, etc). This is the case for closed indices which have index metadata in the
cluster state but no associated shards, hence no shard stats.
However, when `Security` is enabled, and the request contains wildcards,
**open** indices without stats are a common occurrence. This is because the
index names in the response table are picked up directly from the cluster state
which is not filtered by `Security`'s _indexNameExpressionResolver_, unlike the
stats data which is populated by the indices stats API which does go through the
index name resolver.
This is a bug, because it is circumventing `Security`'s function to hide
unauthorized indices.

This has been fixed by displaying the index names as they are resolved by the indices
stats API. The outputs of these two APIs is now very similar: same index names,
similar data but different format.

Closes #37190
2019-02-14 15:09:17 +02:00
Andrei Stefan 7d78f4641b SQL: fall back to using the field name for column label (#38842)
(cherry picked from commit 0567bf24957be477e7649cff94872b0e7dc4d284)
2019-02-14 14:10:59 +02:00
Yogesh Gaikwad 335cf91bb9
Add enabled status for token and api key service (#38687) (#38882)
Right now there is no way to determine whether the
token service or API key service is enabled or not.
This commit adds support for the enabled status of
token and API key service to the security feature set
usage API `/_xpack/usage`.

Closes #38535
2019-02-14 23:08:52 +11:00
Martijn van Groningen 96e7d71948
Handle the fact that `ShardStats` instance may have no commit or seqno stats (#38782)
The should fix the following NPE:

```
[2019-02-11T23:27:48,452][WARN ][o.e.p.PersistentTasksNodeService] [node_s_0] task kD8YzUhHTK6uKNBNQI-1ZQ-0 failed with an exception
  1> java.lang.NullPointerException: null
  1>    at org.elasticsearch.xpack.ccr.action.ShardFollowTasksExecutor.lambda$fetchFollowerShardInfo$7(ShardFollowTasksExecutor.java:305) ~[main/:?]
  1>    at org.elasticsearch.action.ActionListener$1.onResponse(ActionListener.java:61) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:68) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.TransportAction$1.onResponse(TransportAction.java:64) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onCompletion(TransportBroadcastByNodeAction.java:383) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction.onNodeResponse(TransportBroadcastByNodeAction.java:352) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:324) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$AsyncAction$1.handleResponse(TransportBroadcastByNodeAction.java:314) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleResponse(TransportService.java:1108) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.processResponse(TransportService.java:1189) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1169) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:54) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:417) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.action.support.broadcast.node.TransportBroadcastByNodeAction$BroadcastByNodeTransportRequestHandler.messageReceived(TransportBroadcastByNodeAction.java:391) [elasticsearch-8.0.0-SNAP
SHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:63) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.transport.TransportService$7.doRun(TransportService.java:687) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:751) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-8.0.0-SNAPSHOT.jar:8.0.0-SNAPSHOT]
  1>    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_202]
  1>    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_202]
  1>    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_202]
```

Relates to #38779
2019-02-14 13:05:21 +01:00
Dimitris Athanasiou 21f76aba28
[ML] Extract base class for integ tests with native processes (#38850) (#38860) 2019-02-14 12:15:00 +02:00
Martijn van Groningen 88489a3f3a
Backport rolling upgrade multi cluster module (#38859)
* Add rolling upgrade multi cluster test module (#38277)

This test starts 2 clusters, each with 3 nodes.
First the leader cluster is started and tests are run against it and
then the follower cluster is started and tests execute against this two cluster.

Then the follower cluster is upgraded, one node at a time.
After that the leader cluster is upgraded, one node at a time.
Every time a node is upgraded tests are ran while both clusters are online.
(and either leader cluster has mixed node versions or the follower cluster)

This commit only tests CCR index following, but could be used for CCS tests as well.
In particular for CCR, unidirectional index following is tested during a rolling upgrade.
During the test several indices are created and followed in the leader cluster before or
while the follower cluster is being upgraded.

This tests also verifies that attempting to follow an index in the upgraded cluster
from the not upgraded cluster fails. After both clusters are upgraded following the
index that previously failed should succeed.

Relates to #37231 and #38037

* Filter out upgraded version index settings when starting index following (#38838)

The `index.version.upgraded` and `index.version.upgraded_string` are likely
to be different between leader and follower index. In the event that
a follower index gets restored on a upgraded node while the leader index
is still on non-upgraded nodes.

Closes #38835
2019-02-14 08:12:14 +01:00
Lee Hinman 60c1dcde88 Only flush Watcher's bulk processor if Watcher is enabled (#38803)
When shutting down Watcher, the `bulkProcessor` is null if watcher has been
disabled in the configuration. This protects the flush and close calls with a
check for watcher enabled to avoid a NullPointerException

Resolves #38798
2019-02-13 16:13:53 -07:00
Tim Brooks ec08581319
Improve CcrRepositoryIT mappings tests (#38817)
Currently we index documents concurrently to attempt to ensure that we
update mappings during the restore process. However, this does not
actually test that the mapping will be correct and is dangerous as it
can lead to a misalignment between the max sequence number and the local
checkpoint. If these are not aligned, peer recovery cannot be completed
without initiating following which this test does not do. That causes
teardown assertions to fail.

This commit removes the concurrent indexing and flushes after the
documents are indexed. Additionally it modifies the mapping specific
test to ensure that there is a mapping update when the restore session
is initiated. This mapping update is picked up at the end of the restore
by the follower.
2019-02-13 13:47:10 -07:00
Julie Tibshirani e769cb4efd Perform precise check for types warnings in cluster restart tests. (#37944)
Instead of using `WarningsHandler.PERMISSIVE`, we only match warnings
that are due to types removal.

This PR also renames `allowTypeRemovalWarnings` to `allowTypesRemovalWarnings`.

Relates to #37920.
2019-02-13 11:28:58 -08:00
Benjamin Trent d2ac05e249
ML allow aliased .ml-anomalies* index on PUT Job (#38821) (#38847) 2019-02-13 10:58:55 -06:00
Jake Landis 46bb663a09
Make 7.x like 6.7 user agent ecs, but default to true (#38828)
Forward port of https://github.com/elastic/elasticsearch/pull/38757

This change reverts the initial 7.0 commits and replaces them
with the 6.7 variant that still allows for the ecs flag. 
This commit differs from the 6.7 variants in that ecs flag will 
now default to true. 

6.7: `ecs` : default `false`
7.x: `ecs` : default `true`
8.0: no option, but behaves as `true`

* Revert "Ingest node - user agent, move device to an object (#38115)"
This reverts commit 5b008a34aa.

* Revert "Add ECS schema for user-agent ingest processor (#37727) (#37984)"
This reverts commit cac6b8e06f.

* cherry-pick 5dfe1935345da3799931fd4a3ebe0b6aa9c17f57 
Add ECS schema for user-agent ingest processor (#37727)

* cherry-pick ec8ddc890a34853ee8db6af66f608b0ad0cd1099 
Ingest node - user agent, move device to an object (#38115) (#38121)
  
* cherry-pick f63cbdb9b426ba24ee4d987ca767ca05a22f2fbb (with manual merge fixes)
Dep. check for ECS changes to User Agent processor (#38362)

* make true the default for the ecs option, and update 7.0 references and tests
2019-02-13 10:28:01 -06:00
Przemyslaw Gomulka 542ee5f46a
Format Watcher.status.lastChecked and lastMetCondition (#38788) backport#38626
Change the formatting for Watcher.status.lastCheck and lastMetCondition
to be the same as Watcher.status.state.timestamp. These should all have
only millisecond precision
closes #38619
backport #38626
2019-02-13 08:33:53 +01:00
Shaunak Kashyap a9178b3239
Remove _type term filters from cluster alert watches (#38819) (#38826)
Backport of https://github.com/elastic/elasticsearch/pull/38819. Original message:

This PR removes usages of the `_type` field in `_search` requests issued from Monitoring code.
2019-02-12 19:54:36 -08:00
Nhat Nguyen a3f39741be Adjust log and unmute testFailOverOnFollower (#38762)
There were two documents (seq=2 and seq=103) missing on the follower in
one of the failures of `testFailOverOnFollower`. I spent several hours
on that failure but could not figure out the reason. I adjust log and
unmute this test so we can collect more information.

Relates #38633
2019-02-12 11:42:25 -05:00
Jay Modi f04bd4a07e
Remove TLSv1.2 pinning in ssl reload tests (#38651)
This change removes the pinning of TLSv1.2 in the
SSLConfigurationReloaderTests that had been added to workaround an
issue with the MockWebServer and Apache HttpClient when using TLSv1.3.
The way HttpClient closes the socket causes issues with the TLSv1.3
SSLEngine implementation that causes the MockWebServer to loop
endlessly trying to send the close message back to the client. This
change wraps the created http connection in a way that allows us to
override the closing behavior of HttpClient.

An upstream request with HttpClient has been opened at
https://issues.apache.org/jira/browse/HTTPCORE-571 to see if the method
of closing can be special cased for SSLSocket instances.

This is caused by a JDK bug, JDK-8214418 which is fixed by
https://hg.openjdk.java.net/jdk/jdk12/rev/5022a4915fe9.

Relates #38646
2019-02-12 09:18:04 -07:00
Martijn van Groningen 40d5beaf41
muted test
Relates to #38779
2019-02-12 16:54:54 +01:00
Marios Trivyzas 032bcf99d6
SQL: Implement `::` cast operator (#38774)
`<expression>::<dataType>` is a simplified altenative syntax to
`CAST(<expression> AS <dataType> which exists in PostgreSQL and
provides an improved user experience and possibly more compact
SQL queries.

Fixes: #38717
2019-02-12 16:54:14 +02:00
Alpar Torok 085b6b5f89
Fix failing bwc test against 6.3 (#38770) 2019-02-12 14:18:52 +02:00
Przemyslaw Gomulka 7e178aa4a7
Enable IndexActionTests and WatcherIndexingListenerTests Backport #38738
fix tests to use clock in milliseconds precision in watcher code
make sure the date comparison in string format is using same formatters
some of the code was modified in #38514 possibly because of merge conflicts

closes #38581
Backport #38738
2019-02-12 13:05:44 +01:00
Alexander Reelsen 6ae7915b9d Fix exporter tests to have reasonable dates (#38436)
The java time formatter used in the exporter adds a plus sign to the
year, if a year with more than five digits is used. This changes the
creation of those timestamp to only have a date up to 9999.

Closes #38378
2019-02-12 10:39:44 +01:00
Martijn van Groningen 6290d59ffa
Use clear cluster names in order to make debugging easier.
Relates to #37681
2019-02-12 10:19:39 +01:00
Yannick Welsch bafc709326 Fix CCR concurrent file chunk fetching bug (#38736)
Fixes a bug with concurrent file chunk fetching during recovery from remote where the wrong offset
was used.
2019-02-11 19:15:57 +01:00
Tanguy Leroux dc212de822
Specialize pre-closing checks for engine implementations (#38702) (#38722)
The Close Index API has been refactored in 6.7.0 and it now performs 
pre-closing sanity checks on shards before an index is closed: the maximum 
sequence number must be equals to the global checkpoint. While this is a 
strong requirement for regular shards, we identified the need to relax this 
check in the case of CCR following shards.

The following shards are not in charge of managing the max sequence 
number or global checkpoint, which are pulled from a leader shard. They 
also fetch and process batches of operations from the leader in an unordered 
way, potentially leaving gaps in the history of ops. If the following shard lags 
a lot it's possible that the global checkpoint and max seq number never get 
in sync, preventing the following shard to be closed and a new PUT Follow 
action to be issued on this shard (which is our recommended way to 
resume/restart a CCR following).

This commit allows each Engine implementation to define the specific 
verification it must perform before closing the index. In order to allow 
following/frozen/closed shards to be closed whatever the max seq number 
or global checkpoint are, the FollowingEngine and ReadOnlyEngine do 
not perform any check before the index is closed.

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
2019-02-11 17:34:17 +01:00
Luca Cavanna 6443b46184
Clean up ShardSearchLocalRequest (#38574)
Added a constructor accepting `StreamInput` as argument, which allowed to
make most of the instance members final as well as remove the default
constructor.
Removed a test only constructor in favour of invoking the existing
constructor that takes a `SearchRequest` as first argument.
Also removed profile members and related methods as they were all unused.
2019-02-11 15:55:46 +01:00
Martijn van Groningen 92201ef563
Catch AlreadyClosedException and use other IndexShard instance (#38630)
Closes #38617
2019-02-11 15:36:48 +01:00
Alpar Torok bd4ca4c702 Rename integTest to bwcTestSample for bwc test projects (#38433)
* Rename integTest to bwcTestSample for bwc test projects

This change renames the `integTest` task to `bwcTestSample` for projects
testing bwc to make it possible to run all the bwc tests that check
would run without running on bwc tests.

This change makes it possible to add a new PR check on backports to make
sure these don't break BWC tests in master.

* Rename task as per PR
2019-02-11 15:05:16 +02:00
Andrei Stefan b3695750bc Randomize the time zone properly for the current date test. (#38670)
(cherry picked from commit 29abbb8a590cdf4f9e0c0b447d6694bb7223648e)
2019-02-11 14:25:02 +02:00
Przemyslaw Gomulka ba9a4d13e1
mute Failing tests related to logging and joda-java migration backport(#38704)(#38710)
the tests awaits fix from #38693 and #38705 and #38581
2019-02-11 13:15:12 +01:00
Przemyslaw Gomulka ab9e2f2e69
Move testToUtc test to DateFormattersTests #38698 Backport #38610
The test was relying on toString in ZonedDateTime which is different to
what is formatted by strict_date_time when milliseconds are 0
The method is just delegating to dateFormatter, so that scenario should
be covered there.

closes #38359
Backport #38610
2019-02-11 11:34:25 +01:00
Alexander Reelsen 73fcea4d2c Remove ticks in chain input documentation (#38109)
The ticks created a literal string instead of actually accessing the
payload value.
2019-02-11 11:04:32 +01:00
Ioannis Kakavas 8c624e5a20 Enhance parsing of StatusCode in SAML Responses (#38628)
* Enhance parsing of StatusCode in SAML Responses

<Status> elements in a failed response might contain two nested
<StatusCode> elements. We currently only parse the first one in
order to create a message that we attach to the Exception we return
and log. However this is generic and only gives out informarion
about whether the SAML IDP believes it's an error with the
request or if it couldn't handle the request for other reasons. The
encapsulated StatusCode has a more interesting error message that
potentially gives out the actual error as in Invalid nameid policy,
authentication failure etc.

This change ensures that we print that information also, and removes
Message and Details fields from the message when these are not
part of the Status element (which quite often is the case)
2019-02-11 11:55:26 +02:00
Martijn van Groningen a29bf2585e
Added unit test for FollowParameters class (#38500) (#38690)
A unit test that tests FollowParameters directly was missing.
2019-02-11 10:53:04 +01:00
Przemyslaw Gomulka 0e5a734e7e
Fix HistoryIntegrationTests timestamp comparison #38565 Backport#38505
When the millisecond part of a timestamp is 0 the toString
representation in java-time is omitting the millisecond part (joda was
not). The Search response is returning timestamps formatted with
WatcherDateTimeUtils, therefore comparisons of strings should be done
with the same formatter

relates #27330
BackPort #38505
2019-02-11 08:50:21 +01:00
Martijn van Groningen 4625807505
Reuse FollowParameters' parse fields. (#38508) 2019-02-11 08:46:36 +01:00
Martijn van Groningen e213ad3e88
Mute test.
Relates to #38695
2019-02-11 08:32:42 +01:00
Tim Vernum 273edea712
Mute testExpiredApiKeysDeletedAfter1Week (#38683)
Tracked: #38408
2019-02-11 16:50:10 +11:00
Tim Brooks 023e3c207a
Concurrent file chunk fetching for CCR restore (#38656)
Adds the ability to fetch chunks from different files in parallel, configurable using the new `ccr.indices.recovery.max_concurrent_file_chunks` setting, which defaults to 5 in this PR.

The implementation uses the parallel file writer functionality that is also used by peer recoveries.
2019-02-09 21:19:57 -07:00
Nhat Nguyen c202900915
Retry on wait_for_metada_version timeout (#38521)
Closes #37807
Backport of #38521
2019-02-09 19:51:58 -05:00
Costin Leau 794ee4fb10 SQL: Prevent grouping over grouping functions (#38649)
Improve verifier to disallow grouping over grouping functions (e.g.
HISTOGRAM over HISTOGRAM).

Close #38308

(cherry picked from commit 4e9b1cfd4df38c652bba36b4b4b538ce7c714b6e)
2019-02-09 09:30:06 +02:00
Marios Trivyzas 871036bd21
SQL: Relax StackOverflow circuit breaker for constants (#38572)
Constant numbers (of any form: integers, decimals, negatives,
scientific) and strings shouldn't increase the depth counters
as they don't contribute to the increment of the stack depth.

Fixes: #38571
2019-02-09 09:18:21 +02:00
Marios Trivyzas af8a444caa
SQL: Replace joda with java time (#38437)
Replace remaining usages of joda classes with java time.

Fixes: #37703
2019-02-08 22:58:07 +02:00
Benjamin Trent 24a8ea06f5
ML: update set_upgrade_mode, add logging (#38372) (#38538)
* ML: update set_upgrade_mode, add logging

* Attempt to fix datafeed isolation

Also renamed a few methods/variables for clarity and added
some comments
2019-02-08 12:56:04 -06:00
Christoph Büscher d03b386f6a Mute FollowerFailOverIT testFailOverOnFollower (#38634)
Relates to #38633
2019-02-08 17:20:30 +01:00
Andrei Stefan 6359d988f0 Account for a possible rolled over file while reading the audit log file (#34909)
(cherry picked from commit 75cb6b38ed67dc9d32c9291b0c174ffa94e473bc)
2019-02-08 17:49:00 +02:00
Christoph Büscher 779673c792 Mute failing WatchStatusIntegrationTests (#38621)
Relates to #38619
2019-02-08 13:56:47 +01:00
Christoph Büscher 5180b36547 Mute failing ApiKeyIntegTests (#38614) 2019-02-08 13:04:17 +01:00
Jason Tedor fdf6b3f23f
Add 7.1 version constant to 7.x branch (#38513)
This commit adds the 7.1 version constant to the 7.x branch.

Co-authored-by: Andy Bristol <andy.bristol@elastic.co>
Co-authored-by: Tim Brooks <tim@uncontended.net>
Co-authored-by: Christoph Büscher <cbuescher@posteo.de>
Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
Co-authored-by: markharwood <markharwood@gmail.com>
Co-authored-by: Ioannis Kakavas <ioannis@elastic.co>
Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>
Co-authored-by: David Roberts <dave.roberts@elastic.co>
Co-authored-by: Jason Tedor <jason@tedor.me>
Co-authored-by: Alpar Torok <torokalpar@gmail.com>
Co-authored-by: David Turner <david.turner@elastic.co>
Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Tim Vernum <tim@adjective.org>
Co-authored-by: Albert Zaharovits <albert.zaharovits@gmail.com>
2019-02-07 16:32:27 -05:00
Marios Trivyzas f96bd2ad71
SQL: Fix issue with IN not resolving to underlying keyword field (#38440)
- Add resolution to the exact keyword field (if exists) for text fields.
- Add proper verification and error message if underlying keyword
doesn'texist.
- Move check for field attribute in the comparison list to the
`resolveType()` method of `IN`.

Fixes: #38424
2019-02-06 16:25:06 +02:00
David Turner 5a3c452480
Align docs etc with new discovery setting names (#38492)
In #38333 and #38350 we moved away from the `discovery.zen` settings namespace
since these settings have an effect even though Zen Discovery itself is being
phased out. This change aligns the documentation and the names of related
classes and methods with the newly-introduced naming conventions.
2019-02-06 11:34:38 +00:00
Costin Leau 1a02445ae1
SQL: Allow look-ahead resolution of aliases for WHERE clause (#38450)
Aliases defined in SELECT (Project or Aggregate) are now resolved in the
following WHERE clause. The Analyzer has been enhanced to identify this
rule and replace the field accordingly.

Close #29983
2019-02-06 12:08:32 +02:00
Yogesh Gaikwad 6ff4a8cfd5
Add API key settings documentation (#38490)
This commit adds missing
API key service settings documentation.
2019-02-06 20:58:22 +11:00
Luca Cavanna a7046e001c
Remove support for maxRetryTimeout from low-level REST client (#38085)
We have had various reports of problems caused by the maxRetryTimeout
setting in the low-level REST client. Such setting was initially added
in the attempts to not have requests go through retries if the request
already took longer than the provided timeout.

The implementation was problematic though as such timeout would also
expire in the first request attempt (see #31834), would leave the
request executing after expiration causing memory leaks (see #33342),
and would not take into account the http client internal queuing (see #25951).

Given all these issues, it seems that this custom timeout mechanism 
gives little benefits while causing a lot of harm. We should rather rely 
on connect and socket timeout exposed by the underlying http client 
and accept that a request can overall take longer than the configured 
timeout, which is the case even with a single retry anyways.

This commit removes the `maxRetryTimeout` setting and all of its usages.
2019-02-06 08:43:47 +01:00
Yogesh Gaikwad 5261673349
Change the min supported version to 6.7.0 for API keys (#38481)
This commit changes the minimum supported version to 6.7.0
for API keys, the change for the API keys has been backported
to 6.7.0 version #38399
2019-02-06 16:03:49 +11:00
Jay Modi e73c9c90ee
Add an authentication cache for API keys (#38469)
This commit adds an authentication cache for API keys that caches the
hash of an API key with a faster hash. This will enable better
performance when API keys are used for bulk or heavy searching.
2019-02-05 18:16:26 -07:00
Yogesh Gaikwad 57600c5acb
Enable logs for intermittent test failure (#38426)
I have not been able to reproduce the failing
test scenario locally for #38408 and there are other similar
tests which are running fine in the same test class.
I am re-enabling the test with additional logs so
that we can debug further on what's happening.
I will keep the issue open for now and look out for the builds
to see if there are any related failures.
2019-02-06 11:21:54 +11:00
Martijn van Groningen 8972ebabdd
Enable bwc tests now that #38443 is backported. (#38462) 2019-02-06 00:04:43 +01:00
Tim Brooks fb0ec26fd4
Set update mappings mater node timeout to 30 min (#38439)
This is related to #35975. We do not want a slow master to fail a
recovery from remote process due to a slow put mappings call. This
commit increases the master node timeout on this call to 30 mins.
2019-02-05 16:22:11 -06:00
Zachary Tong f939c3c5ef
Assert job is not null in FullClusterRestartIT (#38218)
`waitForRollUpJob` is an assertBusy that waits for the rollup job
to appear in the tasks list, and waits for it to be a certain state.

However, there was a null check around the state assertion, which meant
if the job _was_ null, the assertion would be skipped, and the
assertBusy would pass withouot an exception.  This could then lead to
downstream assertions to fail because the job was not actually ready,
or in the wrong state.

This changes the test to assert the job is not null, so the assertBusy
operates as intended.
2019-02-05 17:06:28 -05:00
Marios Trivyzas 2c30501c74
SQL: Fix esType for DATETIME/DATE and INTERVALS (#38179)
Since introduction of data types that don't have a corresponding type
in ES the `esType` is error-prone when used for `unmappedType()` calls.
Moreover since the renaming of `DATE` to `DATETIME` and the introduction
of an actual date-only `DATE` the `esType` would return `datetime` which
is not a valid type for ES mapping.

Fixes: #38051
2019-02-05 23:12:52 +02:00
Ioannis Kakavas 1f4f6f35c8 Handle deprecation header-AbstractUpgradeTestCase (#38396) 2019-02-05 22:11:21 +01:00
Przemyslaw Gomulka afcdbd2bc0
XPack: core/ccr/Security-cli migration to java-time (#38415)
part of the migrating joda time work.
refactoring x-pack plugins usages of joda to java-time
refers #27330
2019-02-05 22:09:32 +01:00
Jay Modi 7ca5495d86
Allow custom authorization with an authorization engine (#38358)
For some users, the built in authorization mechanism does not fit their
needs and no feature that we offer would allow them to control the
authorization process to meet their needs. In order to support this,
a concept of an AuthorizationEngine is being introduced, which can be
provided using the security extension mechanism.

An AuthorizationEngine is responsible for making the authorization
decisions about a request. The engine is responsible for knowing how to
authorize and can be backed by whatever mechanism a user wants. The
default mechanism is one backed by roles to provide the authorization
decisions. The AuthorizationEngine will be called by the
AuthorizationService, which handles more of the internal workings that
apply in general to authorization within Elasticsearch.

In order to support external authorization services that would back an
authorization engine, the entire authorization process has become
asynchronous, which also includes all calls to the AuthorizationEngine.

The use of roles also leaked out of the AuthorizationService in our
existing code that is not specifically related to roles so this also
needed to be addressed. RequestInterceptor instances sometimes used a
role to ensure a user was not attempting to escalate their privileges.
Addressing this leakage of roles meant that the RequestInterceptor
execution needed to move within the AuthorizationService and that
AuthorizationEngines needed to support detection of whether a user has
more privileges on a name than another. The second area where roles
leaked to the user is in the handling of a few privilege APIs that
could be used to retrieve the user's privileges or ask if a user has
privileges to perform an action. To remove the leakage of roles from
these actions, the AuthorizationService and AuthorizationEngine gained
methods that enabled an AuthorizationEngine to return the response for
these APIs.

Ultimately this feature is the work included in:
#37785
#37495
#37328
#36245
#38137
#38219

Closes #32435
2019-02-05 13:39:29 -07:00
Boaz Leskes 033ba725af
Remove support for internal versioning for concurrency control (#38254)
Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page:

> When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document 

We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. 

This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.

Relates to #1078
2019-02-05 20:53:35 +01:00
Tim Brooks 4a15e2b29e
Make Ccr recovery file chunk size configurable (#38370)
This commit adds a byte setting `ccr.indices.recovery.chunk_size`. This
setting configs the size of file chunk requested while recovering from
remote.
2019-02-05 13:34:00 -06:00
Tim Brooks c2a8fe1f91
Prevent CCR recovery from missing documents (#38237)
Currently the snapshot/restore process manually sets the global
checkpoint to the max sequence number from the restored segements. This
does not work for Ccr as this will lead to documents that would be
recovered in the normal followering operation from being recovered.

This commit fixes this issue by setting the initial global checkpoint to
the existing local checkpoint.
2019-02-05 13:32:41 -06:00
Julie Tibshirani 3ce7d2c9b6
Make sure to reject mappings with type _doc when include_type_name is false. (#38270)
`CreateIndexRequest#source(Map<String, Object>, ... )`, which is used when
deserializing index creation requests, accidentally accepts mappings that are
nested twice under the type key (as described in the bug report #38266).

This in turn causes us to be too lenient in parsing typeless mappings. In
particular, we accept the following index creation request, even though it
should not contain the type key `_doc`:

```
PUT index?include_type_name=false
{
  "mappings": {
    "_doc": {
      "properties": { ... }
    }
  }
}
```

There is a similar issue for both 'put templates' and 'put mappings' requests
as well.

This PR makes the minimal changes to detect and reject these typed mappings in
requests. It does not address #38266 generally, or attempt a larger refactor
around types in these server-side requests, as I think this should be done at a
later time.
2019-02-05 10:52:32 -08:00
Christoph Büscher ca47f68091
Ignore type-removal warnings in XPackRestTestHelper (#38431)
The backport of #38022 introduced types-deprecation warning for get/put template requests
that cause problems on tests master in mixed cluster scenarios. While these warnings are
caught and ignored in regular Rest tests, the get template requests in XPackRestTestHelper
were missed.

Closes #38412
2019-02-05 19:07:53 +01:00
Zachary Tong 54e684bedd
testHlrcFromXContent() should respect assertToXContentEquivalence() (#38232)
Tests can override assertToXContentEquivalence() in case their xcontent
cannot be directly compared (e.g. due to insertion order in maps
affecting the xcontent ordering).  But the `testHlrcFromXContent` test
hardcoded the equivalence test to `true` instead of consulting
`assertToXContentEquivalence()`

Fixes #36034
2019-02-05 12:59:05 -05:00
David Turner f2dd5dd6eb
Remove DiscoveryPlugin#getDiscoveryTypes (#38414)
With this change we no longer support pluggable discovery implementations. No
known implementations of `DiscoveryPlugin` actually override this method, so in
practice this should have no effect on the wider world. However, we were using
this rather extensively in tests to provide the `test-zen` discovery type. We
no longer need a separate discovery type for tests as we no longer need to
customise its behaviour.

Relates #38410
2019-02-05 17:42:24 +00:00
Przemyslaw Gomulka 963b474f2f
Fix the clock resolution to millis in GetWatchResponseTests (#38405)
the clock resolution changed from jdk8->jdk10, hence the test is passing
in jdk8 but failing in jdk10. The Watcher's objects are serialised and
deserialised with milliseconds precision, making test to fail in jdk 10
and higher

closes #38400
2019-02-05 18:27:24 +01:00
Przemyslaw Gomulka df4eb0485d
Enable CronEvalToolTest.testEnsureDateIsShownInRootLocale (#38394)
The test is now expected to be always passing no matter what the random
locale is. This is fixed with using jdk ZoneId.systemDefault() in both
the test and CronEvalTool

closes #35687
2019-02-05 17:48:47 +01:00
Marios Trivyzas c9701be1e8
SQL: Implement CURRENT_DATE (#38175)
Since DATE data type is now available, this implements the
`CURRENT_DATE/CURRENT_DATE()/TODAY()` similar to `CURRENT_TIMESTAMP`.

Closes: #38160
2019-02-05 18:15:26 +02:00
Armin Braun 887fa2c97a
Mute testReadRequestsReturnLatestMappingVersion (#38438)
* Relates #37807
2019-02-05 17:10:12 +01:00
David Roberts 92bc681705
[ML] Report index unavailable instead of waiting for lazy node (#38423)
If a job cannot be assigned to a node because an index it
requires is unavailable and there are lazy ML nodes then
index unavailable should be reported as the assignment
explanation rather than waiting for a lazy ML node.
2019-02-05 16:10:00 +00:00
Martijn van Groningen 0beb3c93d1
Clean up duplicate follow config parameter code (#37688)
Introduced FollowParameters class that put follow, resume follow,
put auto follow pattern requests and follow info response classes reuse.

The FollowParameters class had the fields, getters etc. for the common parameters
that all these APIs have.  Also binary and xcontent serialization /
parsing is handled by this class.

The follow, resume follow, put auto follow pattern request classes originally
used optional non primitive fields, so FollowParameters has that too and the follow info api can handle that now too.

Also the followerIndex field can in production only be specified via
the url path. If it is also specified via the request body then
it must have the same value as is specified in the url path. This
option only existed to xcontent testing. However the AbstractSerializingTestCase
base class now also supports createXContextTestInstance() to provide
a different test instance when testing xcontent, so allowing followerIndex
to be specified via the request body is no longer needed.

By moving the followerIndex field from Body to ResumeFollowAction.Request
class and not allowing the followerIndex field to be specified via
the request body the Body class is redundant and can be removed. The
ResumeFollowAction.Request class can then directly use the
FollowParameters class.

For consistency I also removed the ability to specified followerIndex
in the put follow api and the name in put auto follow pattern api via
the request body.
2019-02-05 17:05:19 +01:00
Jason Tedor 638ba4a59a
Mute failing API key integration test (#38409)
This commit mutes the test
testGetAndInvalidateApiKeysWithExpiredAndInvalidatedApiKey as it failed
during a PR build.
2019-02-05 06:08:03 -05:00
Andrei Stefan cea81b199d
Change the milliseconds precision to 3 digits for intervals. (#38297) 2019-02-05 12:00:49 +02:00
Albert Zaharovits 8e2eb39cef
SecuritySettingsSource license.self_generated: trial (#38233)
Authn is enabled only if `license_type` is non `basic`, but `basic` is
what the `LicenseService` generates implicitly. This commit explicitly sets
license type to `trial`, which allows for authn, in the `SecuritySettingsSource`
which is the settings configuration parameter for `InternalTestCluster`s.

The real problem, that had created tests failures like #31028 and #32685, is
that the check `licenseState.isAuthAllowed()` can change sporadically. If it were
to return `true` or `false` during the whole test there would be no problem.
The problem manifests when it turns from `true` to `false` right before `Realms.asList()`.
There are other license checks before this one (request filter, token service, etc)
that would not cause a problem if they would suddenly see the check as `false`.
But switching to `false` before `Realms.asList()` makes it appear that no installed
realms could have handled the authn token which is an authentication error, as can
be seen in the failing tests.

Closes #31028 #32685
2019-02-05 10:49:08 +02:00
David Turner 3b2a0d7959
Rename no-master-block setting (#38350)
Replaces `discovery.zen.no_master_block` with `cluster.no_master_block`. Any
value set for the old setting is now ignored.
2019-02-05 08:47:56 +00:00
David Turner 2d114a02ff
Rename static Zen1 settings (#38333)
Renames the following settings to remove the mention of `zen` in their names:

- `discovery.zen.hosts_provider` -> `discovery.seed_providers`
- `discovery.zen.ping.unicast.concurrent_connects` -> `discovery.seed_resolver.max_concurrent_resolvers`
- `discovery.zen.ping.unicast.hosts.resolve_timeout` -> `discovery.seed_resolver.timeout`
- `discovery.zen.ping.unicast.hosts` -> `discovery.seed_addresses`
2019-02-05 08:46:52 +00:00
Brandon Kobel 64ff75f04e
Add apm_user reserved role (#38206)
* Adding apm_user

* Fixing SecurityDocumentationIT testGetRoles test

* Adding access to .ml-anomalies-*

* Fixing APM test, we don't have access to the ML state index
2019-02-04 21:45:28 -08:00
Yogesh Gaikwad fe36861ada
Add support for API keys to access Elasticsearch (#38291)
X-Pack security supports built-in authentication service
`token-service` that allows access tokens to be used to 
access Elasticsearch without using Basic authentication.
The tokens are generated by `token-service` based on
OAuth2 spec. The access token is a short-lived token
(defaults to 20m) and refresh token with a lifetime of 24 hours,
making them unsuitable for long-lived or recurring tasks where
the system might go offline thereby failing refresh of tokens.

This commit introduces a built-in authentication service
`api-key-service` that adds support for long-lived tokens aka API
keys to access Elasticsearch. The `api-key-service` is consulted
after `token-service` in the authentication chain. By default,
if TLS is enabled then `api-key-service` is also enabled.
The service can be disabled using the configuration setting.

The API keys:-
- by default do not have an expiration but expiration can be
  configured where the API keys need to be expired after a
  certain amount of time.
- when generated will keep authentication information of the user that
   generated them.
- can be defined with a role describing the privileges for accessing
   Elasticsearch and will be limited by the role of the user that
   generated them
- can be invalidated via invalidation API
- information can be retrieved via a get API
- that have been expired or invalidated will be retained for 1 week
  before being deleted. The expired API keys remover task handles this.

Following are the API key management APIs:-
1. Create API Key - `PUT/POST /_security/api_key`
2. Get API key(s) - `GET /_security/api_key`
3. Invalidate API Key(s) `DELETE /_security/api_key`

The API keys can be used to access Elasticsearch using `Authorization`
header, where the auth scheme is `ApiKey` and the credentials, is the 
base64 encoding of API key Id and API key separated by a colon.
Example:-
```
curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health
```

Closes #34383
2019-02-05 14:21:57 +11:00
Yogesh Gaikwad 9d3f057894
Limit token expiry to 1 hour maximum (#38244)
We mention in our documentation for the token
expiration configuration maximum value is 1 hour
but do not enforce it. This commit adds max limit
to the TOKEN_EXPIRATION setting.
2019-02-05 12:02:36 +11:00
Yogesh Gaikwad b5b319ec9a
Skip unsupported languages for tests (#38328)
Skip the languages in tests for which SimpleKdcServer
does not handle generalized time correctly.

Closes#38320
2019-02-05 11:01:13 +11:00
Gordon Brown b866417650
Mute testCannotShrinkLeaderIndex (#38374)
This test should not pass until CCR finishes integrating shard history
retention leases. It currently sometimes passes (which is a bug in the
test), but cannot pass reliably until the linked issue is resolved.
2019-02-04 16:06:19 -07:00
Nhat Nguyen cecfa5bd6d
Tighten mapping syncing in ccr remote restore (#38071)
There are two issues regarding the way that we sync mapping from leader
to follower when a ccr restore is completed:

1.  The returned mapping from a cluster service might not be up to date
as the mapping of the restored index commit.

2. We should not compare the mapping version of the follower and the
leader. They are not related to one another.

Moreover, I think we should only ensure that once the restore is done,
the mapping on the follower should be at least the mapping of the copied
index commit. We don't have to sync the mapping which is updated after
we have opened a session.

Relates #36879
Closes #37887
2019-02-04 17:53:41 -05:00
Tim Brooks 5a33816c86
Add test for `PutFollowAction` on a closed index (#38236)
This is related to #35975. Currently when an index falls behind a leader
it encounters a fatal exception. This commit adds a test for that
scenario. Additionally, it tests that the user can stop following, close
the follower index, and put follow again. After the indexing is
re-bootstrapped, it will recover the documents it lost in normal
following operations.
2019-02-04 16:37:42 -06:00
Jay Modi c3cdf84c04
Fix SSLContext pinning to TLSV1.2 in reload tests (#38341)
This commit fixes the pinning of SSLContexts to TLSv1.2 in the
SSLConfigurationReloaderTests. The pinning was added for the initial
creation of clients and webservers but the updated contexts would
default to TLSv1.3, which is known to cause hangs with the
MockWebServer that we use.

Relates #38103
Closes #38247
2019-02-04 14:34:37 -07:00
Nhat Nguyen fb1e350c81
Mute testFollowIndexAndCloseNode (#38360)
Tracked at #33337
2019-02-04 15:04:46 -05:00
Shaunak Kashyap be1bb0ec7d
Remove types from Monitoring plugin "backend" code (#37745)
This PR removes the use of document types from the monitoring exporters and template + watches setup code.

It does not remove the notion of types from the monitoring bulk API endpoint "front end" code as that code will eventually just go away in 8.0 and be replaced with Beats as collectors/shippers directly to the monitoring cluster.
2019-02-04 10:58:03 -08:00
Gordon Brown f872c721ac
Run Node deprecation checks locally (#38065) (#38250)
At times, we need to check for usage of deprecated settings in settings
which should not be returned by the NodeInfo API.  This commit changes
the deprecation info API to run all node checks locally so that these
settings can be checked without exposing them via any externally
accessible API.
2019-02-04 09:43:28 -07:00
Jason Tedor 625d37a26a
Introduce retention lease background sync (#38262)
This commit introduces a background sync for retention leases. The idea
here is that we do a heavyweight sync when adding a new retention lease,
and then periodically we want to background sync any retention lease
renewals to the replicas. As long as the background sync interval is
significantly lower than the extended lifetime of a retention lease, it
is okay if from time to time a replica misses a sync (it will still have
an older version of the lease that is retaining more data as we assume
that renewals do not decrease the retaining sequence number). There are
two follow-ups that will come after this commit. The first is to address
the fact that we have not adapted the should periodically flush logic to
possibly flush the retention leases. We want to do something like flush
if we have not flushed in the last five minutes and there are renewed
retention leases since the last time that we flushed. An additional
follow-up will remove the syncing of retention leases when a retention
lease expires. Today this sync could be invoked in the background by a
merge operation. Rather, we will move the syncing of retention lease
expiration to be done under the background sync. The background sync
will use the heavyweight sync (write action) if a lease has expired, and
will use the lightweight background sync (replication action) otherwise.
2019-02-04 10:35:29 -05:00
David Roberts fb6a176caf
[ML] Add explanation so far to file structure finder exceptions (#38191)
The explanation so far can be invaluable for troubleshooting
as incorrect decisions made early on in the structure analysis
can result in seemingly crazy decisions or timeouts later on.

Relates elastic/kibana#29821
2019-02-04 14:32:35 +00:00
Boaz Leskes e49b593c81
Move TokenService to seqno powered cas (#38311)
Relates #37872 
Relates #10708
2019-02-04 15:25:41 +01:00
Przemyslaw Gomulka 9b64558efb
Migrating from joda to java.time. Watcher plugin (#35809)
part of the migrating joda time work. Migrating watcher plugin to use JDK's java-time

refers #27330
2019-02-04 15:08:31 +01:00
Przemyslaw Gomulka 85b4bfe3ff
Core: Migrating from joda to java.time. Monitoring plugin (#36297)
monitoring plugin migration from joda to java.time

refers #27330
2019-02-04 14:47:08 +01:00
Christoph Büscher 7ed3e6e07e
Mute MlMigrationFullClusterRestartIT#testMigration (#38315) 2019-02-04 11:38:01 +01:00
Boaz Leskes ff13a43144
Move ML Optimistic Concurrency Control to Seq No (#38278)
This commit moves the usage of internal versioning for CAS operations to use sequence numbers and primary terms

Relates to #36148
Relates to #10708
2019-02-04 10:41:08 +01:00
David Turner 1d82a6d9f9
Deprecate unused Zen1 settings (#38289)
Today the following settings in the `discovery.zen` namespace are still used:

- `discovery.zen.no_master_block`
- `discovery.zen.hosts_provider`
- `discovery.zen.ping.unicast.concurrent_connects`
- `discovery.zen.ping.unicast.hosts.resolve_timeout`
- `discovery.zen.ping.unicast.hosts`

This commit deprecates all other settings in this namespace so that they can be
removed in the next major version.
2019-02-04 08:52:08 +00:00
Tim Vernum 0164acb0a7
Cleanup construction of interceptors (#38294)
It would be beneficial to apply some of the request interceptors even
when features are disabled. This change reworks the way we build that
list so that the interceptors we always want to use are constructed
outside of the settings check.
2019-02-04 17:27:41 +11:00
Costin Leau 75f0750ff7
SQL: Remove exceptions from Analyzer (#38260)
Instead of throwing an exception, use an unresolved attribute to pass
the message to the Verifier.
Additionally improve the parser to save the extended source for the
Aggregate and OrderBy.

Close #38208
2019-02-03 22:32:16 +02:00
Costin Leau a088155f4d
SQL: Move metrics tracking inside PlanExecutor (#38259)
Move metrics in one place, from the transport layer inside the
PlanExecutor
Remove unused class

Close #38258
2019-02-03 22:31:35 +02:00
Albert Zaharovits 3c1544d259
Fix NPE in Logfile Audit Filter (#38120)
The culprit in #38097 is an `IndicesRequest` that has no indices,
but instead of `request.indices()` returning `null` or `String[0]`
it returned `String[] {null}` . This tripped the audit filter.

I have addressed this in two ways:
1. `request.indices()` returning `String[] {null}` is treated as `null`
    or `String[0]`, i.e. no indices
2. `null` values among the roles and indices lists, which are
    unexpected, will never again stumble the audit filter; `null` values
    are treated as special values that will not match any policy,
    i.e. their events will always be printed.

Closes #38097
2019-02-03 10:34:17 +02:00
Andrei Stefan 6968f0925b
SQL: Generate relevant error message when grouping functions are not used in GROUP BY (#38017)
* Add checks for Grouping functions restriction to be placed inside GROUP BY
* Fixed bug where GROUP BY HISTOGRAM (not using alias) wasn't recognized
properly in the Verifier due to functions equality not working correctly.
2019-02-02 22:05:47 +02:00
Gordon Brown 475a045192
Mute tests in SSLConfigurationReloaderTests (#38248)
Specifically `testReloadingTrustStore` and `testReloadingPEMTrustConfig`
2019-02-01 21:00:58 -07:00
Gordon Brown 7a1e89c7ed
Ensure ILM policies run safely on leader indices (#38140)
Adds a Step to the Shrink and Delete actions which prevents those
actions from running on a leader index - all follower indices must first
unfollow the leader index before these actions can run. This prevents
the loss of history before follower indices are ready, which might
otherwise result in the loss of data.
2019-02-01 20:46:12 -07:00
Boaz Leskes f6e06a2b19 Adapt minimum versions for seq# powered operations in Watch related requests and UpdateRequest (#38231)
After backporting #37977, #37857 and #37872
2019-02-01 20:37:16 -05:00
Costin Leau 783c9ed372
SQL: Allow sorting of groups by aggregates (#38042)
Introduce client-side sorting of groups based on aggregate
functions. To allow this, the Analyzer has been extended to push down
to underlying Aggregate, aggregate function and the Querier has been
extended to identify the case and consume the results in order and sort
them based on the given columns.
The underlying QueryContainer has been slightly modified to allow a view
of the underlying values being extracted as the columns used for sorting
might not be requested by the user.

The PR also adds minor tweaks, mainly related to tree output.

Close #35118
2019-02-02 01:38:25 +02:00
Jason Tedor f181e17038
Introduce retention leases versioning (#37951)
Because concurrent sync requests from a primary to its replicas could be
in flight, it can be the case that an older retention leases collection
arrives and is processed on the replica after a newer retention leases
collection has arrived and been processed. Without a defense, in this
case the replica would overwrite the newer retention leases with the
older retention leases. This commit addresses this issue by introducing
a versioning scheme to retention leases. This versioning scheme is used
to resolve out-of-order processing on the replica. We persist this
version into Lucene and restore it on recovery. The encoding of
retention leases is starting to get a little ugly. We can consider
addressing this in a follow-up.
2019-02-01 17:19:19 -05:00
Tal Levy bae656dcea
Preserve ILM operation mode when creating new lifecycles (#38134)
There was a bug where creating a new policy would start
the ILM service, even if it was stopped. This change ensures
that there is no change to the existing operation mode
2019-02-01 13:16:34 -08:00
Nhat Nguyen 3ecdfe1060
Enable trace log in FollowerFailOverIT (#38148)
This suite still fails one per week sometimes with a worrying assertion.
Sadly we are still unable to find the actual source.

Expected: <SeqNoStats{maxSeqNo=229, localCheckpoint=86, globalCheckpoint=86}>
but: was   <SeqNoStats{maxSeqNo=229, localCheckpoint=-1, globalCheckpoint=86}>

This change enables trace log in the suite so we will have a better
picture if this fails again.

Relates #3333
2019-02-01 15:44:39 -05:00
Julie Tibshirani c2e9d13ebd
Default include_type_name to false in the yml test harness. (#38058)
This PR removes the temporary change we made to the yml test harness in #37285
to automatically set `include_type_name` to `true` in index creation requests
if it's not already specified. This is possible now that the vast majority of
index creation requests were updated to be typeless in #37611. A few additional
tests also needed updating here.

Additionally, this PR updates the test harness to set `include_type_name` to
`false` in index creation requests when communicating with 6.x nodes. This
mirrors the logic added in #37611 to allow for typeless document write requests
in test set-up code. With this update in place, we can remove many references
to `include_type_name: false` from the yml tests.
2019-02-01 11:44:13 -08:00
Nhat Nguyen f64b20383e
Replace awaitBusy with assertBusy in atLeastDocsIndexed (#38190)
Unlike assertBusy, awaitBusy does not retry if the code-block throws an
AssertionError. A refresh in atLeastDocsIndexed can fail because we call
this method while we are closing some node in FollowerFailOverIT.
2019-02-01 13:31:17 -05:00
Benjamin Trent 5db305023d
ML: Fix error race condition on stop _all datafeeds and close _all jobs (#38113)
* ML: Ignore when task is not found for _all

* Addressing PR comments

* Update TransportStopDatafeedAction.java
2019-02-01 11:16:35 -06:00
Shaunak Kashyap cc7c42d7e2
Allow built-in monitoring_user role to call GET _xpack API (#38060)
This PR adds the `monitor/xpack/info` cluster-level privilege to the built-in `monitoring_user` role.

This privilege is required for the Monitoring UI to call the `GET _xpack API` on the Monitoring Cluster. It needs to do this in order to determine the license of the Monitoring Cluster, which further determines whether Cluster Alerts are shown to the user or not.

Resolves #37970.
2019-02-01 08:56:34 -08:00
David Roberts 1fa413a16d
[ML] Remove "8" prefixes from file structure finder timestamp formats (#38016)
In 7.x Java timestamp formats are the default timestamp format and
there is no need to prefix them with "8".  (The "8" prefix was used
in 6.7 to distinguish Java timestamp formats from Joda timestamp
formats.)

This change removes the "8" prefixes from timestamp formats in the
output of the ML file structure finder.
2019-02-01 15:36:04 +00:00
Jay Modi 2ca22209cd
Enable TLSv1.3 by default for JDKs with support (#38103)
This commit enables the use of TLSv1.3 with security by enabling us to
properly map `TLSv1.3` in the supported protocols setting to the
algorithm for a SSLContext. Additionally, we also enable TLSv1.3 by
default on JDKs that support it.

An issue was uncovered with the MockWebServer when TLSv1.3 is used that
ultimately winds up in an endless loop when the client does not trust
the server's certificate. Due to this, SSLConfigurationReloaderTests
has been pinned to TLSv1.2.

Closes #32276
2019-02-01 08:34:11 -07:00
Tim Vernum 6fcbd07420
Remove heuristics that enable security on trial licenses (#38075)
In 6.3 trial licenses were changed to default to security
disabled, and ee added some heuristics to detect when security should
be automatically be enabled if `xpack.security.enabled` was not set.

This change removes those heuristics, and requires that security be
explicitly enabled (via the `xpack.security.enabled` setting) for
trial licenses.

Relates: #38009
2019-02-01 17:59:13 +11:00
Tim Brooks 291c4e7a0c
Fix file reading in ccr restore service (#38117)
Currently we use the raw byte array length when calling the IndexInput
read call to determine how many bytes we want to read. However, due to
how BigArrays works, the array length might be longer than the reference
length. This commit fixes the issue and uses the BytesRef length when
calling read. Additionally, it expands the index follow test to index
many more documents. These documents should potentially lead to large
enough segment files to trigger scenarios where this fix matters.
2019-01-31 18:02:24 -07:00
Nhat Nguyen 6c1e9fad47 Mute testAutoFollowing
Tracked at #37231
2019-01-31 16:57:53 -05:00
Benjamin Trent be381b4525
ML: better handle task state race condition (#38040) 2019-01-31 11:07:54 -06:00
Henning Andersen 68ed72b923
Handle scheduler exceptions (#38014)
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of #28667, #36137 and also fixes #37708.
2019-01-31 17:51:45 +01:00
Tim Brooks b8575c6aa3
Update PutFollowAction serialization post-backport (#37989)
This commit modifies the `PutFollowRequest` to reflect the fact that
active shard functionality has been backported to 6.7.
2019-01-31 09:31:22 -07:00
Alpar Torok b7de8e1d1e Mute failing test
Tracking #38100
2019-01-31 17:01:16 +02:00
Alpar Torok f15d7b9b91 Mute failing test
Tracking #38027
2019-01-31 16:55:52 +02:00
Marios Trivyzas 4710a7472f
SQL: Implement FIRST/LAST aggregate functions (#37936)
FIRST and LAST can be used with one argument and work similarly to MIN
and MAX but they are implemented using a Top Hits aggregation and
therefore can also operate on keyword fields. When a second argument is
provided then they return the first/last value of the first arg when its
values are ordered ascending/descending (respectively) by the values of
the second argument. Currently because of the usage of a Top Hits
aggregation FIRST and LAST cannot be used in the HAVING clause of a
GROUP BY query to filter on the results of the aggregation.

Closes: #35639
2019-01-31 16:33:05 +02:00
Luca Cavanna 622fb7883b
Introduce ability to minimize round-trips in CCS (#37828)
With #37566 we have introduced the ability to merge multiple search responses into one. That makes it possible to expose a new way of executing cross-cluster search requests, that makes CCS much faster whenever there is network latency between the CCS coordinating node and the remote clusters. The coordinating node can now send a single search request to each remote cluster, which gets reduced by each one of them. from + size results are requested to each cluster, and the reduce phase in each cluster is non final (meaning that buckets are not pruned and pipeline aggs are not executed). The CCS coordinating node performs an additional, final reduction, which produces one search response out of the multiple responses received from the different clusters.

This new execution path will be activated by default for any CCS request unless a scroll is provided or inner hits are requested as part of field collapsing. The search API accepts now a new parameter called ccs_minimize_roundtrips that allows to opt-out of the default behaviour.

Relates to #32125
2019-01-31 15:12:14 +01:00
Josh Soref 0154052335 spelling: java script -- not JavaScript (#37057) 2019-01-31 14:09:36 +02:00
Tim Vernum cde126dbff
Enable SSL in reindex with security QA tests (#37600)
Update the x-pack/qa/reindex-tests-with-security integration tests to
run with TLS enabled on the Rest interface.

Relates: #37527
2019-01-31 20:59:50 +11:00
Andrei Stefan 22d3290078
SQL: Added SSL configuration options tests (#37875)
* Added SSL configuration options tests
Removed the allow.self.signed option from the documentation since we allow
by default self signed certificates as well.

* Added more tests
2019-01-31 10:52:49 +02:00
Alexander Reelsen b94acb608b
Speed up converting of temporal accessor to zoned date time (#37915)
The existing implementation was slow due to exceptions being thrown if
an accessor did not have a time zone. This implementation queries for
having a timezone, local time and local date and also checks for an
instant preventing to throw an exception and thus speeding up the conversion.

This removes the existing method and create a new one named
DateFormatters.from(TemporalAccessor accessor) to resemble the naming of
the java time ones.

Before this change an epoch millis parser using the toZonedDateTime
method took approximately 50x longer.

Relates #37826
2019-01-31 08:55:40 +01:00
Nhat Nguyen 1a93976ff7
Correct arg names when update mapping/settings from leader (#38063)
These two arguments are not named incorrectly and caused confusion.
2019-01-31 02:45:42 -05:00
Boaz Leskes b11732104f
Move watcher to use seq# and primary term for concurrency control (#37977)
* move watcher to seq# occ

* top level set

* fix parsing and missing setters

* share toXContent for PutResponse and rest end point

* fix redacted password

* fix username reference

* fix deactivate-watch.asciidoc have seq no references

* add seq# + term to activate-watch.asciidoc

* more doc fixes
2019-01-30 20:14:59 -05:00
Jake Landis dad41c2b7f
ILM setPriority corrections for a 0 value (#38001)
This commit fixes the test case that ensures only a priority
less then 0 is used with testNonPositivePriority. This also
allows the HLRC to support a value of 0.

Closes #37652
2019-01-30 17:38:47 -06:00
Tal Levy 7c738fd241
Skip Shrink when numberOfShards not changed (#37953)
Previously, ShrinkAction would fail if
it was executed on an index that had
the same number of shards as the target
shrunken number.

This PR introduced a new BranchingStep that
is used inside of ShrinkAction to branch which
step to move to next, depending on the
shard values. So no shrink will occur if the
shard count is unchanged.
2019-01-30 15:09:17 -08:00
Tim Brooks b88bdfe958
Add dispatching to `HandledTransportAction` (#38050)
This commit allows implementors of the `HandledTransportAction` to
specify what thread the action should be executed on. The motivation for
this commit is that certain CCR requests should be performed on the
generic threadpool.
2019-01-30 15:40:49 -07:00
Jay Modi 54dbf9469c
Update httpclient for JDK 11 TLS engine (#37994)
The apache commons http client implementations recently released
versions that solve TLS compatibility issues with the new TLS engine
that supports TLSv1.3 with JDK 11. This change updates our code to
use these versions since JDK 11 is a supported JDK and we should
allow the use of TLSv1.3.
2019-01-30 14:24:29 -07:00
Tim Brooks aeab55e8d1
Reduce flaxiness of ccr recovery timeouts test (#38035)
This fixes #38027. Currently we assert that all shards have failed.
However, it is possible that some shards do not have segement files
created yet. The action that we block is fetching these segement files
so it is possible that some shards successfully recover.

This commit changes the assertion to ensure that at least some of the
shards have failed.
2019-01-30 14:13:23 -07:00
David Roberts be788160ef
[ML] Datafeed deprecation checks (#38026)
Deprecation checks for the ML datafeed
query and aggregations.
2019-01-30 20:12:20 +00:00
David Turner 81c443c9de
Deprecate minimum_master_nodes (#37868)
Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in
tests, but for 7.x nodes this setting is not required as it has no effect.
This commit removes this setting so that nodes are started with more realistic
configurations, and deprecates it.
2019-01-30 20:09:15 +00:00
Jake Landis 6a78b6a31c
Remove types from watcher docs (#38002)
Types have been deprecated and this commit removes the documentation
for specifying types in the index action, and search input/transform.

Relates #37594 #35190
2019-01-30 13:12:13 -06:00
Martijn van Groningen 5433af28e3
Fixed test bug, lastFollowTime is null if there are no follower indices. 2019-01-30 19:33:16 +01:00
Jason Tedor 6500b0cbd7
Expose retention leases in shard stats (#37991)
This commit exposes retention leases via shard-level stats.
2019-01-30 13:20:40 -05:00
Benjamin Trent 9782aaa1b8
ML: Add reason field in JobTaskState (#38029)
* ML: adding reason to job failure status

* marking reason as nullable

* Update AutodetectProcessManager.java
2019-01-30 11:56:24 -06:00
Albert Zaharovits 53e80e9814 Fix failure in test code ClusterPrivilegeTests
Closes #38030
2019-01-30 16:11:44 +02:00
Benjamin Trent 8280a20664
ML: Add upgrade mode docs, hlrc, and fix bug (#37942)
* ML: Add upgrade mode docs, hlrc, and fix bug

* [DOCS] Fixes build error and edits text

* adjusting docs

* Update docs/reference/ml/apis/set-upgrade-mode.asciidoc

Co-Authored-By: benwtrent <ben.w.trent@gmail.com>

* Update set-upgrade-mode.asciidoc

* Update set-upgrade-mode.asciidoc
2019-01-30 06:51:11 -06:00
Andrei Stefan 908c8def06
SQL: Skip the nested and object field types in case of an ODBC request (#37948) 2019-01-30 11:34:47 +02:00
Adrien Grand c8af0f4bfa
Use mappings to format doc-value fields by default. (#30831)
Doc-value fields now return a value that is based on the mappings rather than
the script implementation by default.

This deprecates the special `use_field_mapping` docvalue format which was added
in #29639 only to ease the transition to 7.x and it is not necessary anymore in
7.0.
2019-01-30 10:31:51 +01:00
Martijn van Groningen f51bc00fcf
Added ccr to xpack usage infrastructure (#37256)
* Added ccr to xpack usage infrastructure

Closes #37221
2019-01-30 07:58:26 +01:00
Tim Vernum 99129d7786
Fix exit code for Security CLI tools (#37956)
The certgen, certutil and saml-metadata tools did not correctly return
their exit code to the calling shell.

These commands now explicitly exit with the code that was returned
from the main(args, terminal) method.
2019-01-30 17:51:11 +11:00
Tim Brooks 55b916afc0
Ensure task metadata not null in follow test (#37993)
This commit fixes a potential race in the IndexFollowingIT. Currently it
is possible that we fetch the task metadata, it is null, and that throws
a null pointer exception. Assertbusy does not catch null pointer
exceptions. This commit assertions that the metadata is not null.
2019-01-29 15:58:31 -07:00
Tim Brooks f3f9cabd67
Add timeout for ccr recovery action (#37840)
This is related to #35975. It adds a action timeout setting that allows
timeouts to be applied to the individual transport actions that are
used during a ccr recovery.
2019-01-29 12:29:06 -07:00
Marios Trivyzas e9332331a3
SQL: Make error msg for validation of 2nd arg of PERCENTILE[_RANK] consistent (#37937)
Use `first` and `second` instead of `1st` and `2nd`.
2019-01-29 21:20:09 +02:00
Albert Zaharovits 697b2fbe52
Remove implicit index monitor privilege (#37774)
Restricted indices (currently only .security-6 and .security) are special
internal indices that require setting the `allow_restricted_indices` flag
on every index permission that covers them. If this flag is `false`
(default) the permission will not cover these and actions against them
will not be authorized.
However, the monitoring APIs were the only exception to this rule.

This exception is herein forfeited and index monitoring privileges have to be
granted explicitly, using the `allow_restricted_indices` flag on the permission,
as is the case for any other index privilege.
2019-01-29 21:10:03 +02:00
Benjamin Trent 34d61d3231
ML: ignore unknown fields for JobTaskState (#37982) 2019-01-29 12:51:34 -06:00
Tim Brooks 00ace369af
Use `CcrRepository` to init follower index (#35719)
This commit modifies the put follow index action to use a
CcrRepository when creating a follower index. It routes 
the logic through the snapshot/restore process. A 
wait_for_active_shards parameter can be used to configure
how long to wait before returning the response.
2019-01-29 11:47:29 -07:00
David Roberts 5f106a27ea
[ML] Add meta information to all ML indices (#37964)
This change adds a _meta field storing the version in which
the index mappings were last updated to the 3 ML indices
that didn't previously have one:

- .ml-annotations
- .ml-meta
- .ml-notifications

All other ML indices already had such a _meta field.

This field will be useful if we ever need to automatically
update the index mappings during a future upgrade.
2019-01-29 15:41:35 +00:00
David Kyle 6d1693ff49 [ML] Prevent submit after autodetect worker is stopped (#37700)
Runnables can be submitted to
AutodetectProcessManager.AutodetectWorkerExecutorService
without error after it has been shutdown which can lead
to requests timing out as their handlers are never called
by the terminated executor.

This change throws an EsRejectedExecutionException if a
runnable is submitted after after the shutdown and calls
AbstractRunnable.onRejection on any tasks not run.

Closes #37108
2019-01-29 15:09:40 +00:00
Luca Cavanna 2325fb9cb3
Remove test only SearchShardTarget constructor (#37912)
Remove SearchShardTarget test only constructor and replace all the usages with calls to the other constructor that accepts a ShardId.
2019-01-29 14:58:11 +01:00
Przemyslaw Gomulka 4f4113e964
Rename security audit.log to _audit.json (#37916)
in order to keep json logs consistent the security audit logs are renamed from .log to .json
relates #32850
2019-01-29 14:53:55 +01:00
Tanguy Leroux 460f10ce60
Close Index API should force a flush if a sync is needed (#37961)
This commit changes the TransportVerifyShardBeforeCloseAction so that it issues a 
forced flush, forcing the translog and the Lucene commit to contain the same max seq 
number and global checkpoint in the case the Translog contains operations that were 
not written in the IndexWriter (like a Delete that touches a non existing doc). This way 
the assertion added in #37426 won't trip.

Related to #33888
2019-01-29 13:15:58 +01:00
Henrique Gonçalves eceb3185c7 [ML] Make GetJobStats work with arbitrary wildcards and groups (#36683)
The /_ml/anomaly_detectors/{job}/_stats endpoint now
works correctly when {job} is a wildcard or job group.

Closes #34745
2019-01-29 09:06:50 +00:00
Dimitris Athanasiou ebe9c95230
[ML] Audit all errors during job deletion (#37933)
This commit moves the auditing of job deletion related errors
to the final listener in the job delete action. This ensures
any error that occurs during job deletion is audited.
2019-01-29 10:23:50 +02:00
Przemyslaw Gomulka 891320f5ac
Elasticsearch support to JSON logging (#36833)
In order to support JSON log format, a custom pattern layout was used and its configuration is enclosed in ESJsonLayout. Users are free to use their own patterns, but if smooth Beats integration is needed, they should use ESJsonLayout. EvilLoggerTests are left intact to make sure user's custom log patterns work fine.

To populate additional fields node.id and cluster.uuid which are not available at start time, 
a cluster state update will have to be received and the values passed to log4j pattern converter.
A ClusterStateObserver.Listener is used to receive only one ClusteStateUpdate. Once update is received the nodeId and clusterUUid are set in a static field in a NodeAndClusterIdConverter. 

Following fields are expected in JSON log lines: type, tiemstamp, level, component, cluster.name, node.name, node.id, cluster.uuid, message, stacktrace
see ESJsonLayout.java for more details and field descriptions

Docker log4j2 configuration is now almost the same as the one use for ES binary. 
The only difference is that docker is using console appenders, whereas ES is using file appenders.

relates: #32850
2019-01-29 07:20:09 +01:00
Like 6ed35fbb94 Support merge nested Map in list for JIRA configurations (#37634)
This commit allows JIRA API fields that require a list of key/value 
pairs (maps), such as JIRA "components" to use use template snippets 
(e.g. {{ctx.payload.foo}}). Prior to this change the templated value 
(not the de-referenced value) would be sent via the API and error. 

Closes #30068
2019-01-28 18:01:09 -06:00
Gordon Brown 49bd8715ff
Inject Unfollow before Rollover and Shrink (#37625)
We inject an Unfollow action before Shrink because the Shrink action
cannot be safely used on a following index, as it may not be fully
caught up with the leader index before the "original" following index is
deleted and replaced with a non-following Shrunken index. The Unfollow
action will verify that 1) the index is marked as "complete", and 2) all
operations up to this point have been replicated from the leader to the
follower before explicitly disconnecting the follower from the leader.

Injecting an Unfollow action before the Rollover action is done mainly
as a convenience: This allow users to use the same lifecycle policy on
both the leader and follower cluster without having to explictly modify
the policy to unfollow the index, while doing what we expect users to
want in most cases.
2019-01-28 14:09:12 -07:00
Nhat Nguyen 557fcf915e
Wait for mapping in testReadRequestsReturnLatestMappingVersion (#37886)
If the index request is executed before the mapping update is applied on
the IndexShard, the index request will perform a dynamic mapping update.
This mapping update will be timeout (i.e, ProcessClusterEventTimeoutException)
because the latch is not open. This leads to the failure of the index
request and the test. This commit makes sure the mapping is ready
before we execute the index request.

Closes #37807
2019-01-28 15:25:56 -05:00
Jake Landis 99b75a9bdf
deprecate types for watcher (#37594)
This commit adds deprecation warnings for index actions
and search actions when executed via watcher. Unit and 
integration tests updated accordingly. 

relates #35190
2019-01-28 13:46:43 -06:00
Luca Cavanna 0a850f032b Handle deprecation warnings in a permissive manner
Relates to #37290
2019-01-28 16:36:39 +01:00
Benjamin Trent 7e4c0e6991
ML: Adds set_upgrade_mode API endpoint (#37837)
* ML: Add MlMetadata.upgrade_mode and API

* Adding tests

* Adding wait conditionals for the upgrade_mode call to return

* Adding tests

* adjusting format and tests

* Adjusting wait conditions for api return and msgs

* adjusting doc tests

* adding upgrade mode tests to black list
2019-01-28 09:07:30 -06:00
David Kyle c0409fb9f0
[ML] Marginal gains in slow multi node QA tests (#37825)
Move 2 tests that are simple rest tests and out of the QA suite and cut the number
of post data calls in ForecastIT
2019-01-28 10:00:59 +00:00
David Roberts 57d321ed5f
[ML] Tighten up use of aliases rather than concrete indices (#37874)
We have read and write aliases for the ML results indices.  However,
the job still had methods that purported to reliably return the name
of the concrete results index being used by the job.  After reindexing
prior to upgrade to 7.x this will be wrong, so the method has been
renamed and the comments made more explicit to say the returned index
name may not be the actual concrete index name for the lifetime of the
job.  Additionally, the selection of indices when deleting the job
has been changed so that it works regardless of concrete index names.

All these changes are nice-to-have for 6.7 and 7.0, but will become
critical if we add rolling results indices in the 7.x release stream
as 6.7 and 7.0 nodes may have to operate in a mixed version cluster
that includes a version that can roll results indices.
2019-01-28 09:38:46 +00:00
Martijn van Groningen 4e1a779773
Prepare ShardFollowNodeTask to bootstrap when it fall behind leader shard (#37562)
* Changed `LuceneSnapshot` to throw an `OperationsMissingException` if the requested ops are missing.
* Changed the shard changes api to handle the `OperationsMissingException` and wrap the exception into `ResourceNotFound` exception and include metadata to indicate the requested range can no longer be retrieved.
* Changed `ShardFollowNodeTask` to handle this `ResourceNotFound` exception with the included metdata header.

Relates to #35975
2019-01-28 09:30:04 +01:00
Dimitrios Liappis 290c6637c2
Refactor into appropriate uses of scheduleUnlessShuttingDown (#37709)
Replace `threadPool().schedule()` / catch
`EsRejectedExecutionException` pattern with direct calls to
`ThreadPool#scheduleUnlessShuttingDown()`.

Closes #36318
2019-01-28 10:01:26 +02:00
Albert Zaharovits 66ddd8d2f7
Create snapshot role (#35820)
This commit introduces the `create_snapshot` cluster privilege and
the `snapshot_user` role.
This role is to be used by "cronable" tools that call the snapshot API
periodically without recurring to the `manage` cluster privilege. The
`create_snapshot` cluster privilege is much more limited compared to
the `manage` privilege.

The `snapshot_user` role grants the privileges to view the metadata of
all indices (including restricted ones, i.e. .security). It obviously grants the
create snapshot privilege but the repository has to be created using another
role. In addition, it grants the privileges to (only) GET repositories and
snapshots, but not create and delete them.

The role does not allow to create repositories. This distinction is important
because snapshotting equates to the `read` index privilege if the user has
control of the snapshot destination, but this is not the case in this instance,
because the role does not grant control over repository configuration.
2019-01-27 23:07:32 +02:00
Jason Tedor 5fddb631a2
Introduce retention lease syncing (#37398)
This commit introduces retention lease syncing from the primary to its
replicas when a new retention lease is added. A follow-up commit will
add a background sync of the retention leases as well so that renewed
retention leases are synced to replicas.
2019-01-27 07:49:56 -05:00
David Roberts cb134470c1
[TEST] Fix MlMappingsUpgradeIT testMappingsUpgrade (#37769)
Made the test tolerant to index upgrade being run
in between the old/mixed/upgraded portions.  This
can occur because the rolling upgrade tests all
share the same indices.

Fixes #37763
2019-01-27 08:27:40 +00:00
David Roberts f2c0c26d15
[ML] Adjust structure finder for Joda to Java time migration (#37306)
The ML file structure finder has always reported both Joda
and Java time format strings.  This change makes the Java time
format strings the ones that are incorporated into mappings
and ingest pipeline definitions.

The BWC syntax of prepending "8" to these formats is used.
This will need to be removed once Java time format strings
become the default in Elasticsearch.

This commit also removes direct imports of Joda classes in the
structure finder unit tests.  Instead the core Joda BWC class
is used.
2019-01-26 20:19:57 +00:00
Julie Tibshirani 7c130d235a Mute CcrRepositoryIT#testFollowerMappingIsUpdated
Tracked in #37887.
2019-01-25 14:55:47 -08:00
Marios Trivyzas d1ff450edc
SQL: Fix casting from date to numeric type to use millis (#37869)
Previously casting from a DATE[TIME] type to a numeric (DOUBLE, LONG,
INT, etc. used seconds instead of the epoch millis.

Fixes: #37655
2019-01-25 23:29:10 +02:00
Benjamin Trent 9e932f4869
ML: removing unnecessary upgrade code (#37879) 2019-01-25 13:57:41 -06:00