Commit Graph

2434 Commits

Author SHA1 Message Date
Gordon Brown 88b9810567
Remove obsolete deprecation checks (#37510)
* Remove obsolete deprecation checks

This also updates the old-indices check to be appropriate for the 7.x
series of releases, and leaves it as the only deprecation check in
place.

* Add toString to DeprecationIssue

* Bring filterChecks across from 6.x

* License headers
2019-01-18 14:24:34 -07:00
Benjamin Trent 12cdf1cba4
ML: Add support for single bucket aggs in Datafeeds (#37544)
Single bucket aggs are now supported in datafeed aggregation configurations.
2019-01-18 15:08:53 -06:00
Benjamin Trent 5384162a42
ML: creating ML State write alias and pointing writes there (#37483)
* ML: creating ML State write alias and pointing writes there

* Moving alias check to openJob method

* adjusting concrete index lookup for ml-state
2019-01-18 14:32:34 -06:00
Martijn van Groningen a3030c51e2 [ILM] Add unfollow action (#36970)
This change adds the unfollow action for CCR follower indices.

This is needed for the shrink action in case an index is a follower index.
This will give the follower index the opportunity to fully catch up with
the leader index, pause index following and unfollow the leader index.
After this the shrink action can safely perform the ilm shrink.

The unfollow action needs to be added to the hot phase and acts as
barrier for going to the next phase (warm or delete phases), so that
follower indices are being unfollowed properly before indices are expected
to go in read-only mode. This allows the force merge action to execute
its steps safely.

The unfollow action has three steps:
* `wait-for-indexing-complete` step: waits for the index in question
  to get the `index.lifecycle.indexing_complete` setting be set to `true`
* `wait-for-follow-shard-tasks` step: waits for all the shard follow tasks
  for the index being handled to report that the leader shard global checkpoint
  is equal to the follower shard global checkpoint.
* `pause-follower-index` step: Pauses index following, necessary to unfollow
* `close-follower-index` step: Closes the index, necessary to unfollow
* `unfollow-follower-index` step: Actually unfollows the index using 
  the CCR Unfollow API
* `open-follower-index` step: Reopens the index now that it is a normal index
* `wait-for-yellow` step: Waits for primary shards to be allocated after
  reopening the index to ensure the index is ready for the next step

In the case of the last two steps, if the index in being handled is
a regular index then the steps acts as a no-op.

Relates to #34648

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
Co-authored-by: Gordon Brown <gordon.brown@elastic.co>
2019-01-18 13:05:03 -07:00
jaymode 642e45e9e6
Fix setting openldap realm ssl config
This change fixes the setup of the SSL configuration for the test
openldap realm. The configuration was missing the realm identifier so
the SSL settings being used were just the default JDK ones that do not
trust the certificate of the idp fixture.

See #37591
2019-01-18 12:24:11 -07:00
Igor Motov 54af8a4e7a
SQL: fix object extraction from sources (#37502)
Throws an exception if hit extractor tries to retrieve unsupported
object. For example, selecting "a" from `{"a": {"b": "c"}}` now throws
an exception instead of returning null.

Relates to #37364
2019-01-18 14:03:48 -05:00
Martijn van Groningen 6846666b6b
Add ccr follow info api (#37408)
* Add ccr follow info api

This api returns all follower indices and per follower index
the provided parameters at put follow / resume follow time and
whether index following is paused or active.

Closes #37127

* iter

* [DOCS] Edits the get follower info API

* [DOCS] Fixes link to remote cluster

* [DOCS] Clarifies descriptions for configured parameters
2019-01-18 16:37:21 +01:00
Ioannis Kakavas 7597b7ce2b
Add validation for empty PutPrivilegeRequest (#37569)
Return an error to the user if the put privilege api is called with
an empty body (no privileges)

Resolves: #37561
2019-01-18 17:06:40 +02:00
Tim Brooks 978c818d0f
Use RestoreSnapshotRequest in CcrRepositoryIT
Commit #37535 removed an internal restore request in favor of the
RestoreSnapshotRequest. Commit #37449 added a new test that used the
internal restore request. This commit modifies the new test to use the
RestoreSnapshotRequest.
2019-01-17 15:31:27 -07:00
Tim Brooks b6f06a48c0
Implement follower rate limiting for file restore (#37449)
This is related to #35975. This commit implements rate limiting on the
follower side using a new class `CombinedRateLimiter`.
2019-01-17 14:58:46 -07:00
Armin Braun 381d035cd6
Remove Redundant RestoreRequest Class (#37535)
* Same as #37464 but for the restore side
2019-01-17 22:23:23 +01:00
Yannick Welsch 6d64a2a901
Propagate Errors in executors to uncaught exception handler (#36137)
This is a continuation of #28667 and has as goal to convert all executors to propagate errors to the
uncaught exception handler. Notable missing ones were the direct executor and the scheduler. This
commit also makes it the property of the executor, not the runnable, to ensure this property. A big
part of this commit also consists of vastly improving the test coverage in this area.
2019-01-17 17:46:35 +01:00
Jake Landis 587034dfa7
Add set_priority action to ILM (#37397)
This commit adds a set_priority action to the hot, warm, and cold
phases for an ILM policy. This action sets the `index.priority`
on the managed index to allow different priorities between the
hot, warm, and cold recoveries.

This commit also includes the HLRC and documentation changes.

closes #36905
2019-01-17 09:55:36 -06:00
Martijn van Groningen b85bfd3e17
Added fatal_exception field for ccr stats in monitoring mapping. (#37563) 2019-01-17 14:04:41 +01:00
Martijn van Groningen 99b09845da
Moved ccr integration to the package with other ccr integration tests. 2019-01-17 13:57:56 +01:00
Marios Trivyzas 1686c32ba9
SQL: Rename SQL type DATE to DATETIME (#37395)
* SQL: Rename SQL data type DATE to DATETIME

SQL data type DATE has only the date part (e.g.: 2019-01-14)
without any time information. Previously the SQL type DATE was
referring to the ES DATE which contains also the time part along
with TZ information. To conform with SQL data types the data type
`DATE` is renamed to `DATETIME`, since it includes also the time,
as a new runtime SQL `DATE` data type will be introduced down the road,
which only contains the date part and meets the SQL standard.

Closes: #36440

* Address comments
2019-01-17 10:17:58 +02:00
Marios Trivyzas ecf0de30ba
SQL: Lowercase the datatypes in validation error msgs (#37524)
To follow the ES convention display the datatypes in lowercase
in error messages thrown during validation if `IN` and conditional
functions.
2019-01-16 18:41:10 +02:00
Marios Trivyzas 2cf4b1863f
SQL: Lowercase es data type (mapping) returned from SQL Commands (#37531)
To follow the ES convention, convert the es data type, returned as
column `mapping` from SQL Commands, to lowercase.

Fixes: #37521
2019-01-16 18:08:33 +02:00
Costin Leau 19d2c29ca6 Remove SYS CATALOGS leftover
Relates #37506
2019-01-16 17:28:06 +02:00
Costin Leau 1f76b5fc31
SQL: Describe aliases as views (#37496)
When reporting metadata, several clients have issues with the 'ALIAS'
type. To improve compatibility and be consistent with the ANSI SQL
expectations and because they are similar, aliases targets are now
reported as views.

Close #37422
2019-01-16 17:26:00 +02:00
Tim Brooks 0b5af276a8
Allow system privilege to execute proxied actions (#37508)
Currently all proxied actions are denied for the `SystemPrivilege`.
Unfortunately, there are use cases (CCR) where we would like to proxy
actions to a remote node that are normally performed by the
system context. This commit allows the system context to perform
proxy actions if they are actions that the system context is normally
allowed to execute.
2019-01-16 07:52:38 -07:00
Andrei Stefan 659326fdd6
SQL: Add protocol tests and remove jdbc_type from drivers response (#37516) 2019-01-16 16:28:46 +02:00
David Kyle 75410dc632 [Ml] Prevent config snapshot failure blocking migration (#37493) 2019-01-16 11:51:15 +00:00
Costin Leau 023bb2f1e4
SQL: Remove slightly used meta commands (#37506)
Remove SYS CATALOGS and SYS TABLE TYPES as they are a subset of SYS
TABLES (and thus somewhat redundant) and used only by JDBC.

Close #37409
2019-01-16 12:36:35 +02:00
Przemyslaw Gomulka 5e94f384c4
Remove the use of AbstracLifecycleComponent constructor #37488 (#37488)
The AbstracLifecycleComponent used to extend AbstractComponent, so it had to pass settings to the constractor of its supper class.
It no longer extends the AbstractComponent so there is no need for this constructor
There is also no need for AbstracLifecycleComponent subclasses to have Settings in their constructors if they were only passing it over to super constructor.
This is part 1. which will be backported to 6.x with a migration guide/deprecation log.
part 2 will have this constructor removed in 7
relates #35560

relates #34488
2019-01-16 09:05:30 +01:00
Hendrik Muhs 15d1b904a1
[ML] log minimum diskspace setting if forecast fails due to insufficient d… (#37486)
log minimum disk space setting if forecast fails due to insufficient disk space
2019-01-16 08:10:13 +01:00
Jay Modi 987576b013
Consistently use loopback address for ssl profile (#37487)
This change fixes failures in the SslMultiPortTests where we attempt to
connect to a profile on a port it is listening on but the connection
fails. The failure is due to the profile being bound to multiple
addresses and randomization will pick one of these addresses to
determine the listening port. However, the address we get the port for
may not be the address we are actually connecting to. In order to
resolve this, the test now sets the bind host for profiles to the
loopback address and uses the same address for connecting.

Closes #37481
2019-01-15 14:03:21 -07:00
David Kyle bea46f7b52
[ML] Migrate unallocated jobs and datafeeds (#37430)
Migrate ml job and datafeed config of open jobs and update
the parameters of the persistent tasks as they become unallocated
during a rolling upgrade. Block allocation of ml persistent tasks
until the configs are migrated.
2019-01-15 18:21:39 +00:00
David Kyle 7c11b05c28
[ML] Remove unused code from the JIndex project (#37477) 2019-01-15 17:19:58 +00:00
Marios Trivyzas 6129e9d9dd Revert "[TEST] Muted TokenBackwardsCompatibilityIT.*"
This reverts commit 65e42ab63b.

The test is only failing in 6.x not master.
2019-01-15 18:25:53 +02:00
Marios Trivyzas 65e42ab63b [TEST] Muted TokenBackwardsCompatibilityIT.*
Relates to #37379
2019-01-15 18:05:20 +02:00
Martijn van Groningen 9554b2fecb
When removing an AutoFollower also mark it as removed. (#37402)
Currently when there are no more auto follow patterns for a remote cluster then
the AutoFollower instance for this remote cluster will be removed. If
a new auto follow pattern for this remote cluster gets added quickly enough
after the last delete then there may be two AutoFollower instance running
for this remote cluster instead of one.

Each AutoFollower instance stops automatically after it sees in the
start() method that there are no more auto follow patterns for the
remote cluster it is tracking. However when an auto follow pattern
gets removed and then added back quickly enough then old AutoFollower
may never detect that at some point there were no auto follow patterns
for the remote cluster it is monitoring. The creation and removal of
an AutoFollower instance happens independently in the `updateAutoFollowers()`
as part of a cluster state update.

By adding the `removed` field, an AutoFollower instance will not miss the
fact there were no auto follow patterns at some point in time. The
`updateAutoFollowers()` method now marks an AutoFollower instance as
removed when it sees that there are no more patterns for a remote cluster.
The updateAutoFollowers() method can then safely start a new AutoFollower
instance.

Relates to #36761
2019-01-15 16:24:19 +01:00
Jay Modi a56aa4f076
Remove SslNullCipherTests from codebase (#37431)
This change deletes the SslNullCipherTests from our codebase since it
will have issues with newer JDK versions and it is essentially testing
JDK functionality rather than our own. The upstream JDK issue for
disabling these ciphers by default is
https://bugs.openjdk.java.net/browse/JDK-8212823.

Closes #37403
2019-01-15 07:52:58 -07:00
David Roberts 7cdf7f882b
[ML] Fix ML datafeed CCS with wildcarded cluster name (#37470)
The test that remote clusters used by ML datafeeds have
a license that allows ML was not accounting for the
possibility that the remote cluster name could be
wildcarded.  This change fixes that omission.

Fixes #36228
2019-01-15 14:19:05 +00:00
Tanguy Leroux e848388865
Fix SourceOnlySnapshotIT (#37461)
The SourceOnlySnapshotIT class tests a source only repository
using the following scenario:
    starts a master node
    starts a data node
    creates a source only repository
    creates an index with documents
    snapshots the index to the source only repository
    deletes the index
    stops the data node
    starts a new data node
    restores the index

Thanks to ESIntegTestCase the index is sometimes created using a custom 
data path. With such a setting, when a shard is assigned to one of the data 
node of the cluster the shard path is resolved using the index custom data 
path and the node's lock id by the NodeEnvironment#resolveCustomLocation().

It should work nicely but in SourceOnlySnapshotIT.snashotAndRestore(), b
efore the change in this PR, the last data node was restarted using a different 
path.home. At startup time this node was assigned a node lock based on other 
locks in the data directory of this temporary path.home which is empty. So it 
always got the 0 lock id. And when this new data node is assigned a shard for
 the index and resolves it against the index custom data path, it also uses the 
node lock id 0 which conflicts with another node of the cluster, resulting in 
various errors with the most obvious one being LockObtainFailedException.

This commit removes the temporary home path for the last data node so that it 
uses the same path home as other nodes of the cluster and then got assigned 
a correct node lock id at startup.

Closes #36330
Closes #36276
2019-01-15 15:03:09 +01:00
Albert Zaharovits a88c050a05
Docs be explicit on how to turn off deprecated auditing (#37316)
Just be explicit about turning off the deprecated audit log appender
because we really want people to turn it off.
2019-01-15 14:29:32 +02:00
Marios Trivyzas b594e81c86
SQL: Fix issue with field names containing "." (#37364)
Adjust FieldExtractor to handle fields which contain `.` in their
name, regardless where they fall in, in the document hierarchy. E.g.:

```
{
  "a.b": "Elastic Search"
}

{
  "a": {
    "b.c": "Elastic Search"
  }
}

{
  "a.b": {
    "c": {
      "d.e" : "Elastic Search"
    }
  }
}
```

Fixes: #37128
2019-01-15 09:41:41 +02:00
Jason Tedor 3bc0711b90
Add simple method to write collection of writeables (#37448)
This commit adds a simple convenience method for writing a collection of
writeables, and replaces existing call sites with the new method.
2019-01-14 21:28:28 -05:00
Julie Tibshirani 36a3b84fc9
Update the default for include_type_name to false. (#37285)
* Default include_type_name to false for get and put mappings.

* Default include_type_name to false for get field mappings.

* Add a constant for the default include_type_name value.

* Default include_type_name to false for get and put index templates.

* Default include_type_name to false for create index.

* Update create index calls in REST documentation to use include_type_name=true.

* Some minor clean-ups around the get index API.

* In REST tests, use include_type_name=true by default for index creation.

* Make sure to use 'expression == false'.

* Clarify the different IndexTemplateMetaData toXContent methods.

* Fix FullClusterRestartIT#testSnapshotRestore.

* Fix the ml_anomalies_default_mappings test.

* Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests.

We make sure to specify include_type_name=true during xContent parsing,
so we continue to test the legacy typed responses. XContent generation
for the typeless responses is currently only covered by REST tests,
but we will be adding unit test coverage for these as we implement
each typeless API in the Java HLRC.

This commit also refactors GetMappingsResponse to follow the same appraoch
as the other mappings-related responses, where we read include_type_name
out of the xContent params, instead of creating a second toXContent method.
This gives better consistency in the response parsing code.

* Fix more REST tests.

* Improve some wording in the create index documentation.

* Add a note about types removal in the create index docs.

* Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL.

* Make sure to mention include_type_name in the REST docs for affected APIs.

* Make sure to use 'expression == false' in FullClusterRestartIT.

* Mention include_type_name in the REST templates docs.
2019-01-14 13:08:01 -08:00
Jay Modi f3edbe2911
Security: remove SSL settings fallback (#36846)
This commit removes the fallback for SSL settings. While this may be
seen as a non user friendly change, the intention behind this change
is to simplify the reasoning needed to understand what is actually
being used for a given SSL configuration. Each configuration now needs
to be explicitly specified as there is no global configuration or
fallback to some other configuration.

Closes #29797
2019-01-14 14:06:22 -07:00
Shaunak Kashyap b86621c157
Adding mapping for hostname field (#37288)
This new `hostname` field is meant to be a replacement for its sibling `name` field. See https://github.com/elastic/beats/pull/9943, particularly https://github.com/elastic/beats/pull/9943#discussion_r245932581.

This PR simply adds the new field (`hostname`) to the mapping without removing the old one (`name`), because a user might be running an older-version Beat (without this field rename in it) with a newer-version Monitoring ES cluster (with this PR's change in it).

AFAICT the Monitoring UI isn't currently using the `name` field so no changes are necessary there yet. If it decides to start using the `name` field, it will also want to look at the value of the `hostname` field.
2019-01-14 12:41:10 -08:00
Tim Brooks 5c68338a1c
Implement ccr file restore (#37130)
This is related to #35975. It implements a file based restore in the
CcrRepository. The restore transfers files from the leader cluster
to the follower cluster. It does not implement any advanced resiliency
features at the moment. Any request failure will end the restore.
2019-01-14 13:07:55 -07:00
David Kyle 2ee55a50bf
[ML] Use String rep of Version in map for serialisation (#37416) 2019-01-14 16:39:47 +00:00
Martijn van Groningen de852765d6
unmuted test
Relates to #37014
2019-01-14 14:27:42 +01:00
Ioannis Kakavas 374e24c7fd Mute SslNullCipherTests on JDK12
JDK12 doesn't support NULL cipher for TLS by default. This commit
mutes these tests on JDK12 until we decide whether we need to keep
or remove them
2019-01-14 10:50:24 +02:00
Alpar Torok a566bacbc8 Upgrade ASM for java 12 compatability (#37385)
Closes #37371
2019-01-13 09:33:39 -08:00
Albert Zaharovits 6fd57d90da
Security Audit includes HTTP method for requests (#37322)
Adds another field, named "request.method", to the structured logfile audit.
This field is present for all events associated with a REST request (not a
transport request) and the value is one of GET, POST, PUT, DELETE, OPTIONS,
HEAD, PATCH, TRACE and CONNECT.
2019-01-13 15:26:23 +02:00
Costin Leau a4339ec7e9
SQL: Use declared source for error messages (#37161)
Improve error messages by returning the original SQL statement
declaration instead of trying to reproduce it as the casing and
whitespaces are not preserved accurately leading to small 
differences.

Close #37161
2019-01-13 01:40:22 +02:00
Marios Trivyzas 359222c55c
SQL: Make `FULL` non-reserved keyword in the grammar (#37377)
Since `full` can be common as a field name or part of a field name
(e.g.: `full.name` or `name.full`), it's nice if it's not a reserved
keyword of the grammar so a user can use it without resorting to quotes.

Fixes: #37376
2019-01-11 23:08:00 +02:00
Marios Trivyzas 85531f0285
SQL: [Tests] Fix and enable internalClusterTests (#37300)
SqlPlugin cannot have more than one public constructor, so for the testing
purposes the `getLicenseState()` should be overriden.

Fixes: #37191

Co-authored-by: Michael Basnight <mbasnight@gmail.com>
2019-01-11 22:43:17 +02:00
Benjamin Trent 5101e51891
ML: Fix testMigrateConfigs (#37373)
* ML: :s/execute/get

* Fixing other broken tests

* unmuting test
2019-01-11 13:29:30 -06:00
Zachary Tong de52ba1f78 Fix RollupDocumentation test to wait for job to stop
Also adds some extra state debug information to various log messages
2019-01-11 14:14:58 -05:00
Gordon Brown 827ece73c8 Mute MlConfigMigratorIT.testMigrateConfigs (#37374) 2019-01-11 11:11:58 -07:00
Gordon Brown 955d3aea19 Mute testRoundRobinWithFailures (#32190) 2019-01-11 09:38:40 -07:00
David Roberts 953fb9352f
[ML] Update error message for process update (#37363)
When this message was first added the model debug config was
the only thing that could be updated, but now more aspects of
the config can be updated so the message needs to be more
general.
2019-01-11 16:31:55 +00:00
Martijn van Groningen e4391afd98
Test fix, wait for auto follower to have stopped in the background
Relates to #36761
2019-01-11 17:26:17 +01:00
Benjamin Trent 19a7e0f4eb
ML: update .ml-state actions to support > 1 index (#37307)
* ML: Updating .ml-state calls to be able to support > 1 index

* Matching bulk delete behavior with dbq

* Adjusting state name

* refreshing indices before search

* fixing line length

* adjusting index expansion options
2019-01-11 08:03:41 -06:00
David Roberts 1da59db3fb
[ML] Wait for autodetect to be ready in the datafeed (#37349)
This is a reinforcement of #37227.  It turns out that
persistent tasks are not made stale if the node they
were running on is restarted and the master node does
not notice this.  The main scenario where this happens
is when minimum master nodes is the same as the number
of nodes in the cluster, so the cluster cannot elect a
master node when any node is restarted.

When an ML node restarts we need the datafeeds for any
jobs that were running on that node to not just wait
until the jobs are allocated, but to wait for the
autodetect process of the job to start up.  In the case
of reassignment of the job persistent task this was
dealt with by the stale status test.  But in the case
where a node restarts but its persistent tasks are not
reassigned we need a deeper test.

Fixes #36810
2019-01-11 13:22:35 +00:00
Alexander Reelsen bbd093059f
Add whitelist to watcher HttpClient (#36817)
This adds a configurable whitelist to the HTTP client in watcher. By
default every URL is allowed to retain BWC. A dynamically configurable
setting named "xpack.http.whitelist" was added that allows to
configure an array of URLs, which can also contain simple regexes.

Closes #29937
2019-01-11 09:22:47 +01:00
Martijn van Groningen 37493c204d
Unmuted test now that #37239 has been merged and backported.
Relates to #37231
2019-01-11 09:02:46 +01:00
Ioannis Kakavas 80084138dd [DOCS] Fix link to role mapping doc 2019-01-11 09:22:40 +02:00
markharwood 434430506b
Type removal - added deprecation warnings to _bulk apis (#36549)
Added warnings checks to existing tests
Added “defaultTypeIfNull” to DocWriteRequest interface so that Bulk requests can override a null choice of document type with any global custom choice.
Related to #35190
2019-01-10 21:35:19 +00:00
Jay Modi e6d3d85db4
Ensure latch is counted down in ssl reload test (#37313)
This change ensures we always countdown the latch in the
SSLConfigurationReloaderTests to prevent the suite from timing out in
case of an exception. Additionally, we also increase the logging of the
resource watcher in case an IOException occurs.

See #36053
2019-01-10 13:27:25 -07:00
Costin Leau 83f7423cd6
SQL: Fix bug regarding alias fields with dots (#37279)
Field of types aliases that have dots in name are returned without a
hierarchy by field_caps, as oppose to the mapping api or field with
concrete types, which in turn breaks IndexResolver.
This commit fixes this by creating the backing hierarchy similar to the
mapping api.

Close #37224
2019-01-10 22:18:53 +02:00
David Roberts b65006e8cd
[ML] Fix ML memory tracker for old jobs (#37311)
Jobs created in version 6.1 or earlier can have a
null model_memory_limit.  If these are parsed from
cluster state following a full cluster restart then
we replace the null with 4096mb to make the meaning
explicit.  But if such jobs are streamed from an
old node in a mixed version cluster this does not
happen.  Therefore we need to account for the
possibility of a null model_memory_limit in the ML
memory tracker.
2019-01-10 17:28:00 +00:00
Jay Modi 71633775fd
Security: reorder realms based on last success (#36878)
This commit reorders the realm list for iteration based on the last
successful authentication for the given principal. This is an
optimization to prevent unnecessary iteration over realms if we can
make a smart guess on which realm to try first.
2019-01-10 09:06:16 -07:00
Martijn van Groningen 6d81e7c3e7
[CCR] FollowingEngine should fail with 403 if operation has no seqno assigned (#37213)
Fail with a 403 when indexing a document directly into a follower index.

In order to test this change, I had to move specific assertions into a dedicated class and
disable assertions for that class in the rest qa module. I think that is the right trade off.
2019-01-10 15:54:34 +01:00
Martijn van Groningen df488720e0
[CCR] Make shard follow tasks more resilient for restarts (#37239)
If a running shard follow task needs to be restarted and
the remote connection seeds have changed then
a shard follow task currently fails with a fatal error.

The change creates the remote client lazily and adjusts
the errors a shard follow task should retry.

This issue was found in test failures in the recently added
ccr rolling upgrade tests. The reason why this issue occurs
more frequently in the rolling upgrade test is because ccr
is setup in local mode (so remote connection seed will become stale) and
all nodes are restarted, which forces the shard follow tasks to get
restarted at some point during the test. Note that these tests
cannot be enabled yet, because this change will need to be backported
to 6.x first. (otherwise the issue still occurs on non upgraded nodes)

I also changed the RestartIndexFollowingIT to setup remote cluster
via persistent settings and to also restart the leader cluster. This
way what happens during the ccr rolling upgrade qa tests, also happens
in this test.

Relates to #37231
2019-01-10 15:02:30 +01:00
Alpar Torok 3d66764660 Mute watcher SingleNodeTests
Tracking:  #36782
2019-01-10 12:23:29 +02:00
Martijn van Groningen 1a41d84536
[CCR] Resume follow Api should not require a request body (#37217)
Closes #37022
2019-01-10 09:48:26 +01:00
Alexander Reelsen b2e8437424
Tests: Add ElasticsearchAssertions.awaitLatch method (#36777)
* Tests: Add ElasticsearchAssertions.awaitLatch method

Some tests are using assertTrue(latch.await(...)) in their code. This
leads to an assertion error without any error message. This adds a
method which has a nicer error message and can be used in tests.

* fix forbidden apis

* fix spaces
2019-01-10 09:25:36 +01:00
Andrei Stefan 4a92de214a
SQL: Proper handling of COUNT(field_name) and COUNT(DISTINCT field_name) (#37254)
* provide overriden `hashCode` and toString methods to account for `DISTINCT`
* change the analyzer for scenarios where `COUNT <field_name>` and `COUNT DISTINCT` have different paths
* defined a new `filter` aggregation encapsulating an `exists` query to filter out null or missing values
2019-01-10 09:51:51 +02:00
Benjamin Trent df3b58cb04
ML: add migrate anomalies assistant (#36643)
* ML: add migrate anomalies assistant

* adjusting failure handling for reindex

* Fixing request and tests

* Adding tests to blacklist

* adjusting test

* test fix: posting data directly to the job instead of relying on datafeed

* adjusting API usage

* adding Todos and adjusting endpoint

* Adding types to reindexRequest

* removing unreliable "live" data test

* adding index refresh to test

* adding index refresh to test

* adding index refresh to yaml test

* fixing bad exists call

* removing todo

* Addressing remove comments

* Adjusting rest endpoint name

* making service have its own logger

* adjusting validity check for newindex names

* fixing typos

* fixing renaming
2019-01-09 14:25:35 -06:00
jaymode c71060fa01
Test: fix race in auth result propagation test
This commit fixes a race condition in a test introduced by #36900 that
verifies concurrent authentications get a result propagated from the
first thread that attempts to authenticate. Previously, a thread may
be in a state where it had not attempted to authenticate when the first
thread that authenticates finishes the authentication, which would
cause the test to fail as there would be an additional authentication
attempt. This change adds additional latches to ensure all threads have
attempted to authenticate before a result gets returned in the
thread that is performing authentication.
2019-01-09 12:17:43 -07:00
Tim Brooks cfa58a51af
Add TLS/SSL channel close timeouts (#37246)
Closing a channel using TLS/SSL requires reading and writing a
CLOSE_NOTIFY message (for pre-1.3 TLS versions). Many implementations do
not actually send the CLOSE_NOTIFY message, which means we are depending
on the TCP close from the other side to ensure channels are closed. In
case there is an issue with this, we need a timeout. This commit adds a
timeout to the channel close process for TLS secured channels.

As part of this change, we need a timer service. We could use the
generic Elasticsearch timeout threadpool. However, it would be nice to
have a local to the nio event loop timer service dedicated to network needs. In
the future this service could support read timeouts, connect timeouts,
request timeouts, etc. This commit adds a basic priority queue backed
service. Since our timeout volume (channel closes) is very low, this
should be fine. However, this can be updated to something more efficient
in the future if needed (timer wheel). Everything being local to the event loop
thread makes the logic simple as no locking or synchronization is necessary.
2019-01-09 11:46:24 -07:00
Alpar Torok 6a5f3f05f4 Fix build on Fips
testing convetions need to be disabled if the test task is for fips.
2019-01-09 19:27:01 +02:00
Martijn van Groningen 9122585359
[CCR] Added more logging. 2019-01-09 12:17:47 +01:00
Tanguy Leroux f1f5d834c3 Merge branch 'close-index-api-refactoring' 2019-01-09 11:48:57 +01:00
David Roberts e0ce73713f
[ML] Stop datafeeds running when their jobs are stale (#37227)
We already had logic to stop datafeeds running against
jobs that were OPENING, but a job that relocates from
one node to another while OPENED stays OPENED, and this
could cause the datafeed to fail when it sent data to
the OPENED job on its new node before it had a
corresponding autodetect process.

This change extends the check to stop datafeeds running
when their job is OPENING _or_ stale (i.e. has not had
its status reset since relocating to a different node).

Relates #36810
2019-01-09 10:42:47 +00:00
Tanguy Leroux 096a83183e Merge branch 'master' into close-index-api-refactoring 2019-01-09 10:52:46 +01:00
David Roberts f14cff2102
[TEST] Ensure interrupted flag reset after test that sets it (#37230)
Test fix to stop a problem in one test leaking into a different
test and causing that other test to spuriously fail.
2019-01-09 08:51:00 +00:00
Tanguy Leroux 7f6fe14b66 Merge branch 'master' into close-index-api-refactoring 2019-01-09 09:26:05 +01:00
Ioannis Kakavas 9049263c2c
Handle malformed license signatures (#37137)
This commit adds a more user friendly error message when a license
signature is malformed/truncated in a way that it cannot be
meaningfully parsed.
2019-01-09 07:29:22 +02:00
Ioannis Kakavas 2a79c468f8 Ensure that ActionListener is called exactly once
This bug was introduced in #36893 and had the effect that
execution would continue after calling onFailure on the the
listener in checkIfTokenIsValid in the case that the token is
expired. In a case of many consecutive requests this could lead to
the unwelcome side effect of an expired access token producing a
successful authentication response.
2019-01-09 07:23:35 +02:00
Marios Trivyzas 5f2fbedd8c
SQL: Replace String.format() with LoggerMessageFormat.format() (#37216)
Fixes: #36532
2019-01-08 23:56:00 +02:00
Martijn van Groningen d6608caf55
Muted rolling upgrade tests.
Relates to #37231
2019-01-08 16:52:22 +01:00
Jay Modi 1514bbcdde
Security: propagate auth result to listeners (#36900)
After #30794, our caching realms limit each principal to a single auth
attempt at a time. This prevents hammering of external servers but can
cause a significant performance hit when requests need to go through a
realm that takes a long time to attempt to authenticate in order to get
to the realm that actually authenticates. In order to address this,
this change will propagate failed results to listeners if they use the
same set of credentials that the authentication attempt used. This does
prevent these stalled requests from retrying the authentication attempt
but the implementation does allow for new requests to retry the
attempt.
2019-01-08 08:52:12 -07:00
Alpar Torok 6344e9a3ce
Testing conventions: add support for checking base classes (#36650) 2019-01-08 13:39:03 +02:00
Tanguy Leroux 6e852dfa7c Merge branch 'master' into close-index-api-refactoring 2019-01-08 11:28:51 +01:00
Martijn van Groningen c980cc12df
Added CCR rolling upgrade tests (#36648)
Added CCR rolling upgrade tests.
2019-01-08 11:05:18 +01:00
Tanguy Leroux d70ebfd1d6 Merge branch 'master' into close-index-api-refactoring 2019-01-08 09:17:48 +01:00
Andrei Stefan 3fad9d25f6
SQL: fix COUNT DISTINCT filtering (#37176)
* Use `_count` aggregation value only for not-DISTINCT COUNT function calls
* COUNT DISTINCT will use the _exact_ version of a field (the `keyword` sub-field for example), if there is one
2019-01-08 08:47:35 +02:00
Jason Tedor c8c596cead
Introduce retention lease expiration (#37195)
This commit implements a straightforward approach to retention lease
expiration. Namely, we inspect which leases are expired when obtaining
the current leases through the replication tracker. At that moment, we
clean the map that persists the retention leases in memory.
2019-01-07 22:03:52 -08:00
Benjamin Trent 6b376a1ff4
ML: fix delayed data annotations on secured cluster (#37193)
* changing executing context for writing annotation

* adjusting user

* removing unused import
2019-01-07 15:18:38 -06:00
Tanguy Leroux 97bf4d7176 Merge branch 'master' into close-index-api-refactoring 2019-01-07 18:38:27 +01:00
Benjamin Trent 1780ced82d
ML: changing JobResultsProvider.getForecastRequestStats to support > 1 index (#37157)
* ML: changing JobResultsProvider.getForecastRequestStats to support more than one index

* moving to use idsQuery()
2019-01-07 10:58:55 -06:00
Christophe Bismuth 9602d794c6 Separate out validation of groups of settings (#34184)
Today, a setting can declare that its validity depends on the values of other
related settings. However, the validity of a setting is not always checked
against the correct values of its dependent settings because those settings'
correct values may not be available when the validator runs.

This commit separates the validation of a settings updates into two phases,
with separate methods on the `Setting.Validator` interface. In the first phase
the setting's validity is checked in isolation, and in the second phase it is
checked again against the values of its related settings. Most settings only
use the first phase, and only the few settings with dependencies make use of
the second phase.
2019-01-07 16:12:58 +00:00
Jason Tedor c0f8c89172
Introduce shard history retention leases (#37167)
This commit is the first in a series which will culminate with
fully-functional shard history retention leases.

Shard history retention leases are aimed at preventing shard history
consumers from having to fallback to expensive file copy operations if
shard history is not available from a certain point. These consumers
include following indices in cross-cluster replication, and local shard
recoveries. A future consumer will be the changes API.

Further, index lifecycle management requires coordinating with some of
these consumers otherwise it could remove the source before all
consumers have finished reading all operations. The notion of shard
history retention leases that we are introducing here will also be used
to address this problem.

Shard history retention leases are a property of the replication group
managed under the authority of the primary. A shard history retention
lease is a combination of an identifier, a retaining sequence number, a
timestamp indicating when the lease was acquired or renewed, and a
string indicating the source of the lease. Being leases they have a
limited lifespan that will expire if not renewed. The idea of these
leases is that all operations above the minimum of all retaining
sequence numbers will be retained during merges (which would otherwise
clear away operations that are soft deleted). These leases will be
periodically persisted to Lucene and restored during recovery, and
broadcast to replicas under certain circumstances.

This commit is merely putting the basics in place. This first commit
only introduces the concept and integrates their use with the soft
delete retention policy. We add some tests to demonstrate the basic
management is correct, and that the soft delete policy is correctly
influenced by the existence of any retention leases. We make no effort
in this commit to implement any of the following:
 - timestamps
 - expiration
 - persistence to and recovery from Lucene
 - handoff during primary relocation
 - sharing retention leases with replicas
 - exposing leases in shard-level statistics
 - integration with cross-cluster replication

These will occur individually in follow-up commits.
2019-01-07 07:43:57 -08:00
Alpar Torok a7c3d5842a
Split third party audit exclusions by type (#36763) 2019-01-07 17:24:19 +02:00
Josh Soref edb48321ba [DOCS] Various spelling corrections (#37046) 2019-01-07 14:44:12 +01:00