Commit Graph

2634 Commits

Author SHA1 Message Date
Hendrik Muhs f4e56118c2 [ML] generate unique doc ids for data frame (#40382)
create and use unique, deterministic document ids based on the grouping values.

This is a pre-requisite for updating documents as well as preventing duplicates after a hard failure during indexing.
2019-03-27 08:27:05 +01:00
Julie Tibshirani 25954a8dd3 Stop clearing all watches in watcher integration tests. (#39724) 2019-03-26 13:14:33 -07:00
alex101101 fb8ad0cf30 Add a soft limit to the field name length (#40309)
Adds an optional limit to the length of field names, throws an IllegalArgumentException if the limit is breached. 
Closes #33651
2019-03-26 17:58:32 +01:00
David Kyle 1354696db9
[ML] Data Frame HLRC Get Stats API (#40443) 2019-03-26 11:17:13 +00:00
Costin Leau 7234d78747 SQL: Fix classpath discovery on Java 10+ (#40420)
(cherry picked from commit 2cef233cb34ee80d8ed9cd014cea76ea5096d206)
2019-03-26 08:16:37 +02:00
Ed Savage c20ea9a2dd [ML][TEST] Fix failing test testPersistJobOnGracefulShutdown_givenTimeAdvancedAfterNoNewData (#40363)
Ensure that there is at least a 1s delay between the time that state
is persisted by each of the two jobs in the test.

Model snapshot IDs use the current time in epoch seconds to
distinguish themselves, hence snapshots will be overwritten
by another if it occurs in the same 1s window.

Closes #40347
2019-03-25 17:55:10 +00:00
Costin Leau 61f49af497 SQL: Spec tests now use classpath discovery (#40388)
To avoid having to specify each spec by hand (which can miss specs to be
added), the test infrastructure now performs classpath discovery so that
each spec added, is automatically considered.

Relates #40358

(cherry picked from commit d0f60b4425c731509aa8ca765d55f563f866ef90)
2019-03-25 15:22:52 +02:00
Benjamin Trent 7b4f964708
[ML] make source and dest objects in the transform config (#40337) (#40396)
* [ML] make source and dest objects in the transform config

* addressing PR comments

* Fixing compilation post merge

* adding comment for Arrays.hashCode

* addressing changes for moving dest to object

* fixing data_frame yml tests

* fixing API test
2019-03-25 07:16:41 -05:00
Hendrik Muhs 38afc9f27d refresh audit index before searching (#40401)
refresh the audit index before searching
2019-03-25 11:57:57 +01:00
Nhat Nguyen b9f96a8e1f
Expose external refreshes through the stats API (#38643)
Right now, the stats API only provides refresh metrics regarding
internal refreshes. This isn't very useful and somewhat misleading for
cluster administrators since the internal refreshes are not indicative
of documents being available for search.

In this PR I added a new metric for collecting external refreshes as
they occur and exposing them through the stats API. Now, calling an
endpoint for stats will yield external refresh metrics as well.

Relates #36712
2019-03-24 22:21:00 -04:00
Benjamin Trent a30bf27b2f
[ML] add auditor to data frame plugin (#40012) (#40394)
* [Data Frame] add auditor

* Adjusting Level, Auditor, and message to address pr comments

* Addressing PR comments
2019-03-23 18:56:44 -05:00
Benjamin Trent 2dd879abac
[ML] adds support for non-numeric mapped types (#40220) (#40380)
* [ML] adds support for non-numeric mapped types and mapping overrides

* correcting hlrc compilation issues after merge

* removing mapping_override option

* clearing up unnecessary changes
2019-03-23 14:04:14 -05:00
Benjamin Trent 88f510ffc2
[ML] making test more determinate (#40374) (#40381)
* [ML] making test more determinate

* unmuting test
2019-03-23 12:15:37 -05:00
Marios Trivyzas 143db10980
SQL: Fix issue timezone issues with JDBC getDate/getTime (#40360)
Previously, `getDate(int columnIdx)/getDate(String columnLabel)` and
were using legacy`java.util.Calendar` instead of the the `java.time.*`
classes to reset to the start of day. This resulted in different results 
for certain timestamps and timezones when calling
`getDate(col)` vs`getObject(col, java.sql.Date)`

Now only the methods (that must be implemented due to the JDBC spec)
`getDate(int columnIdx, Calendar cal)/getDate(String columnLabel, Calendar cal)`
are still using the `java.util.Calendar` for those conversion.

The same change was applied to
`getTime(int columnIdx)/getTime(String columnLabel)`
and
`getTimestamp(int columnIdx)/getTimestamp(String columnLabel)`

Fixes: #40289

(cherry picked from commit 44560671f18397e0c58e3647732880fcb73a5034)
2019-03-23 17:01:08 +01:00
Jason Tedor 10bbb082a4
Only run retention lease actions on active primary (#40386)
In some cases, a request to perform a retention lease action can arrive
on a primary shard before it is active. In this case, the primary shard
would not yet be in primary mode, tripping an assertion in the
replication tracker. Instead, we should not attempt to perform such
actions on an initializing shard. This commit addresses this by not
returning the primary shard in the single shard iterator if the primary
shard is not yet active.
2019-03-23 09:39:39 -04:00
Marios Trivyzas 17b8b54d5e
SQL: Fix metric aggs on date/time to not return double (#40377)
Previously metric aggregations on date fields would return a double
which caused errors when trying to apply scalar functions on top, e.g.:
```
SELECT YEAR(MAX(date)) FROM test
```

Fixes: #40376

(cherry-picked from commit 41d0a038467fbdbbf67fd9bfdf27623451cae63a)
2019-03-23 14:13:38 +01:00
Costin Leau 558adc0f28 SQL: Add missing handling of IP field in JDBC (#40384)
Fix #40358

(cherry picked from commit ee286fa4893817637c05d72b93b254b36efc0dae)
(cherry picked from commit d2296249499e31bd512390ac3d20bc38009612b3)
2019-03-23 12:58:10 +02:00
Andrei Stefan 150d1332cf SQL: Fix RLIKE bug and improve testing for RLIKE statement (#40354)
* Refactor RegexMatch to support both LIKE and RLIKE
* Add integration tests for RLIKE
* Polish the rest of tests

(cherry picked from commit 7562d6eeeb77c04794002649fe726f4b3a9a398b)
2019-03-23 06:37:53 +02:00
Costin Leau 87d3d16c5a SQL: JLine upgrade and polishing (#40321)
Upgrade JLine to 3.10.0
Switch to using JLine granular jars instead of the uber-one
Remove Jansi dependency (due to errors in closing streams)
Pin JNA dependency to our own artifact

Fix #40239

(cherry picked from commit 9afa65fa80111f3b68c13373c7b6db13c11dde31)
2019-03-22 23:55:51 +02:00
Costin Leau 496070fda6 SQL: CAST supports both SQL and ES types (#40365)
Extend CAST to support all data types notations (whether SQL or ES
specific)

Fix #40282

(cherry picked from commit eb2ee8a344da946920598839a5db76c8bb9bc3fe)
2019-03-22 23:55:51 +02:00
Benjamin Trent 05460cca58
Muting test testExtractIndexCheckpointsInconsistentGlobalCheckpoints (#40370) 2019-03-22 13:25:48 -05:00
Hendrik Muhs 5a0c32833e Add a checkpoint service for data frame transforms (#39836)
Add a checkpoint service for data frame transforms, which allows to ask for a checkpoint of the
source. In future these checkpoints will be stored in the internal index to

 - detect upstream changes
 - updating the data frame without a full re-run
 - allow data frame clients to checkpoint themselves
2019-03-22 10:25:30 +01:00
David Turner 1265a15b75 Mute testPersistJobOnGracefulShutdown_givenTimeAdvancedAfterNoNewData 2019-03-22 08:46:51 +00:00
Costin Leau 980ee14f57 DOC: Expand section on ORDER BY aggs (#40332)
(cherry picked from commit 99d2f6fc9864ab972259ef5692129ab49e4a7ab8)
2019-03-22 10:04:52 +02:00
Andrei Stefan f9ab9afcc1 Extract the first value in an array when looking at the returned values (#40318)
(cherry picked from commit faf02e0f42a101985619abc0d30753851605e01d)
2019-03-22 06:43:37 +02:00
Andrei Stefan 35fe05308e SQL: rewrite ROUND and TRUNCATE functions with a different optional parameter handling method (#40242)
* Rewrite Round and Truncate functions to have a slightly different
approach to handling the optional parameter in the constructor. Until now
the optional parameter was considered 0 if the value was missing and the
constructor was filling in this value. The current solution is to have
the optional parameter as null right until the actual calculation is done.

(cherry picked from commit 3e314f8fa4cb322e67949e80857561ce51268726)
2019-03-22 06:43:37 +02:00
Nhat Nguyen 0e12065b54 Relax max_seq_no_of_updates assertion in follow tests
If there's a failover on the follower, then its max_seq_no_of_updates is
bootstrapped from its max_seq_no which might be higher than the
max_seq_no_of_updates of the leader. We need to relax this check.

Relates #40249
2019-03-21 19:41:55 -04:00
Ed Savage 23d5f7babf
[ML] Add integration tests to check persistence (#40272) (#40315)
Additional checks to exercise the behaviour of
persistence on graceful close of an anomaly job.

Related to elastic/ml-cpp#393
Backports #40272
2019-03-21 17:01:10 +00:00
Lisa Cawley e6799849d1 [DOCS] Adds placeholder for start and stop data frame transform APIs (#40278) 2019-03-21 09:39:10 -07:00
Lisa Cawley caa0129d44 [DOCS] Adds placeholder for create and delete data frame transform APIs (#40233) 2019-03-21 09:13:50 -07:00
lcawl 0e712d476e Adds URL for preview data frame transforms 2019-03-21 08:28:23 -07:00
Lisa Cawley ff2bcc9d11 [DOCS] Adds placeholder for get data frame transform APIs (#40283) 2019-03-21 07:57:01 -07:00
Albert Zaharovits 2f80b7304f
Refactor Token Service (#39808)
This refactoring is in the context of the work related to moving security
tokens to a new index. In that regard, the Token Service has to work with
token documents stored in any of the two indices, albeit only as a transient
situation. I reckoned the added complexity as unmanageable,
hence this refactoring.

This is incomplete, as it fails to address the goal of minimizing .security accesses,
but I have stopped because otherwise it would've become a full blown rewrite
(if not already). I will follow-up with more targeted PRs.

In addition to being a true refactoring, some 400 errors moved to 500. Furthermore,
more stringed validation of various return result, has been implemented, notably the
one of the token document creation.
2019-03-21 15:55:56 +02:00
Costin Leau dd41ce0763 SQL: Preserve original source for cast/convert function (#40271)
Improve rule for pruning cast to preserve the original source
Fix #40239

(cherry picked from commit 7591cb1a1577320b3aec2ec557b0f881b6af744f)
2019-03-21 14:08:15 +02:00
Jason Tedor 1e6941b138
Reduce retention lease sync intervals (#40302)
This commit adjusts the frequency with which CCR renews retention leases
and with which primaries sync retention leases to replicas. This helps
Lucene reclaim soft-deleted documents more aggressively, which we have
found in some use-cases can help improve performance, and either way
will help keep disk space under more control.
2019-03-21 07:37:44 -04:00
Andrei Stefan 1a5ff05870 SQL: fix LIKE function equality by considering its pattern as well (#40260)
* Define a equals method for Like function so that the pattern used
is considered in the equality check. Whenever the functions are resolved
this check should be used.

(cherry picked from commit 4e5d5af58a140573b8ee19d57c7839db7b779e3b)
2019-03-21 11:44:57 +02:00
David Kyle a4cb92a300
[ML] Data Frame HLRC Preview API (#40258) 2019-03-21 09:38:27 +00:00
Andrei Stefan d485be631b Moving tests in locale-aware test file (#40254)
(cherry picked from commit 9beb31fd3c5a8323cb08cc524f1a2268e9c72c24)
2019-03-21 10:57:37 +02:00
Yogesh Gaikwad 5d30df5a60
Fix so non super users can also create API keys (#40028) (#40286)
When creating API keys we check for if API key with
the same key name already exists and fail the request if it does.
The check should have been performed with XPackSecurityUser
instead of the authenticated user. This caused the request to fail
in case of the non-super user trying to create an API key.
This commit fixes by executing search action with SECURITY_ORIGIN
so it can be executed with XPackSecurityUser.
Also fixed the Rest test to avoid using a user with `super_user` role.

Closes #40029
2019-03-21 15:53:25 +11:00
Marios Trivyzas e1eb683c51 SQL: Fix issue with getting DATE type in JDBC (#40207)
Previously, calling getDate()/getTime()/getTimestamp() and getObject()
with the corresponding java.sql class on a column of SQL DATE type from
the JDBC result set would throw an Exception.
2019-03-21 01:48:06 +01:00
Benjamin Trent 5ae43855fc
[ML] Refactor GET Transforms API (#40015) (#40269)
* [Data Frame] Refactor GET Transforms API:

* Add pagination
* comma delimited list expression support GET transforms
* Flag troublesome internal code for future refactor

* Removing `allow_no_transforms` param, ratcheting down pageparam option

* Changing  DataFrameFeatureSet#usage to not get all configs

* Intermediate commit

* Writing test for batch data gatherer

* Removing unused import

* removing bad println used for debugging

* Updating BatchedDataIterator comments and query

* addressing pr comments

* disallow null scrollId to cause stackoverflow
2019-03-20 19:14:50 -05:00
Marios Trivyzas f37f2b5d39
SQL: Fix issue with optimization on queries with ORDER BY/LIMIT (#40256)
Previously, when a trival plain `SELECT` or a trivial `SELECT` with
aggregations has also an `ORDER BY` or a `LIMIT` or both, then the
optimization to convert it to a `LocalRelation` was skipped resulting
in exception thrown. E.g.::
```
SELECT 'foo' FROM test LIMIT 10
```
or
```
SELECT 'foo' FROM test GROUP BY 1 ORDER BY 1
```

Fixes: #40211
2019-03-20 23:52:35 +01:00
Marios Trivyzas bc4c8e53c5 SQL: Fix issue with date columns returned always in UTC (#40163)
When selecting columns of ES type `date` (SQL's DATETIME) the
`FieldHitExtractor` was not using the timezone of the client session
but always resorted to UTC. The same behaviour (UTC only) was
encountered also for grouping keys (`CompositeKeyExtractor`) and
for First/Last functions on dates (`TopHitsAggExtractor`).

Fixes: #40152
2019-03-20 20:32:33 +01:00
Like 6f64267626 Make setting index.translog.sync_interval be dynamic (#37382)
Currently, we cannot update index setting index.translog.sync_interval if index is open, because it's
not dynamic which can be updated for closed index only.

Closes #32763
2019-03-20 17:12:45 +01:00
Henning Andersen 4c2a8638ca Cascading primary failure lead to MSU too low (#40249)
If a replica were first reset due to one primary failover and then
promoted (before resync completes), its MSU would not include changes
since global checkpoint, leading to errors during translog replay.

Fixed by re-initializing MSU before restoring local history.
2019-03-20 14:00:43 +01:00
Gordon Brown 85bb5a7f46
Only count some fields types for deprecation check (#40166)
Some field types are not used for queries which use auto-expansion, in
particular, `binary`, `geo_point`, and `geo_shape`. This was causing the
count returned by the deprecation check and the count returned by the
query-time deprecation warning to be misaligned for indices with fields
of those types, with the count returned by the deprecation check being
larger.
2019-03-19 10:52:35 -06:00
David Kyle 387648065d
[ML] Data Frame HLRC start & stop APIs (#40197) 2019-03-19 13:30:01 +00:00
Alexander Reelsen c46dd6ad08 Replace java mail with jakarta mail (#40088)
The eclipse foundation has taken over the javax mail dependency, which
resulted in a naming change of the dependency.
2019-03-19 09:56:44 +01:00
Yannick Welsch 1d8b5fc658 Fail command-line client's auto-URL detection with helpful message (#40151)
The setup-passwords tool gives cryptic messages in case where custom discovery providers are
used (see #33580). As the URL auto-detection logic should be seen as best effort, this commit
improves the exception message to make it clearer what needs to be done to fix the issue.

Relates #33580
2019-03-19 09:04:14 +01:00
Jason Tedor f88e4181ca
Enable reading auto-follow patterns from x-content (#40130)
This named writable was never registered, so it means that we could not
read auto-follow patterns that were registered in the cluster
state. This causes them to be lost on restarts, a bad bug. This commit
addresses this by registering this named writable, and we add a basic
CCR restart test to ensure that CCR keeps functioning properly when the
follower is restarted.
2019-03-18 21:48:44 -04:00
Gordon Brown c8a4a7fc9d
Remove Migration Upgrade and Assistance APIs (#40075)
The Migration Assistance API has been functionally replaced by the
Deprecation Info API, and the Migration Upgrade API is not used for the
transition from ES 6.x to 7.x, and does not need to be kept around to
repair indices that were not properly upgraded before upgrading the
cluster, as was the case in 6.
2019-03-18 13:46:56 -06:00
Nhat Nguyen 38e9522218 Remove wait for cluster state step in peer recovery (#40004)
We introduced WAIT_CLUSTERSTATE action in #19287 (5.0), but then stopped
using it since #25692 (6.0). This change removes that action and related
code in 7.x and 8.0.

Relates #19287
Relates #25692
2019-03-18 15:17:21 -04:00
Jason Tedor 5be12e0999
Safe publication of AutoFollowCoordinator (#40153)
We were leaking a reference to an AutoFollowCoordinator during
construction, violating safe publication according to the JLS
specification. This commit addresses this by waiting to register
AutoFollowCoordinator with the ClusterApplierService after the
AutoFollowCoordinator is fully constructed. We also remove ourselves as
a listener when stopping.
2019-03-18 10:13:41 -04:00
Andrei Stefan 791814bb47 SQL: fix incorrect ordering of groupings (GROUP BY) based on orderings (ORDER BY) (#40087)
* Take into consideration aliases that can be used as aggregates
and in the ORDER BY element so that the groupings are re-ordered inside
the composite aggregation according to the ORDER BY ordering.

(cherry picked from commit 110c0b90b9cf2e9344ab3f412cfa8f8cd94ad71f)
2019-03-18 15:37:45 +02:00
Costin Leau 076a68007c SQL: Add multi_value_field_leniency inside FieldHitExtractor (#40113)
For cases where fields can have multi values, allow the behavior to be
customized through a dedicated configuration field.
By default this will be enabled on the drivers so that existing datasets
work instead of throwing an exception.
For regular SQL usage, the behavior is false so that the user is aware
of the underlying data.

Fix #39700

(cherry picked from commit 2b351571961f172fd59290ee079126bbd081ceaf)
2019-03-18 14:56:03 +02:00
Jason Tedor b8ad337234
Stop auto-followers on shutdown (#40124)
When shutting down a node, auto-followers will keep trying to run. This
is happening even as transport services and other components are being
closed. In some cases, this can lead to a stack overflow as we rapidly
try to check the license state of the remote cluster, can not because
the transport service is shutdown, and then immeidately retry
again. This can happen faster than the shutdown, and we die with stack
overflow. This commit adds a stop command to auto-followers so that this
retry loop occurs at most once on shutdown.
2019-03-18 07:25:31 -04:00
Ioannis Kakavas 3b9a884f92 Throw an exception when unable to read Certificate (#40092)
With SUN security provider, a CertificateException is thrown when
attempting to parse a Certificate from a PEM file on disk with
`sun.security.provider.X509Provider#parseX509orPKCS7Cert`

When using the BouncyCastle Security provider (as we do in fips
tests) the parsing happens in
CertificateFactory#engineGenerateCertificates which doesn't throw
an exception but returns an empty list.

In order to have a consistent behavior, this change makes it so
that we throw a CertificateException when attempting to read
a PEM file from disk and failing to do so in either Security
Provider

Resolves: #39580
2019-03-18 08:46:49 +02:00
Albert Zaharovits 124de8d938 Un-hardcode SecurityIndexManager to handle generic indices (#40064)
`SecurityIndexManager` is hardcoded to handle only the `.security`-`.security-7` alias-index pair.
This commit removes the hardcoded bits, so that the `SecurityIndexManager` can be reused
for other indices, such as the planned security tokens index (`.security-tokens-7`).
2019-03-17 14:46:16 +02:00
Albert Zaharovits 1b75ee0bd7 AuditTrail correctly handle ReplicatedWriteRequest (#39925)
This fix deduplicates index names in `BulkShardRequests` and only audits
the specific resolved index for every comprising `BulkItemRequest`.
2019-03-17 13:05:26 +02:00
David Roberts 64028f3d8f Mute JobResultsProviderIT.testMultipleSimultaneousJobCreations
Due to https://github.com/elastic/elasticsearch/issues/40134
2019-03-17 07:50:08 +00:00
Benjamin Trent 28729eb54c
[ML] fixing sort order (#40119) (#40123) 2019-03-16 17:14:07 -05:00
Jason Tedor 0824eceacf
Add log message for auto-follower timeout
When an auto-follower coordinator times out waiting for the remote
cluster state, we do not log any indication of this. While this is
expected behavior in quiet deployments, it is still useful to see this
information for tracing the behavior of the auto-follow
coordinator. This commit adds a trace log message indicating that the
timeout.
2019-03-16 10:46:20 -04:00
Jason Tedor 86d1d03c37
Remove cluster state size (#40109)
This commit removes the cluster state size field from the cluster state
response, and drops the backwards compatibility layer added in 6.7.0 to
continue to support this field. As calculation of this field was
expensive and had dubious value, we have elected to remove this field.
2019-03-15 17:16:25 -04:00
Igor Motov a019af7690 SQL: Refactor Literals serialization method (#40058)
Since other classes besides intervals can be serialized as part of
the Cursor, the getNamedWritables method should be moved from Intervals
to a more generic class Literals.

Relates to #39973
2019-03-15 14:30:28 -04:00
David Kyle 4eb3683d65 Mute CcrRetentionLeaseIT tests (#40090) 2019-03-15 15:05:47 +00:00
David Kyle 09809bc91b [ML] Avoid assertions on empty Optional in DF usage test (#40043)
Refactor the usage class to make testing simpler
2019-03-15 12:18:29 +00:00
David Roberts 8d01b11918 [ML] Fix race condition when creating multiple jobs (#40049)
If multiple jobs are created together and the anomaly
results index does not exist then some of the jobs could
fail to update the mappings of the results index. This
lead them to fail to write their results correctly later.

Although this scenario sounds rare, it is exactly what
happens if the user creates their first jobs using the
Nginx module in the ML UI.

This change fixes the problem by updating the mappings
of the results index if it is found to exist during a
creation attempt.

Fixes #38785
2019-03-15 10:18:03 +00:00
David Kyle 78a9754318 Mute test NetworkDisruptionIT.testJobRelocation
Relates to #39858
2019-03-15 10:06:31 +00:00
Costin Leau 3960374a6f SQL: Introduce MAD (MedianAbsoluteDeviation) aggregation (#40048)
Add Median Absolute Deviation aggregation

Fix #39597

(cherry picked from commit 4f09613942a9249d06c74da64ad7e6f362e97f56)
2019-03-15 11:45:15 +02:00
Jake Landis e9fa7767ec
Fix test which still uses default type (#39997)
org.elasticsearch.xpack.monitoring.action.MonitoringBulkRequestTests#testAddRequestContent
can still randomly use a defaultType for monitoring. The defaultType
support has been removed as of PR #39888. Prior to its's removal it
would default the type if one is not specified. The _type on the monitoring
bulk end point is currently required, though it is not used as the final index type
(which defaultType would have).

Closes #39980
2019-03-14 10:37:51 -05:00
Jason Tedor d02bca1314
Upgrade the bouncycastle dependency to 1.61 (#40017)
This commit upgrades the bouncycastle dependency from 1.59 to 1.61.
2019-03-14 08:54:47 -04:00
Marios Trivyzas 4e9657f93f SQL: Fix bug with JDBC timezone setting and DATE type (#39978)
Previously, JDBC's REST call to the server was always sending UTC
instead of the timezone passed through connection string/properties.

Moreover the conversion to java.sql.Date was problematic as a
calculation on the epoch millis was used to set the time to 00:00:00.000
and the timezone info was lost. This caused the resulting java.sql.Date
object which is always using the JVM's timezone (no matter what timezone
setting is used in the connection string/properties) to be wrongly created.

Fixes: #39915
2019-03-14 13:41:53 +01:00
Yogesh Gaikwad 59201915db Mute DataFrameFeatureSetTests#testUsage test (#40023) 2019-03-14 10:39:14 +00:00
Andrei Stefan 4d1305b6df SQL: Extend the multi dot field notation extraction to lists of values (#39823)
(cherry picked from commit 300ae485dd08373727ca111a4d21276dd47d9a27)
2019-03-14 11:21:53 +02:00
Benjamin Trent 2016e23285
[ML] Refactor common utils out of ML plugin to XPack.Core (#39976) (#40009)
* [ML] Refactor common utils out of ML plugin to XPack.Core

* implementing GET filters with abstract transport

* removing added rest param

* adjusting how defaults can be supplied
2019-03-13 17:08:43 -05:00
Benjamin Trent 8c6ff5de31
[Data Frame] Refactor PUT transform to not create a task (#39934) (#40010)
* [Data Frame] Refactor PUT transform such that:

 * POST _start creates the task and starts it
 * GET transforms queries docs instead of tasks
 * POST _stop verifies the stored config exists before trying to stop
the task

* Addressing PR comments

* Refactoring DataFrameFeatureSet#usage, decreasing size returned getTransformConfigurations

* fixing failing usage test
2019-03-13 17:08:15 -05:00
Jim Ferenczi 7a7658707a
Upgrade to Lucene release 8.0.0 (#39998)
This commit upgrades to the GA release of Lucene 8

Closes #39640
2019-03-13 18:11:50 +01:00
Dimitris Athanasiou 79e414df86
[ML] Fix datafeed skipping first bucket after lookback when aggs are … (#39859) (#39958)
The problem here was that `DatafeedJob` was updating the last end time searched
based on the `now` even though when there are aggregations, the extactor will
only search up to the floor of `now` against the histogram interval.
This commit fixes the issue by using the end time as calculated by the extractor.

It also adds an integration test that uses aggregations. This test would fail
before this fix. Unfortunately the test is slow as we need to wait for the
datafeed to work in real time.

Closes #39842
2019-03-13 09:09:07 +02:00
Yogesh Gaikwad db04288d14
Add pre-upgrade check to test cluster routing allocation is enabled (#39340) (#39815)
When following the steps mentioned in upgrade guide
https://www.elastic.co/guide/en/elastic-stack/6.6/upgrading-elastic-stack.html
if we disable the cluster shard allocation but fail to enable it after
upgrading the nodes and plugins, the next step of upgrading internal
indices fails. As we did not check the bulk request response for reindexing,
we delete the old index assuming it has been created. This is fatal
as we cannot recover from this state.

This commit adds a pre-upgrade check to test the cluster shard
allocation setting and fail upgrade if it is disabled. In case there
are search or bulk failures then we remove the read-only block and
fail the upgrade index request.

Closes #39339
2019-03-13 09:23:32 +11:00
Michael Basnight 8c78fc096d More lenient socket binding in LDAP tests (#39864)
The LDAP tests attempt to bind all interfaces,
but if for some reason an interface can't be bound
the tests will stall until the suite times out.

This modifies the tests to be a bit more lenient and allow
some binding to fail so long as at least one succeeds.
This allows the test to continue even in more antagonistic
environments.
2019-03-12 12:00:49 -04:00
Gordon Brown da67c2f7f8
Deprecation check for indices with very large numbers of fields (#39869)
Indices with very large numbers of fields (>1024 by default) that do not
have index.query.default_field set will experience query failures in 7.0
for Simple Query String and Multi-Match queries. This deprecation check
issues a warning for indices of that size that do not have
index.query.default_field set.

This also adds a deprecation check for index templates with field counts
that would trigger these query failures as well.
2019-03-12 09:06:31 -06:00
Igor Motov 2f47e3d05a SQL: values in datetime script aggs should be treated as long (#39773)
When a query is translated into script terms agg where key has a date
type, it should generate a terms agg with value_type long instead of
date, otherwise the key gets formatted as a string, which confuses
hit extractor.

Fixes #37042
2019-03-11 17:41:12 -04:00
Jake Landis b0b0f66669
Remove types from internal monitoring templates and bump to api 7 (#39888) (#39926)
This commit removes the "doc" type from monitoring internal indexes.
The template still carries the "_doc" type since that is needed for
the internal representation.

This change impacts the following templates:
monitoring-alerts.json
monitoring-beats.json
monitoring-es.json
monitoring-kibana.json
monitoring-logstash.json

As part of the required changes, the system_api_version has been
bumped from "6" to "7" and support for version "2" has been dropped.

A new empty pipeline is now introduced for the version "7", and
the formerly empty "6" pipeline will now remove the type and re-direct
the request to the "7" index.

Additionally, to due to a difference in the internal representation
(which requires the inclusion of "_doc" type) and external representation
(which requires the exclusion of any type) a helper method is introduced
to help convert internal to external representation, and used by the
monitoring HTTP template exporter.

Relates #38637
2019-03-11 13:17:27 -05:00
Hendrik Muhs d30848eb23 change internal index to index doc_type, id, source and dest (#39913)
change internal index to index doc_type, id, source and dest
2019-03-11 17:35:34 +01:00
Costin Leau 92a87a45bf SQL: Wrap ZonedDateTime parameters inside scripts (#39911)
Painless allows ZonedDateTime objects to be passed natively to scripts
which creates problematic translate queries as the ZonedDateTime is
passed as a string instead.
Wrap this with a dedicated method to perform the conversion.

Fix #39877

(cherry picked from commit 4957cad5bda77257d10430ac102e93f5e062148a)
2019-03-11 17:44:03 +02:00
David Kyle 48788269b0
[ML] Correct small inconsistencies in ml APIs spec and docs (#39907) 2019-03-11 14:02:50 +00:00
Costin Leau a079b9fd6d SQL: ConstantProcessor can now handle NamedWriteable (#39876)
Enhance ConstantProcessor to properly serialize complex objects
(Intervals) that have their own custom serialization/deserialization
mechanism

Fix #39875

(cherry picked from commit ed8a1f9340673e69a44ea7a89679cadb4762e43d)
2019-03-11 12:49:23 +02:00
Martijn van Groningen 8925a2c6c2
Further tweak AutoFollowIT#testAutoFollowManyIndices:
* reduce the number of leader indices to be auto followed
* also check the number of follower indices being created
* also check the whether leader indices are marked as auto followed

Relates to #36761
2019-03-11 10:01:56 +01:00
Daniel Mitterdorfer 1bc31aca03
Mute CcrRetentionLeaseIT#testRetentionLeaseRenewalIsCancelledWhenFollowingIsPaused (#39897)
Relates #39509
2019-03-11 08:47:51 +01:00
Adrien Grand b841de2e38
Don't emit deprecation warnings on calls to the monitoring bulk API. (#39805) (#39838)
The monitoring bulk API accepts the same format as the bulk API, yet its concept
of types is different from "mapping types" and the deprecation warning is only
emitted as a side-effect of this API reusing the parsing logic of bulk requests.

This commit extracts the parsing logic from `_bulk` into its own class with a
new flag that allows to configure whether usage of `_type` should emit a warning
or not. Support for payloads has been removed for simplicity since they were
unused.

@jakelandis has a separate change that removes this notion of type from the
monitoring bulk API that we are considering bringing to 8.0.
2019-03-11 07:58:28 +01:00
Benjamin Trent 4da04616c9
[ML] refactoring lazy query and agg parsing (#39776) (#39881)
* [ML] refactoring lazy query and agg parsing

* Clean up and addressing PR comments

* removing unnecessary try/catch block

* removing bad call to logger

* removing unused import

* fixing bwc test failure due to serialization and config migrator test

* fixing style issues

* Adjusting DafafeedUpdate class serialization

* Adding todo for refactor in v8

* Making query non-optional so it does not write a boolean byte
2019-03-10 14:54:02 -05:00
Benjamin Trent 6c6549fc51
[Data-Frame] make the config be strictly parsed on _preview (#39713) (#39873)
* [Data-Frame] make the config be strictly parsed on _preview

* adding test to verify strictly parsing

* adjusting test after master merge
2019-03-09 14:03:57 -06:00
Jason Tedor 73a672b8dd
Fix Watcher stats class cast exception (#39821)
The watcher stats implementation tries to look at all queued watches
before preparing the result. We want to cast these to a
WatchExecutionTask to extract the context to prepare the stats for
queued watches. The problem is that not all tasks on the watcher queue
were WatchExecutionTask. This is because a manually executed watch was
not even at all wrapped in a WatchExecutionTask. Moreover, we were using
ExecutorService#submit(Runnable) which would wrap the Runnable in a
FutureTask<?>. This commit addresses this by using a WatchExecutionTask,
and also using ExecutorService#execute(Runnable) so that no wrapping
occurs. This will let us continue with the assumption that all queued
tasks are WatchExecutionTasks.
2019-03-08 14:52:10 -05:00
Ryan Ernst 465343f12a
Bundle java in distributions (#38013)
* Bundle java in distributions

Setting up a jdk is currently a required external step when installing
elasticsearch. This is particularly problematic for the rpm/deb packages
as installing a jdk in the same package installation command does not
guarantee any order, so must be done in separate steps. Additionally,
JAVA_HOME must be set and often causes problems in selecting a correct
jdk when, for example, the system java is an older unsupported version.

This commit bundles platform specific openjdks into each distribution.
In addition to eliminating the issues above, it also presents future
possible improvements like using jlink to build jdk images only
containing modules that elasticsearch uses.

closes #31845
2019-03-08 11:04:18 -08:00
Jake Landis e0abc3ce96
Remove the index type from internal watcher indexes (#39761) (#39853)
This commit removes the "doc" type from watcher internal indexes.
The template still carries the "_doc" type since that is needed for
the internal representation.

This impacts the .watches, .triggered-watches, and .watch-history indexes.

External consumers do not need any changes since all external calls
go through the _watcher API, and should not interact with the the .index directly.

Relates #38637
2019-03-08 12:46:36 -06:00
Albert Zaharovits 3c7fafd0cc Fix token invalidation when retries exhausted (#39799)
Fixes an error about missing to call the index invalidation listener
when retry count is exhausted but there are still tokens to be retried.
2019-03-08 20:18:59 +02:00
Jason Tedor 6675bafc49
Simplify CcrRetentionLeaseIT#testForgetFollower
This test was more complicated than necessary, where we were capturing
requests to prevent removal of retention leases, so that our forget
follower request could remove the retention leases instead. Instead, a
pause is enough to ensure that the retention leases are not re-added
after we remove them by the forget follower request. This commit
simplifies this test, and should remove some spurious failures.

Relates #39850
2019-03-08 12:33:17 -05:00
Jake Landis a8530c5531
Update logstash-management.json to use typeless template (#38653) (#39819)
This commit changes the type from "doc" to "_doc" for the
.logstash-management template. Since this is an internally
managed template it does not always go through the REST
layer for it's internal representation.  The internal
representation requires the default "_doc" type, which for
external templates is added in the REST layer.

Related #38637
2019-03-08 08:23:30 -06:00
David Kyle 6c2e831e94
[ML-Dataframe] Data frame config HLRC objects (#39825) 2019-03-08 12:18:55 +00:00
Martijn van Groningen 8666aa1ed2
unmuted and tweaked test
Relates to #36761
2019-03-08 12:43:23 +01:00
Hendrik Muhs 50d742320d store the doc type in the internal index (#39824)
store the doc type in the internal data frame index
2019-03-08 12:17:23 +01:00
Lee Hinman 8ec456b5df Maintain step order for ILM trace logging (#39522)
When trace logging is enabled we log the computed steps for a policy. This
commit makes sure that the steps that are logged are in the same order they will
be run when the policy executes. This makes it much easier to reason about the
policy if the move-to-step API is ever required in the future.
2019-03-07 11:37:58 -07:00
Hendrik Muhs 4d41310be5 [ML-DataFrame] fix wire serialization issues in data frame response objects (#39790)
fix wire serialization issues in data frame response objects
2019-03-07 19:28:44 +01:00
Tim Brooks 8043fefcf6
Log close_notify during handshake at debug level (#39715)
A TLS handshake requires exchanging multiple messages to initiate a
session. If one side decides to close during the handshake, it is
supposed to send a close_notify alert (similar to closing during
application data exchange). The java SSLEngine engine throws an
exception when this happens. We currently log this at the warn level if
trace logging is not enabled. This level is too high for a valid
scenario. Additionally it happens all the time in tests (quickly closing
and opened transports). This commit changes this to be logged at the
debug level if trace is not enabled. Additionally, it extracts the
transport security exception handling to a common class.
2019-03-07 09:52:18 -07:00
Jason Tedor 0250d554b6
Introduce forget follower API (#39718)
This commit introduces the forget follower API. This API is needed in cases that
unfollowing a following index fails to remove the shard history retention leases
on the leader index. This can happen explicitly through user action, or
implicitly through an index managed by ILM. When this occurs, history will be
retained longer than necessary. While the retention lease will eventually
expire, it can be expensive to allow history to persist for that long, and also
prevent ILM from performing actions like shrink on the leader index. As such, we
introduce an API to allow for manual removal of the shard history retention
leases in this case.
2019-03-07 11:08:45 -05:00
Ioannis Kakavas 6c19d872a0 Fix testRefreshingMultipleTimesWithinWindowSucceeds (#39701)
Previously all the threads were writing the received tokens to a
HashSet. In cases with many threads, sometimes (1 every ~25 tests)
calling size() on the HashSet returned 2 even though it seemed to
contain only one String and there was no evidence from logging that
threadSecurityClient.refreshToken() ever returned a different
access or refresh token.

This commit changes the test to use a ConcurrentHashMap instead,
checking that we only received one pair of access token/refresh token
eventually. It also adds a check so that we won't take into consideration
tokens that are returned after 30s, hence not in the concurrent refresh
time window.
2019-03-07 13:13:50 +02:00
Przemyslaw Gomulka 95bed81198
Change licence expiration date pattern Backport(#39681) #39781
Due to migration from joda to java.time licence expiration 'full date' format
has to use 4-char pattern (MMMM). Also since jdk9 the date with ROOT
locale will still return abbreviated days and month names.

closes #39136
backport #39681
2019-03-07 12:06:18 +01:00
Nhat Nguyen 83688ce2d4 Unmute testFollowIndexAndCloseNode
Resolved in #39584
2019-03-06 22:39:13 -05:00
Nhat Nguyen 3591da6ff8 Simplify FrozenEngine#getReader (#39539)
We really don’t need a try/finally in this method.
2019-03-06 15:30:55 -05:00
Albert Zaharovits fb1005fffc
Fix Token Service retry mechanism (#39639)
Fixes several errors of the token retry logic:

* not checking for backoff.hasNext() before calling backoff.next()
* checking for backoff.hasNext() without calling backoff.next()
* not preserving the context on the retry
* calling scheduleWithFixedDelay instead of schedule
2019-03-06 15:32:23 +02:00
David Roberts 5f8f91c03b
[ML] Use scaling thread pool and xpack.ml.max_open_jobs cluster-wide dynamic (#39736)
This change does the following:

1. Makes the per-node setting xpack.ml.max_open_jobs
   into a cluster-wide dynamic setting
2. Changes the job node selection to continue to use the
   per-node attributes storing the maximum number of open
   jobs if any node in the cluster is older than 7.1, and
   use the dynamic cluster-wide setting if all nodes are on
   7.1 or later
3. Changes the docs to reflect this
4. Changes the thread pools for native process communication
   from fixed size to scaling, to support the dynamic nature
   of xpack.ml.max_open_jobs
5. Renames the autodetect thread pool to the job comms
   thread pool to make clear that it will be used for other
   types of ML jobs (data frame analytics in particular)

Backport of #39320
2019-03-06 12:29:34 +00:00
David Turner 77dd711847 Tidy up GroupedActionListener (#39633)
Today the `GroupedActionListener` accepts a `defaults` parameter but all
callers pass an empty list. Also it is permitted to pass an empty group but
this is trappy because the delegated listener is never be called in that case.
This commit removes the `defaults` parameter and forbids an empty group.
2019-03-06 09:25:10 +00:00
Yogesh Gaikwad c91dcbd5ee
Types removal security index template (#39705) (#39728)
As we are moving to single type indices,
we need to address this change in security-related indexes.
To address this, we are
- updating index templates to use preferred type name `_doc`
- updating the API calls to use preferred type name `_doc`

Upgrade impact:-
In case of an upgrade from 6.x, the security index has type
`doc` and this will keep working as there is a single type and `_doc`
works as an alias to an existing type. The change is handled in the
`SecurityIndexManager` when we load mappings and settings from
the template. Previously, we used to do a `PutIndexTemplateRequest`
with the mapping source JSON with the type name. This has been
modified to remove the type name from the source.
So in the case of an upgrade, the `doc` type is updated
whereas for fresh installs `_doc` is updated. This happens as
backend handles `_doc` as an alias to the existing type name.

An optional step is to `reindex` security index and update the
type to `_doc`.

Since we do not support the security audit log index,
that template has been deleted.

Relates: #38637
2019-03-06 18:53:59 +11:00
Jason Tedor 75a0d4f470
Rename retention lease setting (#39719)
This commit renames the retention lease setting
index.soft_deletes.retention.lease so that it is under the namespace
index.soft_deletes.retention_lease. As such, we rename the setting to
index.soft_deletes.retention_lease.period.
2019-03-05 22:04:45 -05:00
Gordon Brown eb288a6f85
Use any index specified by .watches for Watcher (#39541) (#39708)
Previously, Watcher only attached its listener to indices that started
with the prefix `.watches`, which causes Watcher to silently fail to
schedule newly created Watches if the `.watches` alias is redirected to
an index that does not start with `.watches`.

Watcher now attaches the listener to all indices, so that Watcher can
respond to changes in which index has the `.watches` alias.

Also adjusts the tests to randomly use non-prefixed concrete indices 
for .watches and .triggered_watches.
2019-03-05 11:45:34 -07:00
Tomas Della Vedova fad52acf5a Removed incorrect ML YAML tests (#39400)
A client cannot know that a job_id is already taken, so
this test should not have been specified as a client test
2019-03-05 17:13:10 +00:00
David Roberts e94d32d069 Add roles and cluster privileges for data frame transforms (#39661)
This change adds two new cluster privileges:

* manage_data_frame_transforms
* monitor_data_frame_transforms

And two new built-in roles:

* data_frame_transforms_admin
* data_frame_transforms_user

These permit access to the data frame transform endpoints.
(Index privileges are also required on the source and
destination indices for each data frame transform, but
since these indices are configurable they it is not
appropriate to grant them via built-in roles.)
2019-03-05 14:07:25 +00:00
Simon Willnauer d112c89041 Allow inclusion of unloaded segments in stats (#39512)
Today we have no chance to fetch actual segment stats for segments that
are currently unloaded. This is relevant in the case of frozen indices.
This allows to monitor how much memory a frozen index would use if it was
unfrozen.
2019-03-05 14:02:20 +01:00
Ioannis Kakavas 7ed9d52824
Support concurrent refresh of refresh tokens (#39647)
This is a backport of #39631

Co-authored-by: Jay Modi jaymode@users.noreply.github.com

This change adds support for the concurrent refresh of access
tokens as described in #36872
In short it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added in the token document. One
contains the instant (in epoqueMillis) when a given refresh token is refreshed
and one that contains a pointer to the token document that stores the new
refresh token and access token that was created by the original refresh.
A side effect of this change, that was however also a intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. ( It was necessary as we
correctly used a new IV for every time we encrypted a token in serialization, so
subsequent serializations of the same exact UserToken would produce
different access token strings)

This change also handles the serialization/deserialization BWC logic:

    In mixed clusters we keep creating tokens in the old format and
    consume only old format tokens
    In upgraded clusters, we start creating tokens in the new format but
    still remain able to consume old format tokens (that could have been
    created during the rolling upgrade and are still valid)
    When reading/writing TokensInvalidationResult objects, we take into
    consideration that pre 7.1.0 these contained an integer field that carried
    the attempt count

Resolves #36872
2019-03-05 14:55:59 +02:00
Albert Zaharovits e7dbfda5d3 Fix security index auto-create and state recovery race (#39582)
Previously, the security index could be wrongfully recreated. This might
happen if the index was interpreted as missing, as in the case of a fresh
install, but the index existed and the state did not yet recover.

This fix will return HTTP SERVICE_UNAVAILABLE (503) for requests that
try to write to the security index before the state has not been recovered yet.
2019-03-05 12:47:59 +02:00
Dimitris Athanasiou 5c023770d2 [ML] Disable security audit trail in native integ tests suite (#39683)
Investigating how to make DeleteExpiredDataIT faster, it was
revealed that the security audit trail threads were quite hot.
Disabling that seems to be helping quite a bit with making this
test faster. This commit also unmutes the test to see how it goes
with the audit trail disabled.

Relates #39658
Closes #39575
2019-03-05 12:43:15 +02:00
Nhat Nguyen af4918ebff Simplify AutoFollowCoordinator with GroupedListener (#39603)
This change simplifies AutoFollowCoordinator by replacing a combination
of AtomicArray and CountDown with GroupedActionListener.
2019-03-04 13:50:27 -05:00
Marios Trivyzas c72a7998f5
SQL: Don't allow inexact fields for MIN/MAX (#39563)
MIN/MAX on strings are supported and are implemented with
TopAggs FIRST/LAST respectively, but they cannot operate on
`text` fields without underlying `keyword` fields => inexact.

Follows: #39427
2019-03-04 15:35:11 +01:00
Martijn Laarman 52ecf18dc4
Index on rollup.rollup_search.json is a list (#39097) (#39653)
And not a string since it accepts comma separated list of indices.

(cherry picked from commit cf34d50b3a983b5fc0c9c7aa279cecd4aa10e28b)
2019-03-04 15:23:18 +01:00
Martijn Laarman c2a94aabbc
ilm.explain_lifecycle documents human again (#39113) (#39648)
This is already exposed as a `_common.json` global parameter.

(cherry picked from commit e84050c0307bb5d5cea8eacc6b63b34248a41a01)
2019-03-04 15:23:01 +01:00
Martijn Laarman 9788036857
metric on watcher stats is a list not an enum (#39114) (#39645)
`enum` is a single option from a known list of `options`
`list` is an array of unknown values
`flags` are multiple options from a list of known `options`.

We don't support the `flags` type but a `list` with `options` acts as one. This is already the case for other API's taking metric such as `node.stats.json`. 

watcher.stats behaves the same as other API's as `metrics` and as such accepts the following `GET _xpack/watcher/stats/queued_watches,current_watches`

(cherry picked from commit 4c00a025b8ac9b397b27c4ae2f799553d6499412)
2019-03-04 15:22:44 +01:00
Martijn Laarman 7c69fd9e44
parts documented as optional are actually required (#39122) (#39641)
(cherry picked from commit e0f728b44ad49e28477767b3ee783a07ddf4bb0d)
2019-03-04 15:22:26 +01:00
David Kyle a58145f9e6
[ML] Transition to typeless (mapping) APIs (#39573)
ML has historically used doc as the single mapping type but reindex in 7.x
will change the mapping to _doc. Switching to the typeless APIs handles 
case where the mapping type is either doc or _doc. This change removes
deprecated typed usages.
2019-03-04 13:52:05 +00:00
David Kyle c7a2910cc1
[Ml-Dataframe] Register Data Frame named writables and xcontents (#39635)
Register types in the Dataframe plugin
2019-03-04 11:48:03 +00:00
Yannick Welsch 0f65390c29 Do not mutate engine during planning step (#39571)
This cleans up the Engine implementation by separating the sequence number generation from the
planning step in the engine, to avoid for the planning step to have any side effects. This makes it
easier to see that every sequence number is properly accounted for.
2019-03-04 10:11:39 +01:00
Tim Vernum 834a88abf9 Mute failing test on FIPS JVM
Relates: #39580
Backport of: #39616
2019-03-04 12:57:51 +11:00
David Roberts 085ff38122 Mute DeleteExpiredDataIT.testDeleteExpiredData
Due to https://github.com/elastic/elasticsearch/issues/39575
2019-03-03 18:34:30 +00:00
Costin Leau e038ccef13 SQL: Fix merging of incompatible multi-fields (#39560)
Fix bug in IndexResolver that caused conflicts in multi-field types to
be ignored up (causing the query to fail later on due to mapping
conflicts).
The issue was caused by the multi-field which forced the parent creation
before checking its validity across mappings

Fix #39547

(cherry picked from commit 4e4fe289f90b9b5eae09072d54903701a3128696)
2019-03-02 10:30:02 +02:00
Costin Leau dfe81b260e SQL: Enable accurate hit tracking on demand (#39527)
Queries that require counting of all hits (COUNT(*) on implicit
group by), now enable accurate hit tracking.

Fix #37971

(cherry picked from commit 265b637cf6df08986a890b8b5daf012c2b0c1699)
2019-03-01 23:09:04 +02:00
Dimitris Athanasiou 8843832039 [ML] Shave off DeleteExpiredDataIT runtime (#39557)
This commit parallelizes some parts of the test
and its remove an unnecessary refresh call.
On my local machine it shaves off about 15 seconds
for a test execution time of ~64s (down from ~80s).
This test is still slow but progress over perfection.

Relates #37339
2019-03-01 19:10:00 +02:00
Tanguy Leroux 0c6b7cfb77 Revert "Support concurrent refresh of refresh tokens (#39559)"
This reverts commit e2599214e0.
2019-03-01 17:59:45 +01:00
Ioannis Kakavas e2599214e0
Support concurrent refresh of refresh tokens (#39559)
This is a backport of #38382

This change adds supports for the concurrent refresh of access
tokens as described in #36872
In short it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added in the token document. One
contains the instant (in epoqueMillis) when a given refresh token is refreshed
and one that contains a pointer to the token document that stores the new
refresh token and access token that was created by the original refresh.
A side effect of this change, that was however also a intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. ( It was necessary as we
correctly used a new IV for every time we encrypted a token in serialization, so
subsequent serializations of the same exact UserToken would produce
different access token strings)

This change also handles the serialization/deserialization BWC logic:

- In mixed clusters we keep creating tokens in the old format and
consume only old format tokens
- In upgraded clusters, we start creating tokens in the new format but
still remain able to consume old format tokens (that could have been
created during the rolling upgrade and are still valid)

Resolves #36872

Co-authored-by: Jay Modi jaymode@users.noreply.github.com
2019-03-01 16:00:07 +02:00
Tanguy Leroux e005eeb0b3
Backport support for replicating closed indices to 7.x (#39506)(#39499)
Backport support for replicating closed indices (#39499)
    
    Before this change, closed indexes were simply not replicated. It was therefore
    possible to close an index and then decommission a data node without knowing
    that this data node contained shards of the closed index, potentially leading to
    data loss. Shards of closed indices were not completely taken into account when
    balancing the shards within the cluster, or automatically replicated through shard
    copies, and they were not easily movable from node A to node B using APIs like
    Cluster Reroute without being fully reopened and closed again.
    
    This commit changes the logic executed when closing an index, so that its shards
    are not just removed and forgotten but are instead reinitialized and reallocated on
    data nodes using an engine implementation which does not allow searching or
     indexing, which has a low memory overhead (compared with searchable/indexable
    opened shards) and which allows shards to be recovered from peer or promoted
    as primaries when needed.
    
    This new closing logic is built on top of the new Close Index API introduced in
    6.7.0 (#37359). Some pre-closing sanity checks are executed on the shards before
    closing them, and closing an index on a 8.0 cluster will reinitialize the index shards
    and therefore impact the cluster health.
    
    Some APIs have been adapted to make them work with closed indices:
    - Cluster Health API
    - Cluster Reroute API
    - Cluster Allocation Explain API
    - Recovery API
    - Cat Indices
    - Cat Shards
    - Cat Health
    - Cat Recovery
    
    This commit contains all the following changes (most recent first):
    * c6c42a1 Adapt NoOpEngineTests after #39006
    * 3f9993d Wait for shards to be active after closing indices (#38854)
    * 5e7a428 Adapt the Cluster Health API to closed indices (#39364)
    * 3e61939 Adapt CloseFollowerIndexIT for replicated closed indices (#38767)
    * 71f5c34 Recover closed indices after a full cluster restart (#39249)
    * 4db7fd9 Adapt the Recovery API for closed indices (#38421)
    * 4fd1bb2 Adapt more tests suites to closed indices (#39186)
    * 0519016 Add replica to primary promotion test for closed indices (#39110)
    * b756f6c Test the Cluster Shard Allocation Explain API with closed indices (#38631)
    * c484c66 Remove index routing table of closed indices in mixed versions clusters (#38955)
    * 00f1828 Mute CloseFollowerIndexIT.testCloseAndReopenFollowerIndex()
    * e845b0a Do not schedule Refresh/Translog/GlobalCheckpoint tasks for closed indices (#38329)
    * cf9a015 Adapt testIndexCanChangeCustomDataPath for replicated closed indices (#38327)
    * b9becdd Adapt testPendingTasks() for replicated closed indices (#38326)
    * 02cc730 Allow shards of closed indices to be replicated as regular shards (#38024)
    * e53a9be Fix compilation error in IndexShardIT after merge with master
    * cae4155 Relax NoOpEngine constraints (#37413)
    * 54d110b [RCI] Adapt NoOpEngine to latest FrozenEngine changes
    * c63fd69 [RCI] Add NoOpEngine for closed indices (#33903)
    
    Relates to #33888
2019-03-01 14:48:26 +01:00
Andrei Stefan 06d0e0efad Removed custom naming for DISTINCT COUNT (#39537)
(cherry picked from commit 9412a2ee01a60dd6449bbced1273ec0b37b65589)
2019-03-01 15:26:32 +02:00
Andrei Stefan ba44f28340 SQL: ignore UNSUPPORTED fields for JDBC and ODBC modes in 'SYS COLUMNS' (#39518)
* SYS COLUMNS will skip UNSUPPORTED field types in ODBC and JDBC, as well.
NESTED and OBJECT types were already skipped in ODBC mode, now they are
skipped in JDBC mode, as well.

(cherry picked from commit 9e0df64b2d36c9069dfa506570468f0522c86417)
2019-03-01 15:26:31 +02:00
David Kyle 894ecb244d
[ML-Dataframe] Move dataframe actions into core (#39548) 2019-03-01 10:45:36 +00:00
Marios Trivyzas 9fb2f670dc SQL: Enhance checks for inexact fields (#39427)
For functions: move checks for `text` fields without underlying `keyword`
fields or with many of them (ambiguity) to the type resolution stage.

For Order By/Group By: move checks to the `Verifier` to catch early
before `QueryTranslator` or execution.

Closes: #38501
Fixes: #35203
2019-03-01 10:40:57 +01:00
Albert Zaharovits 8a19d981db Integ test snapshot and restore for native realm (#39123)
This commit adds a simple integ test that exercises the flow:
* snapshot .security
* delete .security
* restore .security

, checking that the Native Realm works as expected.

Relates #34454
2019-02-28 14:41:47 +02:00
Hendrik Muhs 30e5c11cc2
[ML-DataFrame] Dataframe REST cleanups (#39451) (#39503)
fix a couple of odd behaviors of data frame transforms REST API's:

 -  check if id from body and id from URL match if both are specified
 -  do not allow a body for delete
 -  allow get and stats without specifying an id
2019-02-28 13:00:37 +01:00
Dimitris Athanasiou 8122650a55 [ML] Add integration test for interim results after advancing bucket (#39447)
This is an integration test that captures the issue described in
elastic/ml-cpp#324
2019-02-28 11:12:08 +02:00
Ioannis Kakavas 2ce9457c8f Mute Bulk indexing of monitoring data (#39448)
Relates: #30101
2019-02-28 07:40:36 +02:00
Lee Hinman ad8228aec9
Use non-ILM template setting up watch history template & ILM disabled (#39420)
Backport of #39325

When ILM is disabled and Watcher is setting up the templates and policies for
the watch history indices, it will now use a template that does not have the
`index.lifecycle.name` setting, so that indices are not created with the
setting.

This also adds tests for the behavior, and changes the cluster state used in
these tests to be real instead of mocked.

Resolves #38805
2019-02-27 11:11:19 -07:00
Jay Modi 995144b197
Fix SSLConfigurationReloaderTests failure tests (#39408)
This change fixes the tests that expect the reload of a
SSLConfiguration to fail. The tests relied on an incorrect assumption
that the reloader only called reload on for an SSLConfiguration if the
key and trust managers were successfully reloaded, but that is not the
case. This change removes the fail call with a wrapped call to the
original method and captures the exception and counts down a latch to
make these tests consistently tested.

Closes #39260
2019-02-27 09:17:09 -07:00
Alan Woodward 71b8494181
Upgrade to lucene 8.0.0-snapshot-ff9509a8df (#39444)
Backport of #39350

Contains the following:

* LUCENE-8635: Move terms dictionary off-heap for non-primary-key fields in `MMapDirectory`
* LUCENE-8292: `TermsEnum` is fully abstract
* LUCENE-8679: Return WITHIN in `EdgeTree#relateTriangle` only when polygon and triangle share one edge
* LUCENE-8676: Nori tokenizer deals correctly with large buffers
* LUCENE-8697: `GraphTokenStreamFiniteStrings` better handles side paths with gaps
* LUCENE-8664: Add `equals` and `hashCode` to `TotalHits`
* LUCENE-8660: `TopDocsCollector` returns accurate hit counts if the total equals the threshold
* LUCENE-8654: `Polygon2D#relateTriangle` fix for when the polygon is inside the triangle
* LUCENE-8645: `Intervals#fixField` can merge intervals from different fields
* LUCENE-8585: Create jump-tables for DocValues at index time
2019-02-27 14:36:08 +00:00
Marios Trivyzas a2c07b5011
SQL: Use underlying exact field for LIKE/RLIKE (#39443)
Previously, if a text field had an underlying keyword field
the latter was not used instead of the text leading to wrong
results returned by queries filtering with LIKE/RLIKE.

Fixes: #39442
2019-02-27 14:46:54 +01:00
Mehran Koushkebaghi 1d0097b5e8 [ML] Refactoring scheduled event to store instant instead of zoned time zone (#39380)
The ScheduledEvent class has never preserved the time
zone so it makes more sense for it to store the start and
end time using Instant rather than ZonedDateTime.

Closes #38620
2019-02-27 09:27:04 +00:00
Andrei Stefan 542e2c55f6 SQL: change the default precision for CURRENT_TIMESTAMP function (#39391)
(cherry picked from commit dbb93310b083226c96e4bde3eef0079eb01cbca9)
2019-02-27 09:49:42 +02:00
Andrei Stefan 4deb69e9e4 SQL: introduce the columnar option for REST requests (#39287)
* Add "columnar" option for REST requests (but be lenient for non-"plain"
modes) for json, yaml, smile and cbor formats.
* Updated documentation

(cherry picked from commit 5b7e0de237fb514d14a61a347bc669d4b4adbe56)
2019-02-27 09:37:28 +02:00
Andrei Stefan d16edf0462 Randomize the timezone for equals and hashcode tests (#39353) 2019-02-27 07:39:06 +02:00
Tim Brooks f24dae302d
Make security tests transport agnostic (#39411)
Currently there are two security tests that specifically target the
netty security transport. This PR moves the client authentication tests
into `AbstractSimpleSecurityTransportTestCase` so that the nio transport
will also be tested.

Additionally the work to build transport configurations is moved out of
the netty transport and tested independently.
2019-02-26 18:55:19 -07:00
Tim Vernum 30687cbe7f
Switch internal security index to ".security-7" (#39422)
This changes the name of the internal security index to ".security-7",
but supports indices that were upgraded from earlier versions and use
the ".security-6" name.

In all cases, both ".security-6" and ".security-7" are considered to
be restricted index names regardless of which name is actually in use
on the cluster.

Backport of: #39337
2019-02-27 12:49:44 +11:00
Yogesh Gaikwad 0c7310936b
Fixed required fields and paths list (#39358) (#39428)
Some small fix for the `x-pack` rest api spec.

* In both `security.enable_user.json` and `security.disable_user.json`
   the `username` parameter was `false` instead of `true`
   (the documentation is already correct).
* In `security.get_privileges.json` there were missing all the
   possible paths since the path parameters are not required.
   This fix aligns the document with the rest of the spec,
   where all the possible combinations are listed.
2019-02-27 12:40:15 +11:00
Gordon Brown f4c5abe4d4
Handle failure to release retention leases in ILM (#39281) (#39417)
It is possible that the Unfollow API may fail to release shard history
retention leases when unfollowing, so this needs to be handled by the
ILM Unfollow action. There's nothing much that can be done automatically
about it from the follower side, so this change makes the ILM unfollow
action simply ignore those failures.
2019-02-26 16:58:30 -07:00
Martijn van Groningen 24e478c58e
Fix test, more than one node may be connected.
Relates to #37681
2019-02-26 10:40:09 +01:00
Ioannis Kakavas 7f999c43b3
[BACKPORT-7.x] Fix TokenBackwardsCompatibility tests (#39294)
This change is a backport of  #39252

- Fixes TokenBackwardsCompatibilityIT: Existing tests seemed to made
  the assumption that in the oneThirdUpgraded stage the master node
  will be on the old version and in the twoThirdsUpgraded stage, the
  master node will be one of the upgraded ones. However, there is no
  guarantee that the master node in any of the states will or will
  not be one of the upgraded ones.
  This class now tests:
  - That we can generate and consume tokens before we start the
  rolling upgrade.
  - That we can consume tokens generated in the old cluster during
  all the stages of the rolling upgrade.
  - That while on a mixed cluster, when/if the master node is
  upgraded, we can generate, consume and refresh a token
  - That after the rolling upgrade, we can consume a token
  generated in an old cluster and can invalidate it so that it
  can't be used any more.
- Ensures that during the rolling upgrade, the upgraded nodes have
the same configuration as the old nodes. Specifically that the
file realm we use is explicitly named `file1`. This is needed
because while attempting to refresh a token in a mixed cluster
we might create a token hitting an old node and attempt to refresh
it hitting a new node. If the file realm name is not the same, the
refresh will be seen as being made by a "different" client, and
will, thus, fail.
- Renames the Authentication variable we check while refreshing a
token to be clientAuth in order to make the code more readable.

Some of the above were possibly causing the flakiness of #37379
2019-02-26 10:42:36 +02:00
Martijn van Groningen b159cc51c0
Ensure remote connection established and
clean remote connection prior to leader cluster restart

Relates to #37681
2019-02-26 09:06:30 +01:00
Nhat Nguyen e9dda75834 Enable soft-deletes by default for 7.0+ indices (#38929)
Today when users upgrade to 7.0, existing indices will automatically
switch to soft-deletes without an opt-out option. With this change, 
we only enable soft-deletes by default for new indices.

Relates #36141
2019-02-25 17:54:29 -05:00
Jason Tedor a6c0166d68
Renew retention leases while following (#39335)
This commit is the final piece of the integration of CCR with retention
leases. Namely, we periodically renew retention leases and advance the
retaining sequence number while following.
2019-02-25 17:14:19 -05:00
Lee Hinman 7b8178c839
Remove Hipchat support from Watcher (#39374)
* Remove Hipchat support from Watcher (#39199)

Hipchat has been shut down and has previously been deprecated in
Watcher (#39160), therefore we should remove support for these actions.

* Add migrate note
2019-02-25 15:08:46 -07:00
Benjamin Trent 926291aac8
[DATA-FRAME] Sort `GET` transforms and stats by ID (#39365) (#39369)
* [Data-Frame] Sort `GET` transforms and stats by ID

* removing unused import
2019-02-25 14:22:41 -06:00
Nhat Nguyen 0f29b89655 Unmute FollowerFailOverIT#testFailOverOnFollower
Relates #38633
2019-02-25 14:44:44 -05:00
Hendrik Muhs 1897883adc
[ML-DataFrame] Dataframe access headers (#39289) (#39368)
store user headers as part of the config and run transform as user
2019-02-25 19:08:26 +01:00
Benjamin Trent 3d49523726
[DATA-FRAME] adds specs and yml tests for existing endpoints (#39326) (#39363)
* [DATA-FRAME] adds specs and yml tests for existing endpoints

* removing bad URL, adding test for _all
2019-02-25 11:19:49 -06:00
Nhat Nguyen 48219112e3 Do not wait for advancement of checkpoint in recovery (#39006)
With this change, we won't wait for the local checkpoint to advance to
the max_seq_no before starting phase2 of peer-recovery. We also remove
the sequence number range check in peer-recovery. We can safely do these
thanks to Yannick's finding.

The replication group to be used is currently sampled after indexing
into the primary (see `ReplicationOperation` class). This means that
when initiating tracking of a new replica, we have to consider the
following two cases:

- There are operations for which the replication group has not been
sampled yet. As we initiated the new replica as tracking, we know that
those operations will be replicated to the new replica and follow the
typical replication group semantics (e.g. marked as stale when
unavailable).

- There are operations for which the replication group has already been
sampled. These operations will not be sent to the new replica.  However,
we know that those operations are already indexed into Lucene and the
translog on the primary, as the sampling is happening after that. This
means that by taking a snapshot of Lucene or the translog, we will be
getting those ops as well. What we cannot guarantee anymore is that all
ops up to `endingSeqNo` are available in the snapshot (i.e.  also see
comment in `RecoverySourceHandler` saying `We need to wait for all
operations up to the current max to complete, otherwise we can not
guarantee that all operations in the required range will be available
for replaying from the translog of the source.`). This is not needed,
though, as we can no longer guarantee that max seq no == local
checkpoint.

Relates #39000
Closes #38949

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2019-02-25 12:10:14 -05:00
Martijn van Groningen 6f69ef165b
Protect against the leader index being removed (#39351)
when dealing with TimeoutException

The `IndexFollowingIT#testDeleteLeaderIndex()`` test failed,
because a NPE was captured as fatal error instead of an IndexNotFoundException.

Closes #39308
2019-02-25 13:40:10 +01:00
Martijn van Groningen 9bf0538878
Wait for index following is active for auto followed index (#39175)
before executing pause follow api:

https://github.com/elastic/elasticsearch/issues/39126#issuecomment-465512002

Closes #39126
2019-02-25 10:44:20 +01:00
Jason Tedor 6e06f82106
Fix failing CCR retention lease test
Finally! This commit should fix the issues with the CCR retention lease
that has been plaguing build failures. The issue here is that we are
trying to prevent the clear session requests from being executed until
after we have been able to validate that retention leases are being
renewed. However, we were only blocking the clear session requests but
not blocking them when they are proxied through another node. This
commit addresses that.

Relates #39268
2019-02-22 20:43:39 -05:00
Jason Tedor 2d4c98a991
Change sort order of shard stats in CCR test
This commit changes the sort order of shard stats that are collected in
CCR retention lease integration tests. This change is done so that
primaries appear first in sort order.
2019-02-22 18:17:28 -05:00
Jason Tedor e569cf8324
Address failing CCR retention lease test
This test fails rarely but it is flaky in its current form. The problem
here is that we lack a guarantee on the retention leases having been
synced to all shard copies. We need to sleep long enough to ensure that
that occurs, and then we can sample the retention leases, possibly sleep
again (we usually will not have too since the first sleep will have been
long enough to allow a sync and a renewal to happen, if one was going to
happen), and the sample the retention leases for comparison.

Closes #39331
2019-02-22 18:15:10 -05:00
Jason Tedor e4e96b8181
Fix shard logged in background lease renewal
The shard logged here is the leader shard but it should be the follower
shard since this background retention lease renewal is happening on the
follower side. This commit fixes that.
2019-02-22 17:32:51 -05:00
Jason Tedor feb25c71a0
Simplify mocking in CCR retention lease tests
This commit simplifies the use of transport mocking in the CCR retention
lease integration tests. Instead of adding a send rule between nodes, we
add a default send rule. This greatly simplifies the code here, and
speeds the test up a little bit too.
2019-02-22 17:24:12 -05:00
Tim Brooks 931953a3ee
Ensure index commit released when testing timeouts (#39273)
This fixes #39245. Currently it is possible in this test that the clear
session call times-out. This means that the index commit will not be
released and there will be an assertion triggered in the test teardown.
This commit ensures that we wipe the leader index in the test to avoid
this assertion.

It is okay if the clear session call times-out in normal usage. This
scenario is unavoidable due to potential network issues. We have a local
timeout on the leader to clean it up when this scenario happens.
2019-02-22 11:14:42 -07:00
Benjamin Trent 3262d6c917
[ML-DataFrame] Add _preview endpoint (#38924) (#39319)
* [DATA-FRAME] add preview endpoint

* adjusting preview tests and fixing parser

* adjusing preview transport

* remove unused import

* adjusting test

* Addressing PR comments

* Fixing failing test and adjusting for pr comments

* fixing integration test
2019-02-22 10:55:38 -06:00
David Roberts 4f2bd238d2 [ML] Increase datafeed integration test timeout for slow machines (#39311)
The assertBusy() that waits the default 10 seconds for a
datafeed to complete very occasionally times out on slow
machines.  This commit increases the timeout to 60 seconds.
It will almost never actually take this long, but it's
better to have a timeout that will prevent time being
wasted looking at spurious test failures.
2019-02-22 15:35:32 +00:00
Gordon Brown 2ad1e6aedc
Fix testCannotShrinkLeaderIndex (#38529)
This test should no longer pass when the functionality it is intended to
test is broken, as it now indexes a number of documents and verifies
that the index is staying on the same step until after indexing and
replication of those documents is finished. This prevents the test from
passing if the leader index progresses in its lifecycle during that time.
2019-02-22 08:03:36 -07:00
Dimitris Athanasiou 1c6818fe74
[ML] Improve DeleteExpiredDataIT failure message (#39298) (#39310)
This test failed once in a very long time with the assertion
that there is no document for the `non_existing_job` in the
state index. I could not see how that is possible and I cannot
reproduce. With this commit the failure message will reveal
some examples of the left behind docs which might shed a light
about what could go wrong.
2019-02-22 16:15:11 +02:00
Daniel Mitterdorfer 9fea21aca5
Remove ExceptionsHelper#detailedMessage in tests (#37921) (#39297)
With this commit we remove all usages of the deprecated method
`ExceptionsHelper#detailedMessage` in tests. We do not address
production code here but rather in dedicated follow-up PRs to keep the
individual changes manageable.

Relates #19069
2019-02-22 14:03:29 +01:00
Lee Hinman 3401afdf35 Add ILM plugin for MonitoringIT tests (#39271)
Without this, when creating the watch history indices they complain about there
being no such setting as `index.lifecycle.name`.

Relates to #38805
2019-02-21 21:45:43 -07:00
Julie Tibshirani 29243f7001 Avoid using TimeWarp in TransformIntegrationTests. (#39277)
This commit makes `TransformIntegrationTests` into a standard integration test, as
opposed to using `TimeWarp`, which registers the mock component
`ScheduleEngineTriggerMock` to trigger watches.

The simplification may help with flakiness we've observed `TimeWarp, as in #37882.
2019-02-21 18:02:44 -08:00
Jay Modi 697911c31d
Fixed missed stopping of SchedulerEngine (#39193)
The SchedulerEngine is used in several places in our code and not all
of these usages properly stopped the SchedulerEngine, which could lead
to test failures due to leaked threads from the SchedulerEngine. This
change adds stopping to these usages in order to avoid the thread leaks
that cause CI failures and noise.

Closes #38875
2019-02-21 14:31:33 -07:00
Tim Brooks 44df76251f
Rebuild remote connections on profile changes (#39146)
Currently remote compression and ping schedule settings are dynamic.
However, we do not listen for changes. This commit adds listeners for
changes to those two settings. Additionally, when those settings change
we now close existing connections and open new ones with the settings
applied.

Fixes #37201.
2019-02-21 14:00:39 -07:00
Benjamin Trent 34d06471c3
[CI] Mute CcrRetentionLeaseIT.testRetentionLeaseIsRenewedDuringRecovery (#39270) 2019-02-21 14:17:03 -06:00
Benjamin Trent 8072543428
Muting AutoFollowIT.testAutoFollowManyIndices (#39265) 2019-02-21 13:43:09 -06:00
Jason Tedor b9f8be6968
Clarify the use of sleep in CCR test
Sleeps in tests smell funny, and we try to avoid them to the extent
possible. We are using a small one in a CCR test. This commit clarifies
the purpose of that sleep by adding a comment explaining it. We also
removed a hard-coded value from the test, that if we ever modified the
value higher up where it was set, we could end up forgetting to change
the value here. Now we ensure that these would move in lock step if we
ever maintain them later.
2019-02-21 14:05:48 -05:00
Jason Tedor 719c38a36d
Fix CCR tests that manipulate transport requests
We have some CCR tests where we use mock transport send rules to control
the behavior that we desire in these tests. Namely, we want to simulate
an exception being thrown on the leader side, or a variety of other
situations. These send rules were put in place between the data nodes on
each side. However, it might not be the case that these requests are
being sent between data nodes. For example, a request that is handled on
a non-data master node would not be sent from a data node. And it might
not be the case that the request is sent to a data node, as it could be
proxied through a non-data coordinating node. This commit addresses this
by putting these send rules in places between all nodes on each side.

Closes #39011
Closes #39201
2019-02-21 12:26:09 -05:00
Tanguy Leroux fc896e452c
ReadOnlyEngine should update translog recovery state information (#39238) (#39251)
`ReadOnlyEngine` never recovers operations from translog and never 
updates translog information in the index shard's recovery state, even 
though the recovery state goes through the `TRANSLOG` stage during 
the recovery. It means that recovery information for frozen shards indicates 
an unkown number of recovered translog ops in the Recovery APIs 
(translog_ops: `-1` and translog_ops_percent: `-1.0%`) and this is confusing.

This commit changes the `recoverFromTranslog()` method in `ReadOnlyEngine` 
so that it always recover from an empty translog snapshot, allowing the recovery 
state translog information to be correctly updated.

Related to #33888
2019-02-21 18:08:06 +01:00
Martijn van Groningen f40139c403
Change ShardFollowTask to reuse common serialization logic (#39094)
Initially in #38910, ShardFollowTask was reusing ImmutableFollowParameters'
serialization logic. After merging, bwc tests failed sometimes and
the binary serialization that ShardFollowTask was originally was using
was added back. ImmutableFollowParameters is using optional fields (optional vint)
while ShardFollowTask was not (vint).
2019-02-21 09:32:33 +01:00
Nhat Nguyen a96df5d209 Reduce refresh when lookup term in FollowingEngine (#39184)
Today we always refresh when looking up the primary term in
FollowingEngine. This is not necessary for we can simply
return none for operations before the global checkpoint.
2019-02-20 19:21:00 -05:00
Nhat Nguyen cdec11c4eb Relax history check in ShardFollowTaskReplicationTests (#39162)
The follower won't always have the same history as the leader for its
soft-deletes retention can be different. However, if some operation
exists on the history of the follower, then the same operation must
exist on the leader. This change relaxes the history check in
ShardFollowTaskReplicationTests.

Closes #39093
2019-02-20 19:21:00 -05:00
Nhat Nguyen 820ba8169e Add retention leases replication tests (#38857)
This commit introduces the retention leases to ESIndexLevelReplicationTestCase,
then adds some tests verifying that the retention leases replication works
correctly in spite of the presence of the primary failover or out of order
delivery of retention leases sync requests.

Relates #37165
2019-02-20 19:21:00 -05:00
Mark Vieira 24ac9da276
Mute CCR retention test that is consistently failing locally and in CI 2019-02-20 11:57:46 -08:00
Jay Modi af451459a5
Fix failures in SessionFactoryLoadBalancingTests (#39154)
This change aims to fix failures in the session factory load balancing
tests that mock failure scenarios. For these tests, we randomly shut
down ldap servers and bind a client socket to the port they were
listening on. Unfortunately, we would occasionally encounter failures
in these tests where a socket was already in use and/or the port
we expected to connect to was wrong and in fact was to one of the ldap
instances that should have been shut down.

The failures are caused by the behavior of certain operating systems
when it comes to binding ports and wildcard addresses. It is possible
for a separate application to be bound to a wildcard address and still
allow our code to bind to that port on a specific address. So when we
close the server socket and open the client socket, we are still able
to establish a connection since the other application is already
listening on that port on a wildcard address. Another variant is that
the os will allow a wildcard bind of a server socket when there is
already an application listening on that port for a specific address.

In order to do our best to prevent failures in these scenarios, this
change does the following:

1. Binds a client socket to all addresses in an awaitBusy
2. Adds assumption that we could bind all valid addresses
3. In the case that we still establish a connection to an address that
   we should not be able to, try to bind and expect a failure of not
   being connected

Closes #32190
2019-02-20 11:38:26 -07:00
Jason Tedor 90b1b36f50
Add cleanup logic to CCR retention lease test
This commit adds some logic to remove the mock transport rules at the
end of a CCR retention lease test.
2019-02-20 13:20:07 -05:00
Jason Tedor cfd7c77b64
Fix broken CCR retention lease unfollow test
This commit fixes a broken CCR retention lease unfollow test. The
problem with the test is that the random subset of shards that we picked
to disrupt would not necessarily overlap with the actual shards in
use. We could take a non-empty subset of [0, 3] (e.g., { 2 }) when the
only shard IDs in use were [0, 1]. This commit fixes this by taking into
account the number of shards in use in the test.

With this change, we also take measure to ensure that a successful
branch is tested more frequently than would otherwise be the case. On
that branch, we want to sometimes pretend that the retention lease is
already removed. The randomness here was also sometimes selecting a
subset of shards that did not overlap with the shards actually in use
during the test. While this does not break the test, it is confusing and
reduces the amount of coverage of that branch.

Relates #39185
2019-02-20 12:09:28 -05:00
Albert Zaharovits af8ef1bb98 Do not create the missing index when invoking getRole (#39039)
In most of the places we avoid creating the `.security` index (or updating the mapping)
for read/search operations. This is more of a nit for the case of the getRole call,
that fixes a possible mapping update during a get role, and removes a dead if branch
about creating the `.security` index.
2019-02-20 17:33:10 +02:00