Commit Graph

3633 Commits

Author SHA1 Message Date
Alexander Reelsen 4ecf234617 Upgrade to joda 2.10.4 (#47805) 2019-10-31 14:49:50 +01:00
emasab 185e067442 SQL: Failing Group By queries due to different ExpressionIds (#43072)
Fix an issue that arises from the use of ExpressionIds as keys in a lookup map
that helps the QueryTranslator to identify the grouping columns. The issue is
that the same expression in different parts of the query (SELECT clause and GROUP BY clause)
ends up with different ExpressionIds so the lookup fails. So, instead of ExpressionIds
use the hashCode() of NamedExpression.

Fixes: #41159
Fixes: #40001
Fixes: #40240
Fixes: #33361
Fixes: #46316
Fixes: #36074
Fixes: #34543
Fixes: #37044

Fixes: #42041
(cherry picked from commit 3c38ea555984fcd2c6bf9e39d0f47a01b09e7c48)
2019-10-31 14:49:16 +01:00
Martijn van Groningen c358ecb5fb
Don't preserve indices between enrich qa tests.
This was added because it was suspected to cause the monitoring
enrich verification to fail, but that is not the case.

See #48258
2019-10-31 14:23:56 +01:00
Andrei Dan ffe5d5417f
ILM Make the `check-rollover-ready` step retryable (#48256) (#48740)
This adds the infrastructure to be able to retry the execution of retryable
steps and makes the `check-rollover-ready` retryable as an initial step to
make the rollover action more resilient to transient errors.

(cherry picked from commit 454020ac8acb147eae97acb4ccd6fb470d1e5f48)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2019-10-31 11:28:55 +00:00
David Roberts c3063c4e1f [ML] Make the URL of the ML C++ Ivy repo configurable (#48702)
At present the ML C++ artifact is always downloaded from
S3.  This change adds an option to configure the location.

(The intention is to use a file:/// URL to pick up the
artifact built in a Docker container in ml-cpp PR builds
so that C++ changes that will break Java integration tests
can be detected before the ml-cpp PRs are merged.)

Relates elastic/ml-cpp#766
2019-10-31 09:21:44 +00:00
Dimitris Athanasiou 919596b2e8
[7.x][ML] Move field extraction logic to its own package (#48709) (#48712)
Moves common field extraction logic to its own package so that it can
be used both for anomaly detection and data frame analytics.

In preparation for refactoring extraction fields to be simpler and to
support multi-fields properly.

Backport of #48709
2019-10-31 02:41:00 +02:00
Yogesh Gaikwad c7342dde29
Fix to release system resource after reading JKWSet file (#48666) (#48677)
When we load a JSON Web Key (JWKSet) from the specified
file using JWKSet.load it internally uses IOUtils.readFileToString
but the opened FileInputStream is never closed after usage.
https://bitbucket.org/connect2id/nimbus-jose-jwt/issues/342

This commit reads the file and parses the JWKSet from the string.

This also fixes an issue wherein if the underlying file changed,
for every change event it would add another file watcher. The
change is to only add the file watcher at the start.

Closes #44942
2019-10-31 10:16:33 +11:00
Lee Hinman 2d5291cf3b Un-AwaitsFix and enhance logging for testPolicyCRUD (#48719)
* Un-AwaitsFix and enhance logging for testPolicyCRUD

This removes the `AwaitsFix` and increases the test logging for
`SnapshotLifecycleServiceTests.testPolicyCRUD` in an effort to track
down the cause of #44997.

* Remove unused import
2019-10-30 17:02:57 -06:00
Julie Tibshirani ae1ef5fd92 Refactor unit tests for vector functions. (#48662)
This PR performs the following changes:
* Split `ScoreScriptUtilsTests` into `DenseVectorFunctionTests` and
`SparseVectorFunctionTests`. This will make it easier to delete all sparse
vector function tests once we remove support on 8.x.
* As much as possible, break up the large test methods into individual tests
for each vector function (`cosineSimilarity`, `l2norm`, etc.).
2019-10-30 15:36:06 -07:00
Lee Hinman ed2bb73de2 Fix SnapshotLifecycleService logger (#48711)
The logger was erroneously using the `SnapshotLifecycleMetadata` class
for its initialization, making it hard to target packages for logging
levels since `SnapshotLifecycleMetadata` is in a different package.
2019-10-30 13:13:50 -06:00
Benjamin Trent c9ead80c31
[7.x] [ML][Inference] separating definition and config object storage (#48651) (#48695)
* [ML][Inference] separating definition and config object storage (#48651)

This separates out the `definition` object from being stored within the configuration object in the index. 

This allows us to gather the config object without decompressing a potentially large definition.

Additionally, `input` is moved to the TrainedModelConfig object and out of the definition. This is so the trained input fields are accessible outside the potentially large model definition.
2019-10-30 13:27:29 -04:00
Lee Hinman 72a601c47f
[7.x] Don't schedule SLM jobs when services have been stopped… (#48692)
This adds a guard for the SLM lifecycle and retention service that
prevents new jobs from being scheduled once the service has been
stopped. Previous if the node were shut down the service would be
stopped, but a cluster state or local master election would cause a job
to attempt to be scheduled. This could lead to an uncaught
`RejectedExecutionException`.

Resolves #47749
2019-10-30 09:46:35 -06:00
Armin Braun 52e5ceb321
Restore from Individual Shard Snapshot Files in Parallel (#48110) (#48686)
Make restoring shard snapshots run in parallel on the `SNAPSHOT` thread-pool.
2019-10-30 14:36:30 +01:00
Yogesh Gaikwad 9ed7352a12
Add Sysprop to Adjust IO Buffer Size (#48267) (#48667)
The 1MB IO-buffer size per transport thread is causing trouble in
some tests, albeit at a low rate. Reducing the number of transport
threads was not enough to fully fix this situation.
Allowing to configure the size of the buffer and reducing it by
more than an order of magnitude should fix these tests.

Closes #46803
2019-10-30 14:19:54 +11:00
Yogesh Gaikwad 1b64c1992a Add owner flag parameter to the rest spec (#48500)
This commit adds missing info about newly added
`owner` flag to the rest spec, also adds a rest test
for the same.

Closes#48499
2019-10-30 13:07:01 +11:00
Julie Tibshirani 89c65752dc
Update the signature of vector script functions. (#48653)
Previously the functions accepted a doc values reference, whereas they now
accept the name of the vector field. Here's an example of how a vector function
was called before and after the change.

```
Before: cosineSimilarity(params.query_vector, doc['field'])
After:  cosineSimilarity(params.query_vector, 'field')
```

This seems more intuitive, since we don't allow direct access to vector doc
values and the the meaning of `doc['field']` is unclear.

The PR makes the following changes (broken into distinct commits):
* Add new function signatures of the form `function(params.query_vector,
'field')` and deprecates the old ones. Because Painless doesn't allow two
methods with the same name and number of arguments, we allow a generic `Object`
to be passed in to the function and decide on the behavior through an
`instanceof` check.
* Refactor the class bindings so that the document field is passed to the
constructor instead of the instance method. This allows us to avoid retrieving
the vector doc values on every function invocation, which gives a tiny speed-up
in benchmarks.

Note that this PR adds new signatures for the sparse vector functions too, even
though sparse vectors are deprecated. It seemed simplest to understand (for both
us and users) to keep everything symmetric between dense and sparse vectors.
2019-10-29 15:46:05 -07:00
Gordon Brown 25724c5c46
Adjust date parsing in ILM integration tests (#48648)
The format returned by the API is not always parsable with
`Instant.parse()`, so this commit adjusts to parsing those dates as
`ISO_ZONED_DATE_TIME` instead, which appears to always parse the
returned value correctly.
2019-10-29 15:44:04 -07:00
Gordon Brown 50d7424e7d
Unmute and increase logging on flaky SLM tests (#48612)
The failures in these tests have been remarkably difficult to track
down, in part because they will not reproduce locally. This commit
unmutes the flaky tests and increases logging, as well as introducing
some additional logging, to attempt to pin down the failures.
2019-10-29 13:39:19 -07:00
Andrei Dan 8b22e297ed
ILM open/close steps are noop if idx is open/close (#48614) (#48640)
The open and close follower steps didn't check if the index is open,
closed respectively, before executing the open/close request.
This changes the steps to check the index state and only perform the
open/close operation if the index is not already open/closed.
2019-10-29 17:43:56 +00:00
Gordon Brown cf235796c0
Use more reliable "never run" cron pattern in tests (#48608)
The cron schedule "1 2 3 4 5 ?" will run every May 4 at 03:02:01, which
may result in unnecessary test failures once a year. This commit
switches out uses of that schedule in tests for one which will never
execute (because it specifies a day which doesn't exist, Feb. 31). Also
factors the schedule out to a constant to make the intent clearer.
2019-10-29 09:33:14 -07:00
Przemysław Witek 7c944d26c5
[7.x] Assert that the results of classification analysis can be evaluated using _evaluate API. (#48626) (#48634) 2019-10-29 16:20:56 +01:00
Ioannis Kakavas a0362153e2
Update oauth2-oidc-sdk and nimbus-jose-jwt (#48537) (#48628)
Update two dependencies for our OpenID Connect realm implementation
to their latest versions
2019-10-29 14:18:59 +02:00
Mark Vieira e5c6440a4f
Simplify usage of Gradle Shadow plugin (#48478) (#48597)
This commit simplifies and standardizes our usage of the Gradle Shadow
plugin to conform more to plugin conventions. The custom "bundle" plugin
has been removed as it's not necessary and performs the same function
as the Shadow plugin's default behavior with existing configurations.

Additionally, this removes unnecessary creation of a "nodeps" artifact,
which is unnecessary because by default project dependencies will in
fact use the non-shadowed JAR unless explicitly depending on the
"shadow" configuration.

Finally, we've cleaned up the logic used for unit testing, so we are
now correctly testing against the shadow JAR when the plugin is applied.
This better represents a real-world scenario for consumers and provides
better test coverage for incorrectly declared dependencies.

(cherry picked from commit 3698131109c7e78bdd3a3340707e1c7b4740d310)
2019-10-28 12:11:55 -07:00
Benjamin Trent 6ea59dd428
[ML][Transforms] add wait_for_checkpoint flag to stop (#47935) (#48591)
Adds `wait_for_checkpoint` for `_stop` API.
2019-10-28 13:02:57 -04:00
Gordon Brown 5021410165
Retry on RepositoryException in SLM tests (#48548)
Due to a bug, GETing a snapshot can cause a RespositoryException to be
thrown. This error is transient and should be retried, rather than
causing the test to fail. This commit converts those
RepositoryExceptions into AssertionErrors so that they will be retried
in code wrapped in assertBusy.
2019-10-28 09:24:38 -07:00
Gordon Brown c353ad71fe
Wrap ResponseException in AssertionError in ILM/CCR tests (#48489)
When checking for the existence of a document in the ILM/CCR integration
tests, `assertDocumentExists` makes an HTTP request and checks the
response code. However, if the repsonse code is not successful, the call
will throw a `ResponseException`. `assertDocumentExists` is often called
inside an `assertBusy`, and wrapping the `ResponseException` in an
`AssertionError` will allow the `assertBusy` to retry.

In particular, this fixes an issue with `testCCRUnfollowDuringSnapshot`
where the index in question may still be closed when the document is
requested.
2019-10-28 07:37:52 -07:00
Marios Trivyzas 124f6d098b
SQL: [Tests] Renable CliSecurityIT (#48581)
Seems that the issue has been fixed with: #48098

Closes: #48117
(cherry picked from commit 470362361ffce794a6a12ce7a81a8029ec7d54de)
2019-10-28 15:08:38 +01:00
Przemysław Witek 7e30277a37
Mute RegressionIT.testStopAndRestart (#48575) (#48576) 2019-10-28 13:08:11 +01:00
Rory Hunter 30389c6660
Improve SAML tests resiliency to auto-formatting (#48517)
Backport of #48452.

The SAML tests have large XML documents within which various parameters
are replaced. At present, if these test are auto-formatted, the XML
documents get strung out over many, many lines, and are basically
illegible.

Fix this by using named placeholders for variables, and indent the
multiline XML documents.

The tests in `SamlSpMetadataBuilderTests` deserve a special mention,
because they include a number of certificates in Base64. I extracted
these into variables, for additional legibility.
2019-10-27 16:06:23 +00:00
Jim Ferenczi 7fc413c22c Resolve the role query and the number of docs lazily (#48036)
This commit ensures that the creation of a DocumentSubsetReader does not
eagerly resolve the role query and the number of docs that match.
We want to delay this expensive operation in order to ensure that we really
need this information when we build it. For this reason the role query and the
number of docs are now resolved on demand. This commit also depends on
https://issues.apache.org/jira/browse/LUCENE-9003 that will also compute the global
number of docs lazily.
2019-10-25 18:12:29 +02:00
Tim Brooks f5f1072824
Multiple remote connection strategy support (#48496)
* Extract remote "sniffing" to connection strategy (#47253)

Currently the connection strategy used by the remote cluster service is
implemented as a multi-step sniffing process in the
RemoteClusterConnection. We intend to introduce a new connection strategy
that will operate in a different manner. This commit extracts the
sniffing logic to a dedicated strategy class. Additionally, it implements
dedicated tests for this class.

Additionally, in previous commits we moved away from a world where the
remote cluster connection was mutable. Instead, when setting updates are
made, the connection is torn down and rebuilt. We still had methods and
tests hanging around for the mutable behavior. This commit removes those.

* Introduce simple remote connection strategy (#47480)

This commit introduces a simple remote connection strategy which will
open remote connections to a configurable list of user supplied
addresses. These addresses can be remote Elasticsearch nodes or
intermediate proxies. We will perform normal clustername and version
validation, but otherwise rely on the remote cluster to route requests
to the appropriate remote node.

* Make remote setting updates support diff strategies (#47891)

Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.

As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.

* Make remote setting updates support diff strategies (#47891)

Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.

As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.
2019-10-25 09:29:41 -06:00
Russ Cam b24bbd4296 Change policy_id to list type in slm.get_lifecycle (#47766)
This commit changes the REST API spec slm.get_lifecycle's policy_id url part to be of type "list", in line with other REST API specs that accept a comma-separated list of values.

Closes #47765
2019-10-25 09:04:25 +10:00
Tim Brooks c0b545f325
Make BytesReference an interface (#48486)
BytesReference is currently an abstract class which is extended by
various implementations. This makes it very difficult to use the
delegation pattern. The implication of this is that our releasable
BytesReference is a PagedBytesReference type and cannot be used as a
generic releasable bytes reference that delegates to any reference type.
This commit makes BytesReference an interface and introduces an
AbstractBytesReference for common functionality.
2019-10-24 15:39:30 -06:00
Michael Basnight d49958cef3 Remove deprecated test from the HLRC tests (#48424)
The AbstractHlrcWriteableXContentTestCase was replaced by a better test
case a while ago, and this is the last two instances using it. They have
been converted and the test is now deleted.

Ref #39745
2019-10-24 14:02:04 -05:00
Martijn van Groningen b034153df7
Change grok watch dog to be Matcher based instead of thread based. (#48346)
There is a watchdog in order to avoid long running (and expensive)
grok expressions. Currently the watchdog is thread based, threads
that run grok expressions are registered and after completion unregister.
If these threads stay registered for too long then the watch dog interrupts
these threads. Joni (the library that powers grok expressions) has a
mechanism that checks whether the current thread is interrupted and
if so abort the pattern matching.

Newer versions have an additional method to abort long running pattern
matching inside joni. Instead of checking the thread's interrupted flag,
joni now also checks a volatile field that can be set via a `Matcher`
instance. This is more efficient method for aborting long running matches.
(joni checks each 30k iterations whether interrupted flag is set vs.
just checking a volatile field)

Recently we upgraded to a recent joni version (#47374), and this PR
is a followup of that PR.

This change should also fix #43673, since it appears when unit tests
are ran the a test runner thread's interrupted flag may already have
been set, due to some thread reuse.
2019-10-24 15:34:01 +02:00
Dimitrios Liappis fc1b4ad23c
Mute testCCRUnfollowDuringSnapshot (#48464)
tracked in #48461
backport of #48462
2019-10-24 15:52:56 +03:00
Przemysław Witek 149537a165
Assert that inference model has been persisted (#48332) (#48453) 2019-10-24 14:18:43 +02:00
Dimitrios Liappis 4d0fb6e551
Mute testBasicTimeBasedRetenion (#48458)
tracked in #48017
backport of #48456
2019-10-24 14:53:12 +03:00
Hendrik Muhs ba1c13c47d [Transform] do not fail checkpoint creation due to global checkpoint mismatch (#48423)
Take the max if global checkpoints mismatch instead of throwing an exception. It turned out global
checkpoints can mismatch by design

fixes #48379
2019-10-24 12:22:07 +02:00
Ioannis Kakavas c6b733f1b4
Add populate_user_metadata in OIDC realm (#48357) (#48438)
Make populate_user_metadata configuration parameter
available in the OpenID Connect authentication realm

Resolves: #48217
2019-10-24 09:51:08 +03:00
Martijn van Groningen 05324b7f03
Muted verifying monitoring integration in enrich integration test.
Relates to #48258
2019-10-24 08:39:53 +02:00
Julie Tibshirani 2664cbd20b
Deprecate the sparse_vector field type. (#48368)
We have not seen much adoption of this experimental field type, and don't see a
clear use case as it's currently designed. This PR deprecates the field type in
7.x. It will be removed from 8.0 in a follow-up PR.
2019-10-23 16:35:03 -07:00
Igor Motov 8163e0a9e5 Mute XPackRestIT security/authz/14_cat_indices
Mutes "Test empty request while single authorized closed index"

Tracked by #47875
2019-10-23 14:17:44 -04:00
Jake Landis cf175da5a9
Ensure SLM stats does not block an in-place upgrade from 7.4 (… (#48411)
7.5+ for SLM requires [stats] object to exist in the cluster state.
When doing an in-place upgrade from 7.4 to 7.5+ [stats] does not exist
in cluster state, result in an exception on startup [1].

This commit moves the [stats] to be an optional object in the parser
and if not found will default to an empty stats object.

[1] Caused by: java.lang.IllegalArgumentException: Required [stats]
2019-10-23 11:21:39 -05:00
Przemyslaw Gomulka aaa6209be6
[7.x] [Java.time] Calculate week of a year with ISO rules BACKPORT(#48209) (#48349)
Reverting the change introducing IsoLocal.ROOT and introducing IsoCalendarDataProvider that defaults start of the week to Monday and requires minimum 4 days in first week of a year. This extension is using java SPI mechanism and defaults for Locale.ROOT only.
It require jvm property java.locale.providers to be set with SPI,COMPAT

closes #41670
backport #48209
2019-10-23 17:39:38 +02:00
Ioannis Kakavas 834f2b4546
Add brackets where necessary in error messages (#48140) (#48386)
This commit attempts to help error readability by adding brackets
where applicable/missing in saml errors.
2019-10-23 17:23:50 +03:00
Armin Braun 7215201406
Track Shard-Snapshot Index Generation at Repository Root (#48371)
This change adds a new field `"shards"` to `RepositoryData` that contains a mapping of `IndexId` to a `String[]`. This string array can be accessed by shard id to get the generation of a shard's shard folder (i.e. the `N` in the name of the currently valid `/indices/${indexId}/${shardId}/index-${N}` for the shard in question).

This allows for creating a new snapshot in the shard without doing any LIST operations on the shard's folder. In the case of AWS S3, this saves about 1/3 of the cost for updating an empty shard (see #45736) and removes one out of two remaining potential issues with eventually consistent blob stores (see #38941 ... now only the root `index-${N}` is determined by listing).

Also and equally if not more important, a number of possible failure modes on eventually consistent blob stores like AWS S3 are eliminated by moving all delete operations to the `master` node and moving from incremental naming of shard level index-N to uuid suffixes for these blobs.

This change moves the deleting of the previous shard level `index-${uuid}` blob to the master node instead of the data node allowing for a safe and consistent update of the shard's generation in the `RepositoryData` by first updating `RepositoryData` and then deleting the now unreferenced `index-${newUUID}` blob.
__No deletes are executed on the data nodes at all for any operation with this change.__

Note also: Previous issues with hanging data nodes interfering with master nodes are completely impossible, even on S3 (see next section for details).

This change changes the naming of the shard level `index-${N}` blobs to a uuid suffix `index-${UUID}`. The reason for this is the fact that writing a new shard-level `index-` generation blob is not atomic anymore in its effect. Not only does the blob have to be written to have an effect, it must also be referenced by the root level `index-N` (`RepositoryData`) to become an effective part of the snapshot repository.
This leads to a problem if we were to use incrementing names like we did before. If a blob `index-${N+1}` is written but due to the node/network/cluster/... crashes the root level `RepositoryData` has not been updated then a future operation will determine the shard's generation to be `N` and try to write a new `index-${N+1}` to the already existing path. Updates like that are problematic on S3 for consistency reasons, but also create numerous issues when thinking about stuck data nodes.
Previously stuck data nodes that were tasked to write `index-${N+1}` but got stuck and tried to do so after some other node had already written `index-${N+1}` were prevented form doing so (except for on S3) by us not allowing overwrites for that blob and thus no corruption could occur.
Were we to continue using incrementing names, we could not do this. The stuck node scenario would either allow for overwriting the `N+1` generation or force us to continue using a `LIST` operation to figure out the next `N` (which would make this change pointless).
With uuid naming and moving all deletes to `master` this becomes a non-issue. Data nodes write updated shard generation `index-${uuid}` and `master` makes those `index-${uuid}` part of the `RepositoryData` that it deems correct and cleans up all those `index-` that are unused.

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
2019-10-23 10:58:26 +01:00
Przemysław Witek 60d8ecb2b7
Mute ClassificationIT tests (#48338) (#48339) 2019-10-22 12:45:50 +02:00
Ioannis Kakavas 24e43dfa34
[7.x] Refactor FIPS BootstrapChecks to simple checks (#47499) (#48333)
FIPS 140 bootstrap checks should not be bootstrap checks as they
are always enforced. This commit moves the validation logic within
the security plugin.
The FIPS140SecureSettingsBootstrapCheck was not applicable as the
keystore was being loaded on init, before the Bootstrap checks
were checked, so an elasticsearch keystore of version < 3 would
cause the node to fail in a FIPS 140 JVM before the bootstrap check
kicked in, and as such hasn't been migrated.

Resolves: #34772
2019-10-22 12:49:01 +03:00
Andrei Stefan 3233b59b68 Add "format" to "range" queries resulted from optimizing a logical AND (#48073)
(cherry picked from commit 020939a9bd5b34c6d540faa8b3a67b740d661be3)
2019-10-22 10:17:37 +03:00