OpenSearch

Commit Graph

Author	SHA1	Message	Date
Gordon Brown	25724c5c46	Adjust date parsing in ILM integration tests (#48648 ) The format returned by the API is not always parsable with `Instant.parse()`, so this commit adjusts to parsing those dates as `ISO_ZONED_DATE_TIME` instead, which appears to always parse the returned value correctly.	2019-10-29 15:44:04 -07:00
Gordon Brown	50d7424e7d	Unmute and increase logging on flaky SLM tests (#48612 ) The failures in these tests have been remarkably difficult to track down, in part because they will not reproduce locally. This commit unmutes the flaky tests and increases logging, as well as introducing some additional logging, to attempt to pin down the failures.	2019-10-29 13:39:19 -07:00
Andrei Dan	8b22e297ed	ILM open/close steps are noop if idx is open/close (#48614 ) (#48640 ) The open and close follower steps didn't check if the index is open, closed respectively, before executing the open/close request. This changes the steps to check the index state and only perform the open/close operation if the index is not already open/closed.	2019-10-29 17:43:56 +00:00
Lisa Cawley	be9df101bf	[DOCS] Adds missing references to oidc realms (#48224 )	2019-10-29 09:41:34 -07:00
Gordon Brown	cf235796c0	Use more reliable "never run" cron pattern in tests (#48608 ) The cron schedule "1 2 3 4 5 ?" will run every May 4 at 03:02:01, which may result in unnecessary test failures once a year. This commit switches out uses of that schedule in tests for one which will never execute (because it specifies a day which doesn't exist, Feb. 31). Also factors the schedule out to a constant to make the intent clearer.	2019-10-29 09:33:14 -07:00
Przemysław Witek	7c944d26c5	[7.x] Assert that the results of classification analysis can be evaluated using _evaluate API. (#48626 ) (#48634 )	2019-10-29 16:20:56 +01:00
Ioannis Kakavas	a0362153e2	Update oauth2-oidc-sdk and nimbus-jose-jwt (#48537 ) (#48628 ) Update two dependencies for our OpenID Connect realm implementation to their latest versions	2019-10-29 14:18:59 +02:00
Yannick Welsch	790cfc8ad2	Fix upgraded_scroll test (#48525 ) I think the problem is that the master is trying to relocate the "upgraded_scroll" shard back to the node on which it was previously allocated, but to which it can't be allocated now due to the shard lock being held because of an in-progress scroll. As the master keeps on retrying (and does so indefinitely, because max_retries does not apply to relocations), it blocks any other lower-prioritized task from completing, which leads to the rolling upgrade tests failing (see #48395). Closes #48395	2019-10-29 08:10:40 +01:00
Cris da Rocha	947f89a3a1	Update troubleshooting.asciidoc (#48516 )	2019-10-28 18:44:24 -07:00
Mark Vieira	e5c6440a4f	Simplify usage of Gradle Shadow plugin (#48478 ) (#48597 ) This commit simplifies and standardizes our usage of the Gradle Shadow plugin to conform more to plugin conventions. The custom "bundle" plugin has been removed as it's not necessary and performs the same function as the Shadow plugin's default behavior with existing configurations. Additionally, this removes unnecessary creation of a "nodeps" artifact, which is unnecessary because by default project dependencies will in fact use the non-shadowed JAR unless explicitly depending on the "shadow" configuration. Finally, we've cleaned up the logic used for unit testing, so we are now correctly testing against the shadow JAR when the plugin is applied. This better represents a real-world scenario for consumers and provides better test coverage for incorrectly declared dependencies. (cherry picked from commit 3698131109c7e78bdd3a3340707e1c7b4740d310)	2019-10-28 12:11:55 -07:00
Benjamin Trent	6ea59dd428	[ML][Transforms] add wait_for_checkpoint flag to stop (#47935 ) (#48591 ) Adds `wait_for_checkpoint` for `_stop` API.	2019-10-28 13:02:57 -04:00
Gordon Brown	5021410165	Retry on RepositoryException in SLM tests (#48548 ) Due to a bug, GETing a snapshot can cause a RepositoryException to be thrown. This error is transient and should be retried, rather than causing the test to fail. This commit converts those RepositoryExceptions into AssertionErrors so that they will be retried in code wrapped in assertBusy.	2019-10-28 09:24:38 -07:00
Gordon Brown	c353ad71fe	Wrap ResponseException in AssertionError in ILM/CCR tests (#48489 ) When checking for the existence of a document in the ILM/CCR integration tests, `assertDocumentExists` makes an HTTP request and checks the response code. However, if the response code is not successful, the call will throw a `ResponseException`. `assertDocumentExists` is often called inside an `assertBusy`, and wrapping the `ResponseException` in an `AssertionError` will allow the `assertBusy` to retry. In particular, this fixes an issue with `testCCRUnfollowDuringSnapshot` where the index in question may still be closed when the document is requested.	2019-10-28 07:37:52 -07:00
Marios Trivyzas	124f6d098b	SQL: [Tests] Re-enable CliSecurityIT (#48581 ) Seems that the issue has been fixed with: #48098 Closes: #48117 (cherry picked from commit 470362361ffce794a6a12ce7a81a8029ec7d54de)	2019-10-28 15:08:38 +01:00
Przemysław Witek	7e30277a37	Mute RegressionIT.testStopAndRestart (#48575 ) (#48576 )	2019-10-28 13:08:11 +01:00
Rory Hunter	30389c6660	Improve SAML tests resiliency to auto-formatting (#48517 ) Backport of #48452. The SAML tests have large XML documents within which various parameters are replaced. At present, if these tests are auto-formatted, the XML documents get strung out over many, many lines, and are basically illegible. Fix this by using named placeholders for variables, and indent the multiline XML documents. The tests in `SamlSpMetadataBuilderTests` deserve a special mention, because they include a number of certificates in Base64. I extracted these into variables, for additional legibility.	2019-10-27 16:06:23 +00:00
Jim Ferenczi	7fc413c22c	Resolve the role query and the number of docs lazily (#48036 ) This commit ensures that the creation of a DocumentSubsetReader does not eagerly resolve the role query and the number of docs that match. We want to delay this expensive operation in order to ensure that we really need this information when we build it. For this reason the role query and the number of docs are now resolved on demand. This commit also depends on https://issues.apache.org/jira/browse/LUCENE-9003 that will also compute the global number of docs lazily.	2019-10-25 18:12:29 +02:00
Tim Brooks	f5f1072824	Multiple remote connection strategy support (#48496 ) * Extract remote "sniffing" to connection strategy (#47253) Currently the connection strategy used by the remote cluster service is implemented as a multi-step sniffing process in the RemoteClusterConnection. We intend to introduce a new connection strategy that will operate in a different manner. This commit extracts the sniffing logic to a dedicated strategy class. Additionally, it implements dedicated tests for this class. Additionally, in previous commits we moved away from a world where the remote cluster connection was mutable. Instead, when setting updates are made, the connection is torn down and rebuilt. We still had methods and tests hanging around for the mutable behavior. This commit removes those. * Introduce simple remote connection strategy (#47480) This commit introduces a simple remote connection strategy which will open remote connections to a configurable list of user supplied addresses. These addresses can be remote Elasticsearch nodes or intermediate proxies. We will perform normal clustername and version validation, but otherwise rely on the remote cluster to route requests to the appropriate remote node. * Make remote setting updates support diff strategies (#47891) Currently the entire remote cluster settings infrastructure is designed around the sniff strategy. As we introduce an additional connection strategy this infrastructure needs to be modified to support it. This commit modifies the code so that the strategy implementations will tell the service if the connection needs to be torn down and rebuilt. As part of this commit, we will wait 10 seconds for new clusters to connect when they are added through the "update" settings infrastructure.	2019-10-25 09:29:41 -06:00
Peter Dyson	eb44a25899	[DOCS] Reorder bullet items in CCS security docs (#48501 ) Adjust the last bullet item to be above the code block for better readability and to avoid it being skimmed over	2019-10-25 09:11:49 -04:00
Russ Cam	b24bbd4296	Change policy_id to list type in slm.get_lifecycle (#47766 ) This commit changes the REST API spec slm.get_lifecycle's policy_id url part to be of type "list", in line with other REST API specs that accept a comma-separated list of values. Closes #47765	2019-10-25 09:04:25 +10:00
Tim Brooks	c0b545f325	Make BytesReference an interface (#48486 ) BytesReference is currently an abstract class which is extended by various implementations. This makes it very difficult to use the delegation pattern. The implication of this is that our releasable BytesReference is a PagedBytesReference type and cannot be used as a generic releasable bytes reference that delegates to any reference type. This commit makes BytesReference an interface and introduces an AbstractBytesReference for common functionality.	2019-10-24 15:39:30 -06:00
Michael Basnight	d49958cef3	Remove deprecated test from the HLRC tests (#48424 ) The AbstractHlrcWriteableXContentTestCase was replaced by a better test case a while ago, and these are the last two instances using it. They have been converted and the test is now deleted. Ref #39745	2019-10-24 14:02:04 -05:00
Jake Landis	a4614daf46	Allow more time for restart tests to reach yellow state. (#48434 ) (#48480 ) The testWatcher method will on occasion timeout waiting for a yellow cluster state. This change increases the timeout to 60s.	2019-10-24 12:07:02 -05:00
Martijn van Groningen	b034153df7	Change grok watch dog to be Matcher based instead of thread based. (#48346 ) There is a watchdog in order to avoid long running (and expensive) grok expressions. Currently the watchdog is thread based: threads that run grok expressions are registered and, after completion, unregistered. If these threads stay registered for too long then the watchdog interrupts them. Joni (the library that powers grok expressions) has a mechanism that checks whether the current thread is interrupted and, if so, aborts the pattern matching. Newer versions have an additional method to abort long running pattern matching inside joni. Instead of checking the thread's interrupted flag, joni now also checks a volatile field that can be set via a `Matcher` instance. This is a more efficient method for aborting long running matches (joni checks every 30k iterations whether the interrupted flag is set vs. just checking a volatile field). Recently we upgraded to a recent joni version (#47374), and this PR is a followup of that PR. This change should also fix #43673, since it appears that when unit tests are run, a test runner thread's interrupted flag may already have been set, due to some thread reuse.	2019-10-24 15:34:01 +02:00
Dimitrios Liappis	fc1b4ad23c	Mute testCCRUnfollowDuringSnapshot (#48464 ) tracked in #48461 backport of #48462	2019-10-24 15:52:56 +03:00
Przemysław Witek	149537a165	Assert that inference model has been persisted (#48332 ) (#48453 )	2019-10-24 14:18:43 +02:00
Dimitrios Liappis	4d0fb6e551	Mute testBasicTimeBasedRetenion (#48458 ) tracked in #48017 backport of #48456	2019-10-24 14:53:12 +03:00
Hendrik Muhs	ba1c13c47d	[Transform] do not fail checkpoint creation due to global checkpoint mismatch (#48423 ) Take the max if global checkpoints mismatch instead of throwing an exception. It turned out global checkpoints can mismatch by design. Fixes #48379	2019-10-24 12:22:07 +02:00
Ioannis Kakavas	c6b733f1b4	Add populate_user_metadata in OIDC realm (#48357 ) (#48438 ) Make populate_user_metadata configuration parameter available in the OpenID Connect authentication realm Resolves: #48217	2019-10-24 09:51:08 +03:00
Martijn van Groningen	05324b7f03	Muted verifying monitoring integration in enrich integration test. Relates to #48258	2019-10-24 08:39:53 +02:00
Julie Tibshirani	2664cbd20b	Deprecate the sparse_vector field type. (#48368 ) We have not seen much adoption of this experimental field type, and don't see a clear use case as it's currently designed. This PR deprecates the field type in 7.x. It will be removed from 8.0 in a follow-up PR.	2019-10-23 16:35:03 -07:00
Igor Motov	8163e0a9e5	Mute XPackRestIT security/authz/14_cat_indices Mutes "Test empty request while single authorized closed index" Tracked by #47875	2019-10-23 14:17:44 -04:00
Jake Landis	cf175da5a9	Ensure SLM stats does not block an in-place upgrade from 7.4 (… (#48411 ) 7.5+ for SLM requires [stats] object to exist in the cluster state. When doing an in-place upgrade from 7.4 to 7.5+ [stats] does not exist in cluster state, resulting in an exception on startup [1]. This commit makes [stats] an optional object in the parser; if it is not found, it defaults to an empty stats object. [1] Caused by: java.lang.IllegalArgumentException: Required [stats]	2019-10-23 11:21:39 -05:00
Przemyslaw Gomulka	aaa6209be6	[7.x] [Java.time] Calculate week of a year with ISO rules BACKPORT(#48209 ) (#48349 ) Reverting the change introducing IsoLocal.ROOT and introducing IsoCalendarDataProvider that defaults start of the week to Monday and requires a minimum of 4 days in the first week of a year. This extension uses the Java SPI mechanism and applies its defaults for Locale.ROOT only. It requires the JVM property java.locale.providers to be set to SPI,COMPAT. Closes #41670. Backport of #48209	2019-10-23 17:39:38 +02:00
James Rodewig	852622d970	[DOCS] Remove binary gendered language (#48362 )	2019-10-23 09:37:12 -05:00
Ioannis Kakavas	cece5f24f7	Add sections in SAML Troubleshooting (#47964 ) (#48387 ) - Section about the case where the `principal` user property can't be mapped. - Section about when the IdP SAML metadata do not contain a SingleSignOnService that supports HTTP-Redirect binding. Co-Authored-By: Lisa Cawley <lcawley@elastic.co> Co-Authored-By: Tim Vernum <tim@adjective.org>	2019-10-23 17:24:04 +03:00
Ioannis Kakavas	834f2b4546	Add brackets where necessary in error messages (#48140 ) (#48386 ) This commit attempts to help error readability by adding brackets where applicable/missing in saml errors.	2019-10-23 17:23:50 +03:00
Armin Braun	7215201406	Track Shard-Snapshot Index Generation at Repository Root (#48371 ) This change adds a new field `"shards"` to `RepositoryData` that contains a mapping of `IndexId` to a `String[]`. This string array can be accessed by shard id to get the generation of a shard's shard folder (i.e. the `N` in the name of the currently valid `/indices/${indexId}/${shardId}/index-${N}` for the shard in question). This allows for creating a new snapshot in the shard without doing any LIST operations on the shard's folder. In the case of AWS S3, this saves about 1/3 of the cost for updating an empty shard (see #45736) and removes one out of two remaining potential issues with eventually consistent blob stores (see #38941 ... now only the root `index-${N}` is determined by listing). Also and equally if not more important, a number of possible failure modes on eventually consistent blob stores like AWS S3 are eliminated by moving all delete operations to the `master` node and moving from incremental naming of shard level index-N to uuid suffixes for these blobs. This change moves the deleting of the previous shard level `index-${uuid}` blob to the master node instead of the data node allowing for a safe and consistent update of the shard's generation in the `RepositoryData` by first updating `RepositoryData` and then deleting the now unreferenced `index-${newUUID}` blob. __No deletes are executed on the data nodes at all for any operation with this change.__ Note also: Previous issues with hanging data nodes interfering with master nodes are completely impossible, even on S3 (see next section for details). This change changes the naming of the shard level `index-${N}` blobs to a uuid suffix `index-${UUID}`. The reason for this is the fact that writing a new shard-level `index-` generation blob is not atomic anymore in its effect. 
Not only does the blob have to be written to have an effect, it must also be referenced by the root level `index-N` (`RepositoryData`) to become an effective part of the snapshot repository. This leads to a problem if we were to use incrementing names like we did before. If a blob `index-${N+1}` is written but due to the node/network/cluster/... crashes the root level `RepositoryData` has not been updated then a future operation will determine the shard's generation to be `N` and try to write a new `index-${N+1}` to the already existing path. Updates like that are problematic on S3 for consistency reasons, but also create numerous issues when thinking about stuck data nodes. Previously stuck data nodes that were tasked to write `index-${N+1}` but got stuck and tried to do so after some other node had already written `index-${N+1}` were prevented from doing so (except for on S3) by us not allowing overwrites for that blob and thus no corruption could occur. Were we to continue using incrementing names, we could not do this. The stuck node scenario would either allow for overwriting the `N+1` generation or force us to continue using a `LIST` operation to figure out the next `N` (which would make this change pointless). With uuid naming and moving all deletes to `master` this becomes a non-issue. Data nodes write updated shard generation `index-${uuid}` and `master` makes those `index-${uuid}` part of the `RepositoryData` that it deems correct and cleans up all those `index-` that are unused. Co-authored-by: Yannick Welsch <yannick@welsch.lu> Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>	2019-10-23 10:58:26 +01:00
Hendrik Muhs	5ae7453878	[7.6][Transform] blacklist continuous transform tests if upgraded from 7.2.x (#48344 ) blacklist continuous transform tests if upgraded from 7.2.x fixes #48336	2019-10-22 13:16:12 +02:00
Przemysław Witek	60d8ecb2b7	Mute ClassificationIT tests (#48338 ) (#48339 )	2019-10-22 12:45:50 +02:00
Ioannis Kakavas	24e43dfa34	[7.x] Refactor FIPS BootstrapChecks to simple checks (#47499 ) (#48333 ) FIPS 140 bootstrap checks should not be bootstrap checks as they are always enforced. This commit moves the validation logic within the security plugin. The FIPS140SecureSettingsBootstrapCheck was not applicable as the keystore was being loaded on init, before the Bootstrap checks were checked, so an elasticsearch keystore of version < 3 would cause the node to fail in a FIPS 140 JVM before the bootstrap check kicked in, and as such hasn't been migrated. Resolves: #34772	2019-10-22 12:49:01 +03:00
Andrei Stefan	3233b59b68	Add "format" to "range" queries resulted from optimizing a logical AND (#48073 ) (cherry picked from commit 020939a9bd5b34c6d540faa8b3a67b740d661be3)	2019-10-22 10:17:37 +03:00
Hendrik Muhs	1cb3b0cc0d	[7.6][Transform] separate old and mixed rolling upgrade tests (#48302 ) Separates rolling upgrade tests for transforms created on old and mixed clusters and disables testing transforms on mixed clusters for <7.4.	2019-10-22 08:58:02 +02:00
Martijn van Groningen	bbe50eca72	Fail with a better error if there are no ingest nodes (#48272 ) when executing enrich execute policy api.	2019-10-22 07:42:04 +02:00
Martijn van Groningen	0ec0ab64c9	Fix executing enrich policies stats (#48132 ) The enrich stats api picked the wrong task to be displayed in the executing stats section. In case `wait_for_completion` was set to `false` then no task was being displayed and if that param was set to `true` then the wrong task was being displayed (transport action task instead of enrich policy executor task). Testing executing policies in enrich stats api is tricky. I have verified locally that this commit fixes the bug.	2019-10-22 07:41:56 +02:00
Martijn van Groningen	c09b62d5bf	Backport: also validate source index at put enrich policy time (#48311 ) Backport of: #48254 This changes tests to create a valid source index prior to creating the enrich policy.	2019-10-22 07:38:16 +02:00
Nhat Nguyen	d0a4bad95b	Use MultiFileTransfer in CCR remote recovery (#44514 ) Relates #44468	2019-10-21 23:30:52 -04:00
James Baiera	0d12ef8958	Add Enrich Origin (#48098 ) (#48312 ) This PR adds an origin for the Enrich feature, and modifies the background maintenance task to use the origin when executing client operations. Without this fix, the maintenance task fails to execute when security is enabled.	2019-10-21 16:40:49 -04:00
Przemysław Witek	2db2b945ec	[7.x] Change format of MulticlassConfusionMatrix result to be more self-explanatory (#48174 ) (#48294 )	2019-10-21 22:07:19 +02:00
Armin Braun	e65c60915a	Cleanup FileRestoreContext Abstractions (#48173 ) (#48300 ) This class is only used by the blob store repository and CCR and the abstractions didn't really make sense with CCR ignoring the concrete `restoreFiles` method completely and having a method used only by the blobstore overridden as unsupported. => Moved to a more fitting set of abstractions => Dried up the stream wrapping in `BlobStoreRepository` a little now that the `restoreFile` method could be simplified Relates #48110 as it makes changing the API of `FileRestoreContext` to what is needed for async restores simpler	2019-10-21 17:30:35 +02:00
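
The reasoning in commit cf235796c0 (a schedule pinned to May 4 fires once a year, while one pinned to Feb. 31 never fires) can be checked with a short sketch. It uses only Python's standard `calendar` module and is not the Elasticsearch cron parser; the helper name `day_exists` is hypothetical, and the exact replacement schedule string used in the commit is not reproduced here.

```python
import calendar

def day_exists(year: int, month: int, day: int) -> bool:
    """Return True if the given day-of-month exists in that month of that year."""
    # monthrange returns (weekday of the first day, number of days in the month)
    _, days_in_month = calendar.monthrange(year, month)
    return day <= days_in_month

# 400 consecutive years cover one full Gregorian leap-year cycle.
years = range(2019, 2419)

# "1 2 3 4 5 ?" pins day-of-month 4 in month 5: a valid date every year.
may_4_years = [y for y in years if day_exists(y, 5, 4)]

# A schedule pinned to day-of-month 31 in month 2 has no valid date in any year.
feb_31_years = [y for y in years if day_exists(y, 2, 31)]

print(len(may_4_years), len(feb_31_years))
```

Because February has at most 29 days, the second list stays empty for every year in the cycle, which is why such a schedule can serve as a "never run" pattern.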
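
Commits 5021410165 and c353ad71fe both rely on the same idea: a retry helper that only retries on assertion failures will retry a transient client error once that error is converted into an assertion failure. A minimal sketch of that pattern follows, in Python rather than the project's Java. `assert_busy`, `ResponseError`, `make_flaky_request`, and `assert_document_exists` are hypothetical stand-ins written for this illustration, not the Elasticsearch test framework's API.

```python
import time

class ResponseError(Exception):
    """Hypothetical stand-in for a client-side HTTP error."""

def assert_busy(check, timeout=2.0, interval=0.01):
    """Re-run check until it stops raising AssertionError or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return check()
        except AssertionError:
            # Only assertion failures are retried; other exceptions propagate.
            if time.monotonic() >= deadline:
                raise
            time.sleep(interval)

def make_flaky_request(failures):
    """Build a fake request that raises ResponseError for the first `failures` calls."""
    state = {"calls": 0}
    def request():
        state["calls"] += 1
        if state["calls"] <= failures:
            raise ResponseError("index still closed")
        return 200
    return request, state

def assert_document_exists(request):
    """Convert the transient client error into an AssertionError so it is retried."""
    try:
        status = request()
    except ResponseError as e:
        raise AssertionError("request failed, will retry") from e
    assert status == 200

request, state = make_flaky_request(failures=3)
assert_busy(lambda: assert_document_exists(request))
print(state["calls"])  # 4: three wrapped failures, then one success
```

Without the wrapping step, the first `ResponseError` would escape `assert_busy` immediately and fail the test instead of being retried.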

4152 Commits