OpenSearch

Commit Graph

Author	SHA1	Message	Date
Dimitris Athanasiou	f2d4c94a9c	[7.x][ML] Deduplicate multi-fields for data frame analytics (#48799 ) (#48806 ) When multi-fields exist in the source index, we pick all variants of them in our extracted fields detection for data frame analytics. This means we may have multiple instances of the same feature. The worst consequence of this is when the dependent variable (for regression or classification) is also duplicated, which means we train a model on the dependent variable itself. Now that #48770 is merged, this commit adds logic to select only one variant of multi-fields. Closes #48756 Backport of #48799	2019-11-01 16:53:05 +02:00
Tim Vernum	fd4ae697b8	Fix indentation of "except" in role mapping doc "except" is a type of rule, and should be indented accordingly.	2019-11-01 10:46:15 -04:00
Dan Hermann	3604add5c9	[7.x] Validate monitoring username at parse time (#48774 )	2019-11-01 09:02:37 -05:00
Andrei Dan	98a9227588	Fix TimeSeriesLifecycleActionsIT.testRolloverAlreadyExists (#48747 ) (#48795 ) * ILM Test asserts on the same ilm/_explain output With the introduction of retryable steps subsequent ilm/_explain calls can see the state of an ilm cycle move out of the error step. This test made several assertions assuming that the cycle remains in the error step so this commit changes the test to make one _explain call and have all the asserts work on the same ilm state (so subsequent assumptions to the cycle being in the error step are valid). * Drop unused field in test. (cherry picked from commit 44c74bb487151c886a08b27f32b13f7a72056997) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-11-01 12:34:33 +00:00
Dimitris Athanasiou	1f662e0b12	[7.x][ML] Prevent fetching multi-field from source (#48770 ) (#48797 ) Aggregatable multi-fields are at the moment wrongly mapped as normal doc_value fields and thus they support fetching from source. However, they do not exist in the source. This results in a failure to extract such fields. This commit fixes this bug. While a fix could be worked out on top of the existing code, it is evident the extraction logic has become difficult to understand and maintain. As we also want to deduplicate multi-fields for data frame analytics, it seemed appropriate to refactor the code to simplify and better handle the extraction of multi-fields. Relates #48756 Backport of #48770	2019-11-01 14:18:03 +02:00
Andrei Stefan	e1e9b23db8	Cleanup static instance in @AfterClass	2019-10-31 23:24:40 -04:00
Andrei Stefan	2c73c7dfe3	SQL: binary communication implementation for drivers and the CLI (#48261 ) * Introduce binary_format request parameter (binary.format for JDBC) to disable binary communication between clients (jdbc/odbc) and server. * for CLI - "binary" command line parameter (or -b) is introduced. Default value is "true". * binary communication (cbor) is enabled by default * disabling request parameter introduced for debugging purposes only (cherry picked from commit f96a5ca61cb9fad9ed59357320af20e669348ce7)	2019-10-31 20:39:41 -04:00
Tal Levy	4be54402de	[7.x] Add ingest info to Cluster Stats (#48485 ) (#48661 ) * Add ingest info to Cluster Stats (#48485) This commit enhances the ClusterStatsNodes response to include global processor usage stats on a per-processor basis. example output: ``` ... "processor_stats": { "gsub": { "count": 0, "failed": 0, "current": 0, "time_in_millis": 0 }, "script": { "count": 0, "failed": 0, "current": 0, "time_in_millis": 0 } } ... ``` The purpose of this enhancement is to make it easier to collect stats on how specific processors are being used across the cluster, beyond the per-node usage statistics that currently exist in node stats. Closes #46146. * fix BWC of ingest stats The introduction of processor types into IngestStats had a bug. It was set to `null` and set as the key to the map. This would throw an NPE. This commit resolves this by setting all the processor types from previous versions that are not serializing it out to `_NOT_AVAILABLE`.	2019-10-31 14:36:54 -07:00
Lee Hinman	d0ead688c3	[7.x] Fix TimeSeriesLifecycleActionsIT.testExplainFilters (#48… (#48776 ) This test used an index without an alias to simulate a failure in the `check-rollover-ready` step. However, with #48256 that step automatically retries, meaning that the index may not always be in the ERROR step. This commit changes the test to use a shrink action with an invalid number of shards so that it stays in the ERROR step. Resolves #48767	2019-10-31 15:25:12 -06:00
Ioannis Kakavas	99aedc844d	Copy http headers to ThreadContext strictly (#45945 ) (#48675 ) The previous behavior when copying HTTP headers to the ThreadContext allowed multiple HTTP headers with the same name, handling only the first occurrence and disregarding the rest of the values. This can be confusing when dealing with multiple headers, as it is not obvious which value is read and which ones are silently dropped. According to RFC-7230, a client must not send multiple header fields with the same field name in an HTTP message, unless the entire field value for this header is defined as a comma-separated list or this specific header is a well-known exception. This commit changes the behavior to be more compliant with the aforementioned RFC by requiring the classes that implement ActionPlugin to declare whether a header can be multi-valued when registering this header to be copied over to the ThreadContext in ActionPlugin#getRestHeaders. If the header is allowed to be multi-valued, then all such headers are read from the HTTP request and their values get concatenated in a comma-separated string. If the header is not allowed to be multi-valued, and the HTTP request contains multiple such headers with different values, the request is rejected with a 400 status.	2019-10-31 23:05:12 +02:00
Andrey Ershov	088988bb37	GCS snapshot cleanup tool backport to 7.x (#48750 ) This is the backport of #45076 with dependent changes.	2019-10-31 18:21:36 +03:00
Alexander Reelsen	4ecf234617	Upgrade to joda 2.10.4 (#47805 )	2019-10-31 14:49:50 +01:00
emasab	185e067442	SQL: Failing Group By queries due to different ExpressionIds (#43072 ) Fix an issue that arises from the use of ExpressionIds as keys in a lookup map that helps the QueryTranslator to identify the grouping columns. The issue is that the same expression in different parts of the query (SELECT clause and GROUP BY clause) ends up with different ExpressionIds so the lookup fails. So, instead of ExpressionIds use the hashCode() of NamedExpression. Fixes: #41159 Fixes: #40001 Fixes: #40240 Fixes: #33361 Fixes: #46316 Fixes: #36074 Fixes: #34543 Fixes: #37044 Fixes: #42041 (cherry picked from commit 3c38ea555984fcd2c6bf9e39d0f47a01b09e7c48)	2019-10-31 14:49:16 +01:00
Martijn van Groningen	c358ecb5fb	Don't preserve indices between enrich qa tests. This was added because it was suspected to cause the monitoring enrich verification to fail, but that is not the case. See #48258	2019-10-31 14:23:56 +01:00
Andrei Dan	ffe5d5417f	ILM Make the `check-rollover-ready` step retryable (#48256 ) (#48740 ) This adds the infrastructure to be able to retry the execution of retryable steps and makes the `check-rollover-ready` retryable as an initial step to make the rollover action more resilient to transient errors. (cherry picked from commit 454020ac8acb147eae97acb4ccd6fb470d1e5f48) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-10-31 11:28:55 +00:00
Albert Zaharovits	00d3151eea	Document allow_restricted_indices for indices privileges (#47514 ) Document the allow_restricted_indices role descriptor field.	2019-10-31 11:45:11 +02:00
David Roberts	c3063c4e1f	[ML] Make the URL of the ML C++ Ivy repo configurable (#48702 ) At present the ML C++ artifact is always downloaded from S3. This change adds an option to configure the location. (The intention is to use a file:/// URL to pick up the artifact built in a Docker container in ml-cpp PR builds so that C++ changes that will break Java integration tests can be detected before the ml-cpp PRs are merged.) Relates elastic/ml-cpp#766	2019-10-31 09:21:44 +00:00
Dimitris Athanasiou	919596b2e8	[7.x][ML] Move field extraction logic to its own package (#48709 ) (#48712 ) Moves common field extraction logic to its own package so that it can be used both for anomaly detection and data frame analytics. In preparation for refactoring extraction fields to be simpler and to support multi-fields properly. Backport of #48709	2019-10-31 02:41:00 +02:00
Yogesh Gaikwad	c7342dde29	Fix to release system resource after reading JWKSet file (#48666 ) (#48677 ) When we load a JSON Web Key (JWKSet) from the specified file using JWKSet.load it internally uses IOUtils.readFileToString but the opened FileInputStream is never closed after usage. https://bitbucket.org/connect2id/nimbus-jose-jwt/issues/342 This commit reads the file and parses the JWKSet from the string. This also fixes an issue wherein if the underlying file changed, for every change event it would add another file watcher. The change is to only add the file watcher at the start. Closes #44942	2019-10-31 10:16:33 +11:00
Lee Hinman	2d5291cf3b	Un-AwaitsFix and enhance logging for testPolicyCRUD (#48719 ) * Un-AwaitsFix and enhance logging for testPolicyCRUD This removes the `AwaitsFix` and increases the test logging for `SnapshotLifecycleServiceTests.testPolicyCRUD` in an effort to track down the cause of #44997. * Remove unused import	2019-10-30 17:02:57 -06:00
Julie Tibshirani	ae1ef5fd92	Refactor unit tests for vector functions. (#48662 ) This PR performs the following changes: * Split `ScoreScriptUtilsTests` into `DenseVectorFunctionTests` and `SparseVectorFunctionTests`. This will make it easier to delete all sparse vector function tests once we remove support on 8.x. * As much as possible, break up the large test methods into individual tests for each vector function (`cosineSimilarity`, `l2norm`, etc.).	2019-10-30 15:36:06 -07:00
Lee Hinman	ed2bb73de2	Fix SnapshotLifecycleService logger (#48711 ) The logger was erroneously using the `SnapshotLifecycleMetadata` class for its initialization, making it hard to target packages for logging levels since `SnapshotLifecycleMetadata` is in a different package.	2019-10-30 13:13:50 -06:00
Benjamin Trent	c9ead80c31	[7.x] [ML][Inference] separating definition and config object storage (#48651 ) (#48695 ) * [ML][Inference] separating definition and config object storage (#48651) This separates out the `definition` object from being stored within the configuration object in the index. This allows us to gather the config object without decompressing a potentially large definition. Additionally, `input` is moved to the TrainedModelConfig object and out of the definition. This is so the trained input fields are accessible outside the potentially large model definition.	2019-10-30 13:27:29 -04:00
Lee Hinman	72a601c47f	[7.x] Don't schedule SLM jobs when services have been stopped… (#48692 ) This adds a guard for the SLM lifecycle and retention service that prevents new jobs from being scheduled once the service has been stopped. Previously, if the node were shut down the service would be stopped, but a cluster state or local master election would cause a job to attempt to be scheduled. This could lead to an uncaught `RejectedExecutionException`. Resolves #47749	2019-10-30 09:46:35 -06:00
Armin Braun	52e5ceb321	Restore from Individual Shard Snapshot Files in Parallel (#48110 ) (#48686 ) Make restoring shard snapshots run in parallel on the `SNAPSHOT` thread-pool.	2019-10-30 14:36:30 +01:00
Yogesh Gaikwad	9ed7352a12	Add Sysprop to Adjust IO Buffer Size (#48267 ) (#48667 ) The 1MB IO-buffer size per transport thread is causing trouble in some tests, albeit at a low rate. Reducing the number of transport threads was not enough to fully fix this situation. Allowing to configure the size of the buffer and reducing it by more than an order of magnitude should fix these tests. Closes #46803	2019-10-30 14:19:54 +11:00
Yogesh Gaikwad	1b64c1992a	Add owner flag parameter to the rest spec (#48500 ) This commit adds missing info about the newly added `owner` flag to the rest spec, and also adds a rest test for the same. Closes #48499	2019-10-30 13:07:01 +11:00
Julie Tibshirani	89c65752dc	Update the signature of vector script functions. (#48653 ) Previously the functions accepted a doc values reference, whereas they now accept the name of the vector field. Here's an example of how a vector function was called before and after the change. ``` Before: cosineSimilarity(params.query_vector, doc['field']) After: cosineSimilarity(params.query_vector, 'field') ``` This seems more intuitive, since we don't allow direct access to vector doc values and the meaning of `doc['field']` is unclear. The PR makes the following changes (broken into distinct commits): * Add new function signatures of the form `function(params.query_vector, 'field')` and deprecate the old ones. Because Painless doesn't allow two methods with the same name and number of arguments, we allow a generic `Object` to be passed in to the function and decide on the behavior through an `instanceof` check. * Refactor the class bindings so that the document field is passed to the constructor instead of the instance method. This allows us to avoid retrieving the vector doc values on every function invocation, which gives a tiny speed-up in benchmarks. Note that this PR adds new signatures for the sparse vector functions too, even though sparse vectors are deprecated. It seemed simplest to understand (for both us and users) to keep everything symmetric between dense and sparse vectors.	2019-10-29 15:46:05 -07:00
Gordon Brown	25724c5c46	Adjust date parsing in ILM integration tests (#48648 ) The format returned by the API is not always parsable with `Instant.parse()`, so this commit adjusts to parsing those dates as `ISO_ZONED_DATE_TIME` instead, which appears to always parse the returned value correctly.	2019-10-29 15:44:04 -07:00
Gordon Brown	50d7424e7d	Unmute and increase logging on flaky SLM tests (#48612 ) The failures in these tests have been remarkably difficult to track down, in part because they will not reproduce locally. This commit unmutes the flaky tests and increases logging, as well as introducing some additional logging, to attempt to pin down the failures.	2019-10-29 13:39:19 -07:00
Andrei Dan	8b22e297ed	ILM open/close steps are noop if idx is open/close (#48614 ) (#48640 ) The open and close follower steps didn't check if the index is open, closed respectively, before executing the open/close request. This changes the steps to check the index state and only perform the open/close operation if the index is not already open/closed.	2019-10-29 17:43:56 +00:00
Lisa Cawley	be9df101bf	[DOCS] Adds missing references to oidc realms (#48224 )	2019-10-29 09:41:34 -07:00
Gordon Brown	cf235796c0	Use more reliable "never run" cron pattern in tests (#48608 ) The cron schedule "1 2 3 4 5 ?" will run every May 4 at 03:02:01, which may result in unnecessary test failures once a year. This commit switches out uses of that schedule in tests for one which will never execute (because it specifies a day which doesn't exist, Feb. 31). Also factors the schedule out to a constant to make the intent clearer.	2019-10-29 09:33:14 -07:00
Przemysław Witek	7c944d26c5	[7.x] Assert that the results of classification analysis can be evaluated using _evaluate API. (#48626 ) (#48634 )	2019-10-29 16:20:56 +01:00
Ioannis Kakavas	a0362153e2	Update oauth2-oidc-sdk and nimbus-jose-jwt (#48537 ) (#48628 ) Update two dependencies for our OpenID Connect realm implementation to their latest versions	2019-10-29 14:18:59 +02:00
Yannick Welsch	790cfc8ad2	Fix upgraded_scroll test (#48525 ) I think the problem is that the master is trying to relocate the "upgraded_scroll" shard back to the node on which it was previously allocated, but to which it can't be allocated now due to the shard lock being held because of an in-progress scroll. As the master keeps on retrying (and does so indefinitely, because max_retries does not apply to relocations), it blocks any other lower-prioritized task from completing, which leads to the rolling upgrade tests failing (see #48395). Closes #48395	2019-10-29 08:10:40 +01:00
Cris da Rocha	947f89a3a1	Update troubleshooting.asciidoc (#48516 )	2019-10-28 18:44:24 -07:00
Mark Vieira	e5c6440a4f	Simplify usage of Gradle Shadow plugin (#48478 ) (#48597 ) This commit simplifies and standardizes our usage of the Gradle Shadow plugin to conform more to plugin conventions. The custom "bundle" plugin has been removed as it's not necessary and performs the same function as the Shadow plugin's default behavior with existing configurations. Additionally, this removes unnecessary creation of a "nodeps" artifact, which is unnecessary because by default project dependencies will in fact use the non-shadowed JAR unless explicitly depending on the "shadow" configuration. Finally, we've cleaned up the logic used for unit testing, so we are now correctly testing against the shadow JAR when the plugin is applied. This better represents a real-world scenario for consumers and provides better test coverage for incorrectly declared dependencies. (cherry picked from commit 3698131109c7e78bdd3a3340707e1c7b4740d310)	2019-10-28 12:11:55 -07:00
Benjamin Trent	6ea59dd428	[ML][Transforms] add wait_for_checkpoint flag to stop (#47935 ) (#48591 ) Adds `wait_for_checkpoint` for `_stop` API.	2019-10-28 13:02:57 -04:00
Gordon Brown	5021410165	Retry on RepositoryException in SLM tests (#48548 ) Due to a bug, GETing a snapshot can cause a RepositoryException to be thrown. This error is transient and should be retried, rather than causing the test to fail. This commit converts those RepositoryExceptions into AssertionErrors so that they will be retried in code wrapped in assertBusy.	2019-10-28 09:24:38 -07:00
Gordon Brown	c353ad71fe	Wrap ResponseException in AssertionError in ILM/CCR tests (#48489 ) When checking for the existence of a document in the ILM/CCR integration tests, `assertDocumentExists` makes an HTTP request and checks the response code. However, if the response code is not successful, the call will throw a `ResponseException`. `assertDocumentExists` is often called inside an `assertBusy`, and wrapping the `ResponseException` in an `AssertionError` will allow the `assertBusy` to retry. In particular, this fixes an issue with `testCCRUnfollowDuringSnapshot` where the index in question may still be closed when the document is requested.	2019-10-28 07:37:52 -07:00
Marios Trivyzas	124f6d098b	SQL: [Tests] Re-enable CliSecurityIT (#48581 ) It seems that the issue has been fixed with: #48098 Closes: #48117 (cherry picked from commit 470362361ffce794a6a12ce7a81a8029ec7d54de)	2019-10-28 15:08:38 +01:00
Przemysław Witek	7e30277a37	Mute RegressionIT.testStopAndRestart (#48575 ) (#48576 )	2019-10-28 13:08:11 +01:00
Rory Hunter	30389c6660	Improve SAML tests resiliency to auto-formatting (#48517 ) Backport of #48452. The SAML tests have large XML documents within which various parameters are replaced. At present, if these tests are auto-formatted, the XML documents get strung out over many, many lines, and are basically illegible. Fix this by using named placeholders for variables, and indent the multiline XML documents. The tests in `SamlSpMetadataBuilderTests` deserve a special mention, because they include a number of certificates in Base64. I extracted these into variables, for additional legibility.	2019-10-27 16:06:23 +00:00
Jim Ferenczi	7fc413c22c	Resolve the role query and the number of docs lazily (#48036 ) This commit ensures that the creation of a DocumentSubsetReader does not eagerly resolve the role query and the number of docs that match. We want to delay this expensive operation in order to ensure that we really need this information when we build it. For this reason the role query and the number of docs are now resolved on demand. This commit also depends on https://issues.apache.org/jira/browse/LUCENE-9003 that will also compute the global number of docs lazily.	2019-10-25 18:12:29 +02:00
Tim Brooks	f5f1072824	Multiple remote connection strategy support (#48496 ) * Extract remote "sniffing" to connection strategy (#47253) Currently the connection strategy used by the remote cluster service is implemented as a multi-step sniffing process in the RemoteClusterConnection. We intend to introduce a new connection strategy that will operate in a different manner. This commit extracts the sniffing logic to a dedicated strategy class. Additionally, it implements dedicated tests for this class. Additionally, in previous commits we moved away from a world where the remote cluster connection was mutable. Instead, when setting updates are made, the connection is torn down and rebuilt. We still had methods and tests hanging around for the mutable behavior. This commit removes those. * Introduce simple remote connection strategy (#47480) This commit introduces a simple remote connection strategy which will open remote connections to a configurable list of user supplied addresses. These addresses can be remote Elasticsearch nodes or intermediate proxies. We will perform normal clustername and version validation, but otherwise rely on the remote cluster to route requests to the appropriate remote node. * Make remote setting updates support diff strategies (#47891) Currently the entire remote cluster settings infrastructure is designed around the sniff strategy. As we introduce an additional connection strategy this infrastructure needs to be modified to support it. This commit modifies the code so that the strategy implementations will tell the service if the connection needs to be torn down and rebuilt. As part of this commit, we will wait 10 seconds for new clusters to connect when they are added through the "update" settings infrastructure.	2019-10-25 09:29:41 -06:00
Peter Dyson	eb44a25899	[DOCS] Reorder bullet items in CCS security docs (#48501 ) Adjust the last bullet item to be above the code block for better readability and to avoid it being skimmed over	2019-10-25 09:11:49 -04:00
Russ Cam	b24bbd4296	Change policy_id to list type in slm.get_lifecycle (#47766 ) This commit changes the REST API spec slm.get_lifecycle's policy_id url part to be of type "list", in line with other REST API specs that accept a comma-separated list of values. Closes #47765	2019-10-25 09:04:25 +10:00
Tim Brooks	c0b545f325	Make BytesReference an interface (#48486 ) BytesReference is currently an abstract class which is extended by various implementations. This makes it very difficult to use the delegation pattern. The implication of this is that our releasable BytesReference is a PagedBytesReference type and cannot be used as a generic releasable bytes reference that delegates to any reference type. This commit makes BytesReference an interface and introduces an AbstractBytesReference for common functionality.	2019-10-24 15:39:30 -06:00
Michael Basnight	d49958cef3	Remove deprecated test from the HLRC tests (#48424 ) The AbstractHlrcWriteableXContentTestCase was replaced by a better test case a while ago, and these are the last two instances using it. They have been converted and the test is now deleted. Ref #39745	2019-10-24 14:02:04 -05:00
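
The header-copying rule described in commit 99aedc844d can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Elasticsearch implementation (which is Java and registers headers through `ActionPlugin#getRestHeaders`). The names `copy_headers`, `BadRequest`, and the `registered` mapping are hypothetical, and header names are compared case-sensitively for brevity.

```python
from typing import Dict, Iterable, List, Tuple


class BadRequest(Exception):
    """Hypothetical stand-in for rejecting the request with HTTP 400."""

    status = 400


def copy_headers(
    request_headers: Iterable[Tuple[str, str]],
    registered: Dict[str, bool],
) -> Dict[str, str]:
    """Copy registered headers into a plain dict (the "context").

    `registered` maps a header name to True if it may be multi-valued.
    Headers that are not registered are ignored.
    """
    # Group every occurrence of each registered header, preserving order.
    collected: Dict[str, List[str]] = {}
    for name, value in request_headers:
        if name in registered:
            collected.setdefault(name, []).append(value)

    context: Dict[str, str] = {}
    for name, values in collected.items():
        if registered[name]:
            # Multi-valued: concatenate all values into one comma-separated string.
            context[name] = ",".join(values)
        elif len(set(values)) > 1:
            # Single-valued but sent with differing values: reject the request.
            raise BadRequest(f"multiple values for single-valued header [{name}]")
        else:
            context[name] = values[0]
    return context
```

For example, `copy_headers([("X-Trace", "a"), ("X-Trace", "b")], {"X-Trace": True})` yields `{"X-Trace": "a,b"}`, while two different values for a header registered as single-valued raise `BadRequest`.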

4180 Commits