OpenSearch

Commit Graph

Author	SHA1	Message	Date
Armin Braun	0acba44a2e	Make Repository.getRepositoryData an Async API (#49299 ) (#49312 ) This API call in most implementations is fairly IO heavy and slow so it is more natural to be async in the first place. Concretely though, this change is a prerequisite of #49060 since determining the repository generation from the cluster state introduces situations where this call would have to wait for other operations to finish. Doing so in a blocking manner would break `SnapshotResiliencyTests` and waste a thread. Also, this sets up the possibility to in the future make use of async IO where provided by the underlying Repository implementation. In a follow-up `SnapshotsService#getRepositoryData` will be made async as well (did not do it here, since it's another huge change to do so). Note: This change for now does not alter the threading behaviour in any way (since `Repository#getRepositoryData` isn't forking) and is purely mechanical.	2019-11-19 16:49:12 +01:00
Marios Trivyzas	fd1bb4a33a	SQL: Fix issue with mins & hours for DATEDIFF (#49252 ) Previously, DATEDIFF for minutes and hours was doing a rounding calculation using all the time fields (secs, msecs/micros/nanos). Instead it should first truncate the 2 dates to the respective field (mins or hours) zeroing out all the more detailed time fields and then make the subtraction. (cherry picked from commit 124cd18e20429e19d52fd8dc383827ea5132d428)	2019-11-19 14:25:28 +01:00
Benjamin Trent	19602fd573	[ML][Inference] changing setting to be memorySizeSettting (#49259 ) (#49302 )	2019-11-19 07:56:40 -05:00
Przemysław Witek	38aec2e298	Relax assertions related to datafeed timing stats in .yml test (#49285 ) (#49291 )	2019-11-19 12:50:14 +01:00
David Roberts	a5204c1c80	[ML] Fixes for stop datafeed edge cases (#49284 ) The following edge cases were fixed: 1. A request to force-stop a stopping datafeed is no longer ignored. Force-stop is an important recovery mechanism if normal stop doesn't work for some reason, and needs to operate on a datafeed in any state other than stopped. 2. If the node that a datafeed is running on is removed from the cluster during a normal stop then the stop request is retried (and will likely succeed on this retry by simply cancelling the persistent task for the affected datafeed). 3. If there are multiple simultaneous force-stop requests for the same datafeed we no longer fail the one that is processed second. The previous behaviour was wrong as stopping a stopped datafeed is not an error, so stopping a datafeed twice simultaneously should not be either. Backport of #49191	2019-11-19 10:51:46 +00:00
Lisa Cawley	abd4a70b10	[DOCS] Merges duplicate pages for Kerberos realms (#49207 )	2019-11-18 15:23:06 -08:00
Lisa Cawley	b4f82c9cdb	[DOCS] Merges duplicate pages for LDAP realms (#49203 )	2019-11-18 14:09:24 -08:00
Julie Tibshirani	a0ee6c8f7e	Add telemetry for flattened fields. (#48972 ) (#49125 ) Currently we just record the number of flattened fields defined in the mappings.	2019-11-18 12:29:42 -08:00
Lisa Cawley	b0054eecd6	[DOCS] Merges duplicate pages for file realms (#49200 )	2019-11-18 12:02:18 -08:00
Benjamin Trent	eefe7688ce	[7.x][ML] ML Model Inference Ingest Processor (#49052 ) (#49257 ) * [ML] ML Model Inference Ingest Processor (#49052) * [ML][Inference] adds lazy model loader and inference (#47410) This adds a couple of things: - A model loader service that is accessible via transport calls. This service will load in models and cache them. They will stay loaded until a processor no longer references them - A Model class and its first sub-class LocalModel. Used to cache model information and run inference. - Transport action and handler for requests to infer against a local model Related Feature PRs: * [ML][Inference] Adjust inference configuration option API (#47812) * [ML][Inference] adds logistic_regression output aggregator (#48075) * [ML][Inference] Adding read/del trained models (#47882) * [ML][Inference] Adding inference ingest processor (#47859) * [ML][Inference] fixing classification inference for ensemble (#48463) * [ML][Inference] Adding model memory estimations (#48323) * [ML][Inference] adding more options to inference processor (#48545) * [ML][Inference] handle string values better in feature extraction (#48584) * [ML][Inference] Adding _stats endpoint for inference (#48492) * [ML][Inference] add inference processors and trained models to usage (#47869) * [ML][Inference] add new flag for optionally including model definition (#48718) * [ML][Inference] adding license checks (#49056) * [ML][Inference] Adding memory and compute estimates to inference (#48955) * fixing version of indexed docs for model inference	2019-11-18 13:19:17 -05:00
Lisa Cawley	48f53efd9a	[DOCS] Merges duplicate pages for SAML realms (#49209 )	2019-11-18 10:09:29 -08:00
Armin Braun	25cc8e3663	Fix RepoCleanup not Removed on Master-Failover (#49217 ) (#49239 ) The logic for `cleanupInProgress()` was backwards everywhere (method itself and all but one user). Also, we weren't checking it when removing a repository. This led to a bug (in the one spot that didn't use the method backwards) that prevented the cleanup cluster state entry from ever being removed from the cluster state if master failed over during the cleanup process. This change corrects the backwards logic, adds a test that makes sure the cleanup is always removed and adds a check that prevents repository removal during cleanup to the repositories service. Also, the failure handling logic in the cleanup action was broken. Repeated invocation would lead to the cleanup being removed from the cluster state even if it was in progress. Fixed by adding a flag that indicates whether or not any removal of the cleanup task from the cluster state must be executed. Sorry for mixing this in here, but I had to fix it in the same PR, as the first test (for master-failover) otherwise would often just delete the blocked cleanup action as a result of a transport master action retry.	2019-11-18 16:44:09 +01:00
Przemysław Witek	5f9965e4b8	Lower minimum model memory limit value from 1MB to 1kB. (#49227 ) (#49242 )	2019-11-18 14:58:20 +01:00
Hendrik Muhs	ca912624ec	[Transform] improve error handling of script errors (#48887 ) Improve error handling for script errors by treating them as irrecoverable errors, which puts the task immediately into the failed state. Also improve the error extraction to properly report the script error. Fixes #48467	2019-11-18 10:24:39 +01:00
Tanguy Leroux	fcac3fbfd9	AutoFollowIT should not rely on assertBusy but should use latches instead (#49141 ) AutoFollowIT relies on assertBusy() calls to wait for a given number of leader indices to be created but this is prone to failures on CI. Instead, we should use latches to indicate when auto-follow patterns must be paused and resumed.	2019-11-18 09:40:56 +01:00
Dimitris Athanasiou	805c31e19e	[7.x][ML] Avoid NPE when node load is calculated on job assignment (#49186 ) (#49214 ) This commit fixes a NPE problem as reported in #49150. But this problem uncovered that we never added proper handling of state for data frame analytics tasks. In this commit we improve the `MlTasks.getDataFrameAnalyticsState` method to handle null tasks and state tasks properly. Closes #49150 Backport of #49186	2019-11-18 10:33:07 +02:00
Przemysław Witek	150db2b544	Throw an exception when memory usage estimation endpoint encounters empty data frame. (#49143 ) (#49164 )	2019-11-18 07:52:57 +01:00
Jason Tedor	60d1d67aac	CCR should auto-retry rejected execution exceptions (#49213 ) If CCR encounters a rejected execution exception, today we treat this as fatal. It is not fatal, though, as the stuffed queue could drain. Requiring an administrator to manually restart the follow tasks that faced such an exception is a burden. This commit addresses this by making CCR auto-retry on rejected execution exceptions.	2019-11-17 12:48:46 -05:00
Lisa Cawley	09a9ec4d23	[DOCS] Merges duplicate pages for native realms (#49198 )	2019-11-15 15:35:53 -08:00
Mayya Sharipova	0e933a093d	Add index name to search requests (#49175 ) We can't guarantee the expected request failures if a search request spans many indices: even if the expected shards fail, some indices may return 200. Closes #47743	2019-11-15 16:39:18 -05:00
Jay Modi	57f57227ac	Clean up static web server in sql-client tests (#49187 ) (#49197 ) The JdbcHttpClientRequestTests and HttpClientRequestTests classes both hold a static reference to a mock web server that internally uses the JDKs built-in HttpServer, which resides in a sun package that the RamUsageEstimator does not have access to. This causes builds that use a runtime of Java 8 to fail since the StaticFieldsInvariantRule is run when Java 8 is used. Relates #41526 Relates #49105	2019-11-15 13:02:21 -07:00
Lisa Cawley	bc6a9de2dd	[DOCS] Edits the get tokens API (#45312 )	2019-11-15 10:54:07 -08:00
Lee Hinman	680436dd0d	[7.x] Don't halt policy execution on policy trigger exception… (#49171 ) When triggered either by becoming master, a new cluster state, or a periodic schedule, an ILM policy execution through `maybeRunAsyncAction`, `runPolicyAfterStateChange`, or `runPeriodicStep` throwing an exception will cause the loop to terminate. This means that any indices that would have been processed after the index where the exception was thrown will not be processed by ILM. For most executions this is not a problem because the actual running of steps is protected by a try/catch that moves the index to the ERROR step in the event of a problem. If an exception occurs prior to step execution (for example, in fetching and parsing the current policy/step) however, it causes the loop termination previously mentioned. This commit wraps the invocation of the methods specified above in a try/catch block that provides better logging and does not bubble the exception up.	2019-11-15 09:22:37 -07:00
Albert Zaharovits	89b3c32b40	Audit log filter and marker (#49145 ) This adds a log marker and a marker filter for the audit log. Closes #47251	2019-11-15 08:44:09 -05:00
Christos Soulios	d9f0245b10	[7.x] Implement stats aggregation for string terms (#49097 ) Backport of #47468 to 7.x This PR adds a new metric aggregation called string_stats that operates on string terms of a document and returns the following: min_length: The length of the shortest term max_length: The length of the longest term avg_length: The average length of all terms distribution: The probability distribution of all characters appearing in all terms entropy: The total Shannon entropy value calculated for all terms This aggregation has been implemented as an analytics plugin.	2019-11-15 14:36:21 +02:00
Andrei Dan	085d08cfd1	ILM Remove obsolete testRolloverAlreadyExists (#49104 ) (#49144 ) The rollover action is now a retryable step (see #48256) so ILM will keep retrying until it succeeds as opposed to stopping and moving the execution in the ERROR step. Fixes #49073 (cherry picked from commit 3ae90898121b43032ec8f3b50514d93a86e14d0f) Signed-off-by: Andrei Dan <andrei.dan@elastic.co> # Conflicts: # x-pack/plugin/ilm/qa/multi-node/src/test/java/org/elasticsearch/xpack/ilm/TimeSeriesLifecycleActionsIT.java	2019-11-15 12:06:22 +00:00
Ioannis Kakavas	f5f0e1366a	Handle unexpected/unchecked exceptions correctly (#49080 ) (#49137 ) Ensures that methods that are called from different threads ( i.e. from the callbacks of org.apache.http.concurrent.FutureCallback ) catch `Exception` instead of only the expected checked exceptions. This resolves a bug where OpenIdConnectAuthenticator#mergeObjects would throw an IllegalStateException that was never caught causing the thread to hang and the listener to never be called. This would in turn cause Kibana requests to authenticate with OpenID Connect to timeout and fail without even logging anything relevant. This also guards against unexpected Exceptions that might be thrown by invoked library methods while performing the necessary operations in these callbacks.	2019-11-15 11:54:08 +02:00
James Baiera	6bb6adb8d3	Reuse collected cluster state in EnrichPolicyRunner (#48488 ) (#49100 ) The cluster state is obtained twice in the EnrichPolicyRunner when updating the final alias. There is a possibility for the state to be slightly different between those two calls. This PR just has the function get the cluster state once and reuse it for the life of the function call.	2019-11-14 14:14:39 -05:00
Dan Hermann	cac9fe4d86	[7.x] Validate monitoring password at parse time (#49083 )	2019-11-14 09:39:28 -06:00
Dimitris Athanasiou	be5894ed9c	[7.x][SQL] Mute JdbcConfigurationTests.testDriverConfigurationWithSSLInURL (#49085 ) (#49086 ) Relates #41557	2019-11-14 15:15:55 +02:00
Rory Hunter	c46a0e8708	Apply 2-space indent to all gradle scripts (#49071 ) Backport of #48849. Update `.editorconfig` to make the Java settings the default for all files, and then apply a 2-space indent to all `*.gradle` files. Then reformat all the files.	2019-11-14 11:01:23 +00:00
Marios Trivyzas	7c3198ba44	SQL: [Tests] Mute testReplaceChildren for Pivot (#49045 ) Temporarily "mute" the testReplaceChildren for Pivot since it leads to failing tests for some seeds, since the new child doesn't respond to a valid data type. Relates to #48900 (cherry picked from commit 6200a2207b9a4264d2f3fc976577323c7e084317)	2019-11-14 11:30:33 +01:00
Armin Braun	25e05b0013	Fix X-Pack SchedulerEngine Shutdown (#48951 ) (#49054 ) We can have a race here where `scheduleNextRun` executes concurrently to `stop` and so we run into a `RejectedExecutionException` that we don't catch and thus it fails tests. => Fixed by ignoring these so long as they coincide with a scheduler shutdown	2019-11-13 22:06:55 +01:00
Przemysław Witek	e6ad3c29fd	Do not throw exceptions resulting from persisting datafeed timing stats. (#49044 ) (#49050 )	2019-11-13 20:23:13 +01:00
Henning Andersen	66f0c8900f	Fix Transport Stopped Exception (#48930 ) (#49035 ) When a node shuts down, `TransportService` moves to stopped state and then closes connections. If a request is done in between, an exception was thrown that was not retried in replication actions. Now throw a wrapped `NodeClosedException` exception instead, which is correctly handled in replication action. Fixed other usages too. Relates #42612	2019-11-13 18:48:05 +01:00
Tanguy Leroux	e86b598813	Fix AutoFollowIT (#49025 ) This commit fixes an off-by-one bug in the AutoFollowIT test that causes failures because the leaderIndices counter is incremented during the evaluation of the leaderIndices.incrementAndGet() < 20 condition but the 20th index is not created, making the final assertion not verified. It also gives a bit more time for cluster state updates to be processed on the follower cluster. Closes #48982	2019-11-13 13:20:57 +01:00
Ioannis Kakavas	4405042900	Remove unnecessary details logged for OIDC (#48746 ) (#49031 ) This commit removes unnecessary details logged for OIDC. Co-Authored-By: Ioannis Kakavas <ikakavas@protonmail.com>	2019-11-13 13:43:56 +02:00
Yannick Welsch	2dfa0133d5	Always use primary term from primary to index docs on replica (#47583 ) Ensures that we always use the primary term established by the primary to index docs on the replica. Makes the logic around replication less brittle by always using the operation primary term on the replica that is coming from the primary.	2019-11-13 12:13:45 +01:00
Ioannis Kakavas	e0331e2a0f	Remove limitation for SAML encryption in FIPS mode (#48948 ) (#49019 ) Our documentation regarding FIPS 140 claimed that when using SAML in a JVM that is configured in FIPS approved only mode, one could not use encrypted assertions. This stemmed from a wrong understanding regarding the compliance of RSA-OAEP which is used as the key wrapping algorithm for encrypting the key with which the SAML Assertion is encrypted. However, as stated for instance in https://downloads.bouncycastle.org/fips-java/BC-FJA-SecurityPolicy-1.0.0.pdf RSA-OAEP is approved for key transport, so this limitation is not effective. This change removes the limitation from our FIPS 140 related documentation.	2019-11-13 12:10:01 +02:00
Julie Tibshirani	37fa3fb4ff	Ensure parameters are updated when merging flattened mappings. (#48971 ) (#49014 ) This PR makes the following two fixes around updating flattened fields: * Make sure that the new value for ignore_above takes effect immediately. Previously we recorded the new value but did not use it when parsing documents. * Allow depth_limit to be updated dynamically. It seems plausible that a user might want to tweak this setting as they encounter more data.	2019-11-12 21:50:39 -05:00
Lee Hinman	5eb37c29fe	[7.x] Re-read policy phase JSON when using ILM's move-to-step… (#49011 ) When using the move-to-step API, we should reread the phase JSON from the latest version of the ILM policy. This allows a user to move to the same step while re-reading the policy's latest version. For example, when changing rollover criteria. While manually messing around with some other things I discovered that we only reread the policy when using the retry API, not the move-to-step API. This commit changes the move-to-step API to always read the latest version of the policy.	2019-11-12 19:41:06 -07:00
Martijn van Groningen	18d5d73305	Enable spotless for enrich gradle project in 7 dot x branch. (#48976 ) Backport of #48908 The enrich project doesn't have as much history as the other gradle projects, so it makes sense to enable spotless for this gradle project.	2019-11-12 13:22:34 +01:00
Armin Braun	ea9f094e75	Significantly Lower Monitoring HttpExport Memory Footprint (#48854 ) (#48966 ) The `HttpExportBulk` exporter is using a lot more memory than it needs to by allocating buffers for serialization and IO: * Remove copying of all bytes when flushing, instead use the stream wrapper * Remove copying step turning the BAOS into a `byte[]` * This also avoids the allocation of a single huge `byte[]` and instead makes use of the internal paging logic of the `BytesStreamOutput` * Don't allocate a new BAOS for every document, just keep appending to a single BAOS	2019-11-12 08:49:40 +01:00
Jake Landis	c320b499a0	Prevent deadlock by using separate schedulers (#48697 ) (#48964 ) Currently the BulkProcessor class uses a single scheduler to schedule flushes and retries. Functionally these are very different concerns but can result in a deadlock. Specifically, the single shared scheduler can kick off a flush task, which only finishes its task when the bulk that is being flushed finishes. If (for whatever reason) any items in that bulk fail, it will (by default) schedule a retry. However, that retry will never run its task, since the flush task is consuming the 1 and only thread available from the shared scheduler. Since the BulkProcessor is mostly client based code, the client can provide their own scheduler. As-is the scheduler would require at minimum 2 worker threads to avoid the potential deadlock. Since the number of threads is a configuration option in the scheduler, the code cannot enforce this 2 worker rule until runtime. For this reason this commit splits the single task scheduler into 2 schedulers. This eliminates the potential for the flush task to block the retry task and removes this deadlock scenario. This commit also deprecates the Java APIs that presume a single scheduler, and updates any internal code to no longer use those APIs. Fixes #47599 Note - #41451 fixed the general case where a bulk fails and is retried that can result in a deadlock. This fix should address that case as well as the case when a bulk failure from the flush needs to be retried.	2019-11-11 16:31:21 -06:00
Benjamin Trent	46ab1db54f	[7.x] [ML] Add new geo_results.(actual_point\|typical_point) fields for `lat_long` results (#47050 ) (#48958 ) * [ML] Add new geo_results.(actual_point\|typical_point) fields for `lat_long` results (#47050) [ML] Add new geo_results.(actual_point\|typical_point) fields for `lat_long` results (#47050) Related PR: https://github.com/elastic/ml-cpp/pull/809 * adjusting bwc version	2019-11-11 15:43:03 -05:00
Jake Landis	909fbd0015	[7.x] Mute FullClusterRestartTest#testWatcher and 30s timeout… (#48850 ) The timeout was increased to 60s to allow this test more time to reach a yellow state. However, the test will still on occasion fail even with the 60s timeout. Related: #48381 Related: #48434 Related: #47950 Related: #40178	2019-11-11 09:38:14 -06:00
Christoph Büscher	6119f0aaa2	Fix Eclipse compilation in DataFrameDataExtractorTests (#48942 )	2019-11-11 16:17:55 +01:00
Martijn van Groningen	a1dd830cb5	Re-enabled test with longer timeout waiting for monitoring. See #48258	2019-11-11 16:07:50 +01:00
Yannick Welsch	af887be3e5	Hide orphaned tasks from follower stats (#48901 ) CCR follower stats can return information for persistent tasks that are in the process of being cleaned up. This is problematic for tests where CCR follower indices have been deleted, but their persistent follower task is only cleaned up asynchronously afterwards. If one of the following tests then accesses the follower stats, it might still get the stats for that follower task. In addition, some tests were not cleaning up their auto-follow patterns, leaving orphaned patterns behind. Other tests cleaned up their auto-follow patterns. As the same name was always used, it just depended on the test execution order whether this led to a failure or not. This commit fixes the offending tests, and will also automatically remove auto-follow-patterns at the end of tests, like we do for many other features. Closes #48700	2019-11-08 13:56:53 +01:00
Dan Hermann	5805560a2a	Validate index name time format setting at parse time (#47911 ) (#48881 )	2019-11-07 05:24:49 -06:00
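
Note on fd1bb4a33a (SQL DATEDIFF for minutes and hours): the message describes truncating both dates to the requested field before subtracting. The following is a minimal sketch of that idea using only `java.time`, not the actual Elasticsearch SQL code; the class and method names are hypothetical, and `diffFullPrecision` is included only for contrast (it is not claimed to be the previous implementation).

```java
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;

public class DateDiffSketch {

    // Truncate both timestamps to the unit (zeroing all finer fields), then subtract.
    static long diffTruncated(ZonedDateTime start, ZonedDateTime end, ChronoUnit unit) {
        return unit.between(start.truncatedTo(unit), end.truncatedTo(unit));
    }

    // For contrast only: counts complete units between the full-precision timestamps.
    static long diffFullPrecision(ZonedDateTime start, ZonedDateTime end, ChronoUnit unit) {
        return unit.between(start, end);
    }

    public static void main(String[] args) {
        ZonedDateTime start = ZonedDateTime.parse("2019-11-19T10:59:59Z");
        ZonedDateTime end = ZonedDateTime.parse("2019-11-19T11:00:01Z");

        // Two seconds apart, but one minute boundary and one hour boundary are crossed.
        System.out.println(diffTruncated(start, end, ChronoUnit.MINUTES));     // prints 1
        System.out.println(diffTruncated(start, end, ChronoUnit.HOURS));       // prints 1
        System.out.println(diffFullPrecision(start, end, ChronoUnit.MINUTES)); // prints 0
    }
}
```

With truncation, the result counts how many field boundaries were crossed, independent of the seconds and sub-second parts.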

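
Note on d9f0245b10 (string_stats aggregation): the message lists min_length, max_length, avg_length, distribution, and entropy. The sketch below computes those five values over an in-memory list of terms. It is an illustration under stated assumptions, not the plugin's code: the class name is hypothetical, length is taken as `String.length()` (UTF-16 code units), and entropy is assumed to be the base-2 Shannon entropy of the character distribution.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class StringStatsSketch {
    final int minLength;
    final int maxLength;
    final double avgLength;
    final Map<Character, Double> distribution; // character -> probability
    final double entropy;                      // assumed: Shannon entropy in bits

    StringStatsSketch(List<String> terms) {
        if (terms.isEmpty()) {
            throw new IllegalArgumentException("at least one term is required");
        }
        int min = Integer.MAX_VALUE;
        int max = 0;
        long totalChars = 0;
        Map<Character, Long> counts = new TreeMap<>();
        for (String term : terms) {
            min = Math.min(min, term.length());
            max = Math.max(max, term.length());
            totalChars += term.length();
            for (char c : term.toCharArray()) {
                counts.merge(c, 1L, Long::sum);
            }
        }
        Map<Character, Double> dist = new TreeMap<>();
        double h = 0.0;
        for (Map.Entry<Character, Long> e : counts.entrySet()) {
            double p = (double) e.getValue() / totalChars;
            dist.put(e.getKey(), p);
            h -= p * (Math.log(p) / Math.log(2)); // -sum(p * log2(p))
        }
        this.minLength = min;
        this.maxLength = max;
        this.avgLength = (double) totalChars / terms.size();
        this.distribution = dist;
        this.entropy = h;
    }

    public static void main(String[] args) {
        StringStatsSketch stats = new StringStatsSketch(Arrays.asList("a", "abb"));
        System.out.println(stats.minLength);    // prints 1
        System.out.println(stats.maxLength);    // prints 3
        System.out.println(stats.avgLength);    // prints 2.0
        System.out.println(stats.distribution); // prints {a=0.5, b=0.5}
        System.out.println(stats.entropy);      // prints 1.0
    }
}
```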

4242 Commits
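
Note on e86b598813 (Fix AutoFollowIT): the off-by-one described in the message arises because `incrementAndGet()` bumps the counter even on the evaluation that ends the loop. The sketch below reproduces the pattern in isolation. The class and method names are hypothetical, and `fixedLoop` shows only one possible correction, which may differ from the change made in the commit.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class IncrementAndGetOffByOne {

    // Returns {final counter value, number of loop bodies executed}.
    static int[] buggyLoop(int limit) {
        AtomicInteger counter = new AtomicInteger(0);
        int created = 0;
        // The condition itself increments the counter, including on the failing check.
        while (counter.incrementAndGet() < limit) {
            created++; // stands in for "create a leader index"
        }
        return new int[] { counter.get(), created };
    }

    // One possible correction: test first, increment only after the work is done.
    static int[] fixedLoop(int limit) {
        AtomicInteger counter = new AtomicInteger(0);
        int created = 0;
        while (counter.get() < limit) {
            created++;
            counter.incrementAndGet();
        }
        return new int[] { counter.get(), created };
    }

    public static void main(String[] args) {
        int[] buggy = buggyLoop(20);
        int[] fixed = fixedLoop(20);
        System.out.println(buggy[0] + " counted, " + buggy[1] + " created"); // prints 20 counted, 19 created
        System.out.println(fixed[0] + " counted, " + fixed[1] + " created"); // prints 20 counted, 20 created
    }
}
```

In the buggy variant an assertion that expects the counter to equal the number of created items cannot hold, which matches the failure mode the message describes.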