OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-19 19:35:02 +00:00

Author	SHA1	Message	Date
Armin Braun	af0f97d50a	Fix SLMSnapshotBlockingIntegTests.testSnapshotInProgress (#49533 ) (#49542 ) This test must check for state `SUCCESS` as well. `SUCCESS` in `SnapshotsInProgress` means "all data nodes finished snapshotting successfully but master must still finalize the snapshot in the repo". `SUCCESS` does not mean that the snapshot is actually fully finished in this object. You can easily reproduce the scenario in #49303 that has an in-progress snapshot in `SUCCESS` state by waiting 20s before running the busy assert loop on the snapshot status so that all steps but the blocked finalization can finish. Closes #49303	2019-11-25 13:31:45 +01:00
Dimitris Athanasiou	c149c64dc4	[7.x][ML] Apply source query on data frame analytics memory estimation (#49517 ) (#49532 ) Closes #49454 Backport of #49517	2019-11-25 12:51:57 +02:00
Hendrik Muhs	5256756879	[Transform] add debug log for configuration index (#49484 ) add debug log for transform creation and disallow partial results for retrieval	2019-11-25 09:49:17 +01:00
debadair	2ec047db04	[DOCS] Rename auditing topic. Closes #49012 (#49013 ) * [DOCS] Rename auditing topic. Closes #49012 * Fixed file name, fixed settings link. * Add link to settings	2019-11-22 14:16:58 -08:00
Dimitris Athanasiou	8eaee7cbdc	[7.x][ML] Explain data frame analytics API (#49455 ) (#49504 ) This commit replaces the _estimate_memory_usage API with a new API, the _explain API. The API consolidates information that is useful before creating a data frame analytics job. It includes: - memory estimation - field selection explanation Memory estimation is moved here from what was previously calculated in the _estimate_memory_usage API. Field selection is a new feature that explains to the user whether each available field was selected to be included or not in the analysis. In the case it was not included, it also explains the reason why. Backport of #49455	2019-11-22 22:06:10 +02:00
Jason Tedor	71bcfbf1e3	Replace required pipeline with final pipeline (#49470 ) This commit enhances the required pipeline functionality by changing it so that default/request pipelines can also be executed, but the required pipeline is always executed last. This gives users the flexibility to execute their own indexing pipelines, but also ensure that any required pipelines are also executed. Since such pipelines are executed last, we change the name of required pipelines to final pipelines.	2019-11-22 14:37:36 -05:00
Marios Trivyzas	0c4491964b	SQL: Fix issue with folding of CASE/IIF (#49449 ) Add extra checks to prevent the ConstantFolding rule from trying to fold the CASE/IIF functions early, before the SimplifyCase rule gets applied. Fixes: #49387 (cherry picked from commit f35c9725350e35985d8dd3001870084e1784a5ca)	2019-11-22 18:29:49 +01:00
Benjamin Trent	276b6c67f4	[ML][Inference] Fixing pre-processor value handling and size estimate (#49270 ) (#49489 ) * [ML][Inference] Fixing pre-processor value handling and size estimate * fixing npe	2019-11-22 08:14:33 -05:00
Jim Ferenczi	ed4eecc00e	Pre-sort shards based on the max/min value of the primary sort field (#49092 ) This change automatically pre-sorts search shards on search requests that use a primary sort based on the value of a field. When possible, the can_match phase will extract the min/max (depending on the provided sort order) values of each shard and use them to pre-sort the shards prior to running the subsequent phases. This feature can be useful to ensure that shards that contain recent data are executed first so that intermediate merges have a better chance of containing contiguous data (think of date_histogram for instance), but it could also be used in a follow-up to early terminate sorted top-hits queries that don't require the total hit count. The latter could significantly speed up the retrieval of the most/least recent documents from time-based indices. Relates #49091	2019-11-22 11:02:12 +01:00
Hendrik Muhs	1fbb248cb7	reenable warning checks in pivot tests (#49436 )	2019-11-22 08:50:10 +01:00
Tim Vernum	2e5f2dd1e1	Deprecate misconfigured SSL server config (#49280 ) This commit adds a deprecation warning when starting a node where either of the server contexts (xpack.security.transport.ssl and xpack.security.http.ssl) meet either of these conditions: 1. The server lacks a certificate/key pair (i.e. neither ssl.keystore.path nor ssl.certificate are configured) 2. The server has some ssl configuration, but ssl.enabled is not specified. This new validation does not care whether ssl.enabled is true or false (though other validation might), it simply makes it an error to configure server SSL without being explicit about whether to enable that configuration. Backport of: #45892	2019-11-22 12:14:55 +11:00
Benjamin Trent	a7477ad7c3	[7.x] [ML][Inference] compressing model definition and lazy parsing (#49269 ) (#49446 ) * [ML][Inference] compressing model definition and lazy parsing (#49269) * [ML][Inference] compressing model definition and lazy parsing * addressing PR comments * adding commons io * implementing simplified bounded stream * adjusting for type inclusion	2019-11-21 15:32:32 -05:00
Benjamin Trent	d9835f7fb4	[ML] Fix r_squared eval when variance is 0 (#49439 ) (#49445 )	2019-11-21 11:22:16 -05:00
Benjamin Trent	d41b2e3f38	[ML][Inference] allowing per-model licensing (#49398 ) (#49435 ) * [ML][Inference] allowing per-model licensing * changing to internal action + removing premature opt	2019-11-21 09:46:34 -05:00
Przemysław Witek	c7ac2011eb	[7.x] Implement accuracy metric for multiclass classification (#47772 ) (#49430 )	2019-11-21 15:01:18 +01:00
Martijn van Groningen	d59ea64ccd	Monitoring should wait with collecting data when cluster service is started. (#49426 ) Backport of #48277 Otherwise integration tests may fail if the monitoring interval is low: ``` [2019-10-21T09:57:25,527][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [integTest-0] fatal error in thread [elasticsearch[integTest-0][generic][T#4]], exiting java.lang.AssertionError: initial cluster state not set yet at org.elasticsearch.cluster.service.ClusterApplierService.state(ClusterApplierService.java:208) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at org.elasticsearch.cluster.service.ClusterService.state(ClusterService.java:125) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at org.elasticsearch.xpack.monitoring.MonitoringService$MonitoringExecution$1.doRun(MonitoringService.java:231) ~[?:?] at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) ~[?:?] at java.util.concurrent.FutureTask.run(FutureTask.java:264) ~[?:?] at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:703) ~[elasticsearch-7.6.0-SNAPSHOT.jar:7.6.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) ~[?:?] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) ~[?:?] at java.lang.Thread.run(Thread.java:835) [?:?] ``` I ran into this when lowering the monitoring interval when investigating enrich monitoring test: #48258	2019-11-21 14:22:41 +01:00
Hendrik Muhs	c3e4405ddf	[7.x][Transform] Transform fix force stop race condition (#49249 ) (#49420 ) fix force stopping transform if indexer state hasn't been written and/or is set to STOPPED. In certain situations the transform could not be stopped, which means the task could not be removed. Introduces improved abstraction in order to better test state handling in future.	2019-11-21 13:52:14 +01:00
Andrei Dan	010c3de47e	Slm set operation mode to RUNNING on first run (#49236 ) (#49425 ) * SLM set the operation mode to RUNNING on first run Set the SLM operation mode to RUNNING when setting the first SLM lifecycle policy. Historically, SLM was not decoupled from ILM but now they are independent components. Setting the SLM operation mode to what the ILM running mode was when we set the first SLM lifecycle policy was a remnant of those times. * SLM update package info * SLM suppress unused warning * SLM use logger for the correct class * SLM Add integration test for operation mode * Use ESSingleNodeTestCase instead of ESIntegTestCase (cherry picked from commit 4ad3d93f89d03bf9a25685a990d1a439f33ce0e6) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-11-21 11:41:32 +00:00
István Zoltán Szabó	5b10fd301e	[DOCS] Fixes endpoint schema in PUT app privileges API docs. (#49390 )	2019-11-21 09:52:44 +01:00
Lisa Cawley	61c54fd617	[DOCS] Qualifies Watcher transforms (#47482 )	2019-11-20 16:44:18 -08:00
Nhat Nguyen	fec22130c2	Improve error message when pausing index (#48915 ) Throw an appropriate error message when the follower index is not found or is a regular index.	2019-11-20 15:58:44 -05:00
Hendrik Muhs	06c2689802	rename data frame tests to transform tests (#49361 ) rename files and tests in rolling upgrade tests to transform	2019-11-20 18:51:11 +01:00
Bogdan Pintea	8c2ab8bb72	SQL:Docs: add the PIVOT clause to SELECT section (#49129 ) The PR adds the documentation on the PIVOT clause. (cherry picked from commit a55b36065e6496c44b6e3191296931d477a8e5f5)	2019-11-20 18:21:06 +01:00
David Roberts	20558cf61c	[ML] Fix simultaneous stop and force stop datafeed (#49367 ) If a datafeed is stopped normally and force stopped at the same time then it is possible that the force stop removes the persistent task while the normal stop is performing actions. Currently this causes the normal stop to error, but since stopping a stopped datafeed is not an error this doesn't make sense. Instead the force stop should just take precedence. This is a followup to #49191 and should really have been included in the changes in that PR.	2019-11-20 12:52:47 +00:00
Mayya Sharipova	e3da60c23d	Increase the number of vector dims to 2048 (#46895 )	2019-11-20 07:47:33 -05:00
Przemysław Witek	9c0ec7ce23	[7.x] Make AnalyticsProcessManager class more robust (#49282 ) (#49356 )	2019-11-20 10:08:16 +01:00
Dimitris Athanasiou	4d6e037e90	[7.x][ML] Extract creation of DFA field extractor into a factory (#49315 ) (#49329 ) This commit moves the async calls required to retrieve the components that make up `ExtractedFieldsExtractor` out of `DataFrameDataExtractorFactory` and into a dedicated `ExtractorFieldsExtractorFactory` class. A few more refactorings are performed: - The detector no longer needs the results field. Instead, it knows whether to use it or not based on whether the task is restarting. - We pass more accurately whether the task is restarting or not. - The validation of whether fields that have a cardinality limit are valid is now performed in the detector after retrieving the respective cardinalities. Backport of #49315	2019-11-20 10:02:42 +02:00
Lisa Cawley	2b9fb7ebe2	[DOCS] Merges security overview pages (#49342 )	2019-11-19 16:19:02 -08:00
Przemysław Witek	42bb8ae525	[7.x] Extract indexData method out of RegressionIT tests (#49306 ) (#49313 )	2019-11-19 22:47:12 +01:00
Mark Tozzi	17358b5af7	(refactor) Extract Empty/Script/Missing ValuesSource behavior to an interface (#48320 ) (#49330 ) This is a pure code rearrangement refactor. Logic for what specific ValuesSource instance to use for a given type (e.g. script or field) moved out of ValuesSourceConfig and into CoreValuesSourceType (previously just ValuesSourceType; we extract an interface for future extensibility). ValuesSourceConfig still selects which case to use, and then the ValuesSourceType instance knows how to construct the ValuesSource for that case.	2019-11-19 16:44:29 -05:00
Lisa Cawley	75f1f612c2	[DOCS] Merges duplicate pages for Active Directory realms (#49205 )	2019-11-19 13:18:01 -08:00
Jay Modi	eed4cd25eb	ThreadPool and ThreadContext are not closeable (#43249 ) (#49273 ) This commit changes the ThreadContext to just use a regular ThreadLocal over the lucene CloseableThreadLocal. The CloseableThreadLocal solves issues with ThreadLocals that are no longer needed during runtime but in the case of the ThreadContext, we need it for the runtime of the node and it is typically not closed until the node closes, so we miss out on the benefits that this class provides. Additionally by removing the close logic, we simplify code in other places that deal with exceptions and tracking to see if it happens when the node is closing. Closes #42577	2019-11-19 13:15:16 -07:00
Lisa Cawley	c4c8a7a43c	[DOCS] Merges duplicate pages for PKI realms (#49206 )	2019-11-19 10:51:09 -08:00
Lisa Cawley	2f5acae4a9	[DOCS] Groups pages related to encrypting communications (#49324 )	2019-11-19 10:10:39 -08:00
Lisa Cawley	62bbe419d3	[DOCS] Removes Beats security page (#49276 )	2019-11-19 09:15:30 -08:00
Andrei Dan	19780e20ba	Handle failure to retrieve ILM policy step better (#49193 ) (#49316 ) This commit wraps the calls to retrieve the current step in a try/catch so that the exception does not bubble up. Instead, step info is added containing the exception to the existing step. Semi-related to #49128 (cherry picked from commit 72530f8a7f40ae1fca3704effb38cf92daf29057) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-11-19 17:14:46 +00:00
Armin Braun	0acba44a2e	Make Repository.getRepositoryData an Async API (#49299 ) (#49312 ) This API call in most implementations is fairly IO heavy and slow so it is more natural to be async in the first place. Concretely though, this change is a prerequisite of #49060 since determining the repository generation from the cluster state introduces situations where this call would have to wait for other operations to finish. Doing so in a blocking manner would break `SnapshotResiliencyTests` and waste a thread. Also, this sets up the possibility to in the future make use of async IO where provided by the underlying Repository implementation. In a follow-up `SnapshotsService#getRepositoryData` will be made async as well (did not do it here, since it's another huge change to do so). Note: This change for now does not alter the threading behaviour in any way (since `Repository#getRepositoryData` isn't forking) and is purely mechanical.	2019-11-19 16:49:12 +01:00
Marios Trivyzas	fd1bb4a33a	SQL: Fix issue with mins & hours for DATEDIFF (#49252 ) Previously, DATEDIFF for minutes and hours was doing a rounding calculation using all the time fields (secs, msecs/micros/nanos). Instead it should first truncate the 2 dates to the respective field (mins or hours) zeroing out all the more detailed time fields and then make the subtraction. (cherry picked from commit 124cd18e20429e19d52fd8dc383827ea5132d428)	2019-11-19 14:25:28 +01:00
Benjamin Trent	19602fd573	[ML][Inference] changing setting to be memorySizeSetting (#49259 ) (#49302 )	2019-11-19 07:56:40 -05:00
Przemysław Witek	38aec2e298	Relax assertions related to datafeed timing stats in .yml test (#49285 ) (#49291 )	2019-11-19 12:50:14 +01:00
David Roberts	a5204c1c80	[ML] Fixes for stop datafeed edge cases (#49284 ) The following edge cases were fixed: 1. A request to force-stop a stopping datafeed is no longer ignored. Force-stop is an important recovery mechanism if normal stop doesn't work for some reason, and needs to operate on a datafeed in any state other than stopped. 2. If the node that a datafeed is running on is removed from the cluster during a normal stop then the stop request is retried (and will likely succeed on this retry by simply cancelling the persistent task for the affected datafeed). 3. If there are multiple simultaneous force-stop requests for the same datafeed we no longer fail the one that is processed second. The previous behaviour was wrong as stopping a stopped datafeed is not an error, so stopping a datafeed twice simultaneously should not be either. Backport of #49191	2019-11-19 10:51:46 +00:00
Lisa Cawley	abd4a70b10	[DOCS] Merges duplicate pages for Kerberos realms (#49207 )	2019-11-18 15:23:06 -08:00
Lisa Cawley	b4f82c9cdb	[DOCS] Merges duplicate pages for LDAP realms (#49203 )	2019-11-18 14:09:24 -08:00
Julie Tibshirani	a0ee6c8f7e	Add telemetry for flattened fields. (#48972 ) (#49125 ) Currently we just record the number of flattened fields defined in the mappings.	2019-11-18 12:29:42 -08:00
Lisa Cawley	b0054eecd6	[DOCS] Merges duplicate pages for file realms (#49200 )	2019-11-18 12:02:18 -08:00
Benjamin Trent	eefe7688ce	[7.x][ML] ML Model Inference Ingest Processor (#49052 ) (#49257 ) * [ML] ML Model Inference Ingest Processor (#49052) * [ML][Inference] adds lazy model loader and inference (#47410) This adds a couple of things: - A model loader service that is accessible via transport calls. This service will load in models and cache them. They will stay loaded until a processor no longer references them - A Model class and its first sub-class LocalModel. Used to cache model information and run inference. - Transport action and handler for requests to infer against a local model Related Feature PRs: * [ML][Inference] Adjust inference configuration option API (#47812) * [ML][Inference] adds logistic_regression output aggregator (#48075) * [ML][Inference] Adding read/del trained models (#47882) * [ML][Inference] Adding inference ingest processor (#47859) * [ML][Inference] fixing classification inference for ensemble (#48463) * [ML][Inference] Adding model memory estimations (#48323) * [ML][Inference] adding more options to inference processor (#48545) * [ML][Inference] handle string values better in feature extraction (#48584) * [ML][Inference] Adding _stats endpoint for inference (#48492) * [ML][Inference] add inference processors and trained models to usage (#47869) * [ML][Inference] add new flag for optionally including model definition (#48718) * [ML][Inference] adding license checks (#49056) * [ML][Inference] Adding memory and compute estimates to inference (#48955) * fixing version of indexed docs for model inference	2019-11-18 13:19:17 -05:00
Lisa Cawley	48f53efd9a	[DOCS] Merges duplicate pages for SAML realms (#49209 )	2019-11-18 10:09:29 -08:00
Armin Braun	25cc8e3663	Fix RepoCleanup not Removed on Master-Failover (#49217 ) (#49239 ) The logic for `cleanupInProgress()` was backwards everywhere (method itself and all but one user). Also, we weren't checking it when removing a repository. This led to a bug (in the one spot that didn't use the method backwards) that prevented the cleanup cluster state entry from ever being removed from the cluster state if master failed over during the cleanup process. This change corrects the backwards logic, adds a test that makes sure the cleanup is always removed and adds a check that prevents repository removal during cleanup to the repositories service. Also, the failure handling logic in the cleanup action was broken. Repeated invocation would lead to the cleanup being removed from the cluster state even if it was in progress. Fixed by adding a flag that indicates whether or not any removal of the cleanup task from the cluster state must be executed. Sorry for mixing this in here, but I had to fix it in the same PR, as the first test (for master-failover) otherwise would often just delete the blocked cleanup action as a result of a transport master action retry.	2019-11-18 16:44:09 +01:00
Przemysław Witek	5f9965e4b8	Lower minimum model memory limit value from 1MB to 1kB. (#49227 ) (#49242 )	2019-11-18 14:58:20 +01:00
Hendrik Muhs	ca912624ec	[Transform] improve error handling of script errors (#48887 ) improve error handling for script errors, treating them as irrecoverable errors, which puts the task immediately into failed state; also improves the error extraction to properly report the script error. fixes #48467	2019-11-18 10:24:39 +01:00
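
The DATEDIFF fix listed above (fd1bb4a33a, #49252 ) describes truncating both dates to the requested field (minutes or hours) before subtracting, instead of rounding over all finer time fields. A minimal Java sketch of that idea follows; the class `DateDiffSketch` and the helper `diffTruncated` are hypothetical names used only for illustration and are not the Elasticsearch SQL implementation.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Illustration only: hypothetical class, not code from the commit.
class DateDiffSketch {

    // Truncate both timestamps to the given unit (zeroing all finer fields),
    // then count how many whole units lie between the truncated values.
    static long diffTruncated(LocalDateTime start, LocalDateTime end, ChronoUnit unit) {
        return unit.between(start.truncatedTo(unit), end.truncatedTo(unit));
    }

    public static void main(String[] args) {
        LocalDateTime start = LocalDateTime.of(2019, 11, 19, 10, 59, 59);
        LocalDateTime end = LocalDateTime.of(2019, 11, 19, 11, 0, 1);

        // Only two seconds elapsed, but a minute and an hour boundary were crossed.
        System.out.println(diffTruncated(start, end, ChronoUnit.MINUTES)); // 1
        System.out.println(diffTruncated(start, end, ChronoUnit.HOURS));   // 1

        // Without truncation, elapsed whole minutes would be 0.
        System.out.println(ChronoUnit.MINUTES.between(start, end));        // 0
    }
}
```

Under this reading, the result counts unit boundaries crossed rather than elapsed whole units, which is why two timestamps two seconds apart can still differ by one minute or one hour.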

4478 Commits