OpenSearch

Commit Graph

Author	SHA1	Message	Date
Jake Landis	a370d5eead	[7.x] Ensure Joni warnings are logged at debug (#57302) (#57897) When Joni, the regex engine that powers grok, emits a warning, it does so by default to System.err. System.err logs are all bucketed together in the server log at WARN level. When Joni emits a warning, it can be extremely verbose, logging a message for each execution against that pattern. For ingest node, that means for every document that is run through Grok. Fortunately, Joni provides a callback hook to push these warnings to a custom location. This commit implements Joni's callback hook to push the Joni warnings to the Elasticsearch server logger (logger.org.elasticsearch.ingest.common.GrokProcessor) at debug level. Generally these warnings indicate a possible issue with the regular expression, and upon creation the Grok processor will do a "test run" of the expression and log the result (if any) at WARN level. This WARN level log should only occur on pipeline creation, which is a much lower frequency than every document. Additionally, the documentation is updated with instructions for how to set the logger to debug level.	2020-06-09 17:06:29 -05:00
Benjamin Trent	d5522c2747	[ML] add new circuit breaker for inference model caching (#57731) (#57830) This adds a new plugin-level circuit breaker for the ML plugin. `model_inference` is the circuit breaker qualified name. Right now it simply adds to the breaker when the model is loaded (possibly breaking) and removes from the breaker when the model is unloaded.	2020-06-08 16:02:48 -04:00
Mayya Sharipova	70e63a365a	Refactor how to determine if a field is metafield (#57378) (#57771) Previously, to determine whether a field is a meta-field, the static MapperService method isMetadataField was used. This method relied on an outdated static list of meta-fields. This PR instead changes this method to an instance method that is also aware of meta-fields in all registered plugins. Related #38373, #41656 Closes #24422	2020-06-08 09:16:18 -04:00
David Kyle	08d1286de7	[7.x] Delete expired data by job (#57337) (#57796) Deleting expired data can take a long time leading to timeouts if there are many jobs. Often the problem is due to a few large jobs which prevent the regular maintenance of the remaining jobs. This change adds a job_id parameter to the delete expired data endpoint to help clean up those problematic jobs.	2020-06-08 13:00:23 +01:00
David Roberts	1d64d55a86	[7.x][ML] Add per-partition categorization option (#57723) This PR adds the initial Java side changes to enable use of the per-partition categorization functionality added in elastic/ml-cpp#1293. There will be a followup change to complete the work, as there cannot be any end-to-end integration tests until elastic/ml-cpp#1293 is merged, and also elastic/ml-cpp#1293 does not implement some of the more peripheral functionality, like stop_on_warn and per-partition stats documents. The changes so far cover REST APIs, results object formats, HLRC and docs. Backport of #57683	2020-06-06 08:15:17 +01:00
Benjamin Trent	9666a895f7	[ML] inference performance optimizations and refactor (#57674) (#57753) This is a major refactor of the underlying inference logic. The main refactor is that we are now separating the model configuration and the inference interfaces. This has the following benefits: - we can store extra things with the model that are not necessary for inference (e.g. treenode split information gain) - we can optimize inference separate from model serialization and storage. - The user is oblivious to the optimizations (other than seeing the benefits). A major part of this commit is removing all inference related methods from the trained model configurations (ensemble, tree, etc.) and moving them to a new class. This new class satisfies a new interface that is ONLY for inference. The optimizations applied currently are: - feature maps are flattened once - feature extraction only happens once at the highest level (improves inference + feature importance throughput) - Only storing what we need for inference + feature importance on heap	2020-06-05 14:20:58 -04:00
Dimitris Athanasiou	f49a14ce6f	[7.x][ML] Fix race condition when force stopping DF analytics job (#57680) (#57717) When we force delete a DF analytics job, we currently first force stop it and then we proceed with deleting the job config. This may result in logging errors if the job config is deleted before it is retrieved while the job is starting. Instead of force stopping the job, it would make more sense to try to stop the job gracefully first. So we now try that out first. If normal stop fails, then we resort to force stopping the job to ensure we can go through with the delete. In addition, this commit introduces `timeout` for the delete action and makes use of it in the child requests. Backport of #57680	2020-06-05 17:50:01 +03:00
William Brafford	dfb6def3da	Revert "Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383)" This reverts commit `7a67fb2d04`.	2020-06-04 16:25:05 -04:00
William Brafford	7a67fb2d04	Restore xpack.ilm.enabled and xpack.slm.enabled settings (#57383) In #55592 and #55416, we deprecated the settings for enabling and disabling basic license features and turned those settings into no-ops. Since doing so, we've had feedback that this change may not give users enough time to cleanly switch from non-ILM index management tools to ILM. If two index managers operate simultaneously, results could be strange and difficult to reconstruct. We don't know of any cases where SLM will cause a problem, but we are restoring that setting as well, to be on the safe side. This PR is not a strict commit reversion. First, we are keeping the new xpack.watcher.use_ilm_index_management setting, introduced when xpack.ilm.enabled was made a no-op, so that users can begin migrating to using it. Second, the SLM setting was modified in the same commit as a group of other settings, so I have taken just the changes relating to SLM.	2020-06-04 13:38:22 -04:00
Przemysław Witek	6b5f49d097	[7.x] Introduce ModelPlotConfig.annotations_enabled setting (#57539) (#57641)	2020-06-04 15:15:35 +02:00
Benjamin Trent	ea9b8b9d41	[ML] fix setting forecasts to failed method (#57654) (#57656)	2020-06-04 08:54:46 -04:00
Przemysław Witek	ea6cfb7c3d	[7.x] Make Annotation a result type (#56342) (#57508)	2020-06-02 11:56:41 +02:00
Przemysław Witek	ceb4b29b98	Introduce Annotation.event field (#57144) (#57453)	2020-06-01 20:42:25 +02:00
David Kyle	064093c4d4	Fix compilation after backport of #57278	2020-06-01 12:03:13 +01:00
Przemysław Witek	72ad9a4548	[7.x] Make AnnotationPersister use bulk requests instead of indexing individual documents (#57278) (#57354)	2020-06-01 12:05:09 +02:00
Benjamin Trent	34f1e0b6bb	[7.x] [ML] mark forecasts for force closed/failed jobs as failed (#57143) (#57374) * [ML] mark forecasts for force closed/failed jobs as failed (#57143) forecasts that are still running should be marked as failed/finished in the following scenarios: - Job is force closed - Job is re-assigned to another node. Forecasts are not "resilient". Their execution does not continue after a node failure. Consequently, forecasts marked as STARTED or SCHEDULED should be flagged as failed. These forecasts can then be deleted. Additionally, force closing a job kills the native task directly. This means that if a forecast was running, it is not allowed to complete and could still have the status of `STARTED` in the index. relates to https://github.com/elastic/elasticsearch/issues/56419	2020-05-29 14:48:10 -04:00
Benjamin Trent	35d5126cea	[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351) (#57368) * [ML] adds new for_export flag to GET _ml/inference API (#57351) Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API. This flag is useful for moving models between clusters.	2020-05-29 14:01:08 -04:00
Benjamin Trent	c8374dc9f3	[ML] add max_model_memory parameter to forecast request (#57254) (#57355) This adds a max_model_memory setting to forecast requests. This setting can take a string value that is formatted according to byte sizes (e.g. "50mb", "150mb"). The default value is `20mb`. There is a HARD limit at `500mb` which will throw an error if used. If the limit is larger than 40% of the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited. related native change: https://github.com/elastic/ml-cpp/pull/1238 closes: https://github.com/elastic/elasticsearch/issues/56420	2020-05-29 11:16:08 -04:00
Dimitris Athanasiou	322f953060	[7.x][ML] Anomaly detection jobs should allow missing values for geo fields (#57300) (#57338) Allows geo fields (`geo_point`, `geo_shape`) to have missing values. Fixes a bug where such missing values would result in an error. Closes #57299 Backport of #57300	2020-05-29 13:06:16 +03:00
Benjamin Trent	24d605e41e	[ML] fixing GET _ml/inference so size param is respected (#57303) (#57308) `size` was previously ignored when grabbing full trained model configs. closes https://github.com/elastic/elasticsearch/issues/57298	2020-05-28 15:45:26 -04:00
David Roberts	d139a79ef6	[7.x][ML] Fix monitoring if orphaned anomaly detector persistent tasks exist (#57240) Since #51888 the ML job stats endpoint has returned entries for jobs that have a persistent task but no job config. Such orphaned tasks caused monitoring to fail. This change ignores any such corrupt jobs for monitoring purposes. Backport of #57235	2020-05-27 22:59:11 +01:00
Benjamin Trent	decc6277f9	[ML] allow unran/incomplete forecasts to be deleted for stopped/failed jobs (#57152) (#57172) If a job is NOT opened, forecasts should be able to be deleted, no matter their state. This also fixes a bug with expanding forecast IDs. We should check for wildcard `*` and `_all` when expanding the ids. closes https://github.com/elastic/elasticsearch/issues/56419	2020-05-26 15:44:22 -04:00
David Kyle	571477d0ad	[7.x] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041) (#57136) Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041) The queries performed by the expired data removers pull back entire documents when only a few fields are required. For ModelSnapshots in particular this is a problem as they contain quantiles which may be 100s of KB and the search size is set to 10,000. This change makes the search more efficient by only requesting the fields needed to work out which expired data should be deleted.	2020-05-26 10:56:42 +01:00
Przemysław Witek	ea2012778e	Mute failing test (#57112) (#57113)	2020-05-25 14:06:29 +02:00
Benjamin Trent	f00dfb2d5f	[ML] adds WKT support in filestructurefinder (#57014) (#57032) Field mapping detection is done via grok patterns. This commit adds well-known text (WKT) formatted geometry detection. If everything is a `POINT`, then a `geo_point` mapping is preferred. Otherwise, if all the fields are WKT geometries, a `geo_shape` mapping is preferred. This does NOT detect other types of formatted geometries (geohash, comma delimited points, etc.) closes https://github.com/elastic/elasticsearch/issues/56967	2020-05-21 08:22:51 -04:00
Alan Woodward	18bfbeda29	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:43:13 +01:00
Benjamin Trent	297f864884	[ML] relax throttling on expired data cleanup (#56711) (#56895) Throttling nightly cleanup as much as we do has been overcautious. Nightly cleanup should be more lenient in its throttling. We still keep the same batch size, but now the requests per second scale with the number of data nodes. If we have more than 5 data nodes, we don't throttle at all. Additionally, the API now has `requests_per_second` and `timeout` set. So users calling the API directly can set the throttling. This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`. This will allow users to adjust throttling of the nightly maintenance.	2020-05-18 08:46:42 -04:00
Dimitris Athanasiou	54d3cc74ec	[7.x][ML] Ensure class is represented when its cardinality is low (#56783) (#56829) In DF analytics classification, it is possible to use no samples of a class if its cardinality is too low. This commit fixes this by ensuring the target sample count can never be zero. Backport of #56783	2020-05-15 20:52:06 +03:00
David Roberts	270a23e422	[TEST] Fix log tail mocking in native process unit tests (#56804) This is a followup to #56632. Tests that had to be changed to mock the C++ log handler more accurately need to be more careful about when that stream ends, as ending of that stream is used to detect crashes in the production system. Fixes #56796	2020-05-15 12:46:37 +01:00
Dimitris Athanasiou	ac5902624c	[7.x][ML] Improve error upon DF analytics mappings conflict (#56700) (#56776) Adds the conflicting types and an example of an index which specifies them in order to make it easier for the user to understand the conflict. Backport of #56700	2020-05-14 19:16:10 +03:00
David Roberts	3051c37f92	[ML] Tail the C++ logging pipe before connecting other pipes (#56701) Prior to this change the named pipes that connect the ML C++ processes to the Elasticsearch JVM were all opened before any of them were read from or written to. This created a problem, where if the C++ process logged more messages between opening the log pipe and opening the last pipe to be connected than there was space for in the named pipe's buffer then the C++ process would block. This would mean it never got as far as opening the last named pipe, so the JVM would never get as far as reading from the log pipe, hence a deadlock. This change alters the connection order so that the JVM starts reading from the logging pipe immediately after opening it so that if the C++ process logs messages while opening the other named pipes they are captured in a timely manner and there is no danger of a deadlock. Backport of #56632	2020-05-14 07:10:30 +01:00
Armin Braun	0a879b95d1	Save Bounds Checks in BytesReference (#56577) (#56621) Two spots that allow for some optimization: * We are often creating a composite reference of just a single item in the transport layer => special cased via static constructor to make sure we never do that * Also removed the pointless case of an empty composite bytes ref * `ByteBufferReference` is practically always created from a heap buffer these days so there is no point in dealing with all the bounds checks and extra references to sliced buffers from that and we can just use the underlying array directly	2020-05-12 20:33:45 +02:00
Ryan Ernst	902fc546bd	Migrate remaining ESIntegTestCases to internalClusterTest (#56479) (#56563) This commit migrates the ESIntegTestCase tests in x-pack to the internalClusterTest source set.	2020-05-11 21:06:04 -07:00
zhenxianyimeng	8e96e5c936	Use CollectionUtils.isEmpty where appropriate (#55910) This commit uses the isEmpty utility method for arrays in place of null and greater than zero checks.	2020-05-11 09:55:57 -07:00
Dimitris Athanasiou	44ffa388ac	[7.x][ML] Use non-zero timeout when force stopping DF analytics (#56423) (#56428) We have been using a zero timeout in the case that DF analytics is stopped. This may cause a timeout when we cancel, for example, the reindex task. This commit fixes this by using the default timeout instead. Backport of #56423	2020-05-08 21:12:11 +03:00
David Roberts	9a3924a641	[ML] Adjust list of platforms that have ML native code (#56426) Native code is now available for linux-aarch64. Note that it is _not_ currently supported!	2020-05-08 16:22:45 +01:00
Dimitris Athanasiou	c117ae7a6e	[7.x][ML] Force stopping stopped DF analytics should succeed (#56421) (#56424) Force stopping a DF analytics job whose config exists and that is stopped should succeed. This was broken by #56360. Closes #56414 Backport of #56421	2020-05-08 18:04:24 +03:00
Dimitris Athanasiou	60b1c67409	[7.x][ML] Allow stopping DF analytics whose config is missing (#56360) (#56408) It is possible that the config document for a data frame analytics job is deleted from the config index. If that is the case the user is unable to stop a running job because we attempt to retrieve the config and that will throw. This commit changes that. When the request is forced, we do not expand the requested ids based on the existing configs but from the list of running tasks instead. Backport of #56360	2020-05-08 13:54:44 +03:00
Dimitris Athanasiou	d064eda2b0	[7.x][ML] Ensure phase progress may only increase (#56339) (#56357) Due to multi-threading it is possible that phase progress updates written from the C++ process arrive reordered. We can address this by ensuring that progress may only increase. Closes #56282 Backport of #56339	2020-05-07 19:46:58 +03:00
Przemysław Witek	0cd0ab276e	Introduce Annotation.Builder class and use it to create instances of Annotation class (#56276) (#56286)	2020-05-06 20:47:03 +02:00
Dimitris Athanasiou	011e995165	[7.x][ML] Unmute ClassificationIT.testDependentVariableCardinalityTooHighButWithQueryMakesItWithinRange (#56268) (#56287) Closes #56240	2020-05-06 18:20:46 +03:00
Julie Tibshirani	49de092b38	Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet.	2020-05-05 16:25:36 -07:00
Julie Tibshirani	63062ec7bd	Mute ClassificationIT.testDependentVariableCardinalityTooHighButWithQueryMakesItWithinRange.	2020-05-05 13:48:35 -07:00
Dan Hermann	6674f14fb3	[7.x] Get index includes parent data stream for backing indices (#56238)	2020-05-05 15:43:42 -05:00
Benjamin Trent	e1c5ca421e	[7.x] [ML] lay ground work for handling >1 result indices (#55892) (#56192) * [ML] lay ground work for handling >1 result indices (#55892) This commit removes all but one reference to `getInitialResultsIndexName`. This is to support more than one result index for a single job.	2020-05-05 15:54:08 -04:00
William Brafford	3499fa917c	Deprecated xpack "enable" settings should be no-ops (#55416) (#56167) The following settings are now no-ops: * xpack.flattened.enabled * xpack.logstash.enabled * xpack.rollup.enabled * xpack.slm.enabled * xpack.sql.enabled * xpack.transform.enabled * xpack.vectors.enabled Since these settings no longer need to be checked, we can remove settings parameters from a number of constructors and methods, and do so in this commit. We also update documentation to remove references to these settings.	2020-05-05 10:40:49 -04:00
David Roberts	7aa0daaabd	[7.x][ML] More advanced model snapshot retention options (#56194) This PR implements the following changes to make ML model snapshot retention more flexible in advance of adding a UI for the feature in an upcoming release. - The default for `model_snapshot_retention_days` for new jobs is now 10 instead of 1 - There is a new job setting, `daily_model_snapshot_retention_after_days`, that defaults to 1 for new jobs and `model_snapshot_retention_days` for pre-7.8 jobs - For days that are older than `model_snapshot_retention_days`, all model snapshots are deleted as before - For days that are in between `daily_model_snapshot_retention_after_days` and `model_snapshot_retention_days` all but the first model snapshot for that day are deleted - The `retain` setting of model snapshots is still respected to allow selected model snapshots to be retained indefinitely Backport of #56125	2020-05-05 14:31:58 +01:00
Dimitris Athanasiou	75dadb7a6d	[7.x][ML] Add loss_function to regression (#56118) (#56187) Adds parameters `loss_function` and `loss_function_parameter` to regression. Backport of #56118	2020-05-05 14:59:51 +03:00
Dimitris Athanasiou	6061aa3db4	[7.x][ML] Fix race condition updating reindexing progress (#56135) (#56146) In #55763 I thought I could remove the flag that marks reindexing was finished on a data frame analytics task. However, that exposed a race condition. It is possible that between updating reindexing progress to 100 because we have called `DataFrameAnalyticsManager.startAnalytics()` and a call to the _stats API which updates reindexing progress via the method `DataFrameAnalyticsTask.updateReindexTaskProgress()` we end up overwriting the 100 with a lower progress value. This commit fixes this issue by bringing back the help of an `isReindexingFinished` flag as it was prior to #55763. Closes #56128 Backport of #56135	2020-05-05 10:48:42 +03:00
Martijn van Groningen	2ac32db607	Move includeDataStream flag from IndicesOptions to IndexNameExpressionResolver.Context (#56151) Backport of #56034. Move includeDataStream flag from an IndicesOptions to IndexNameExpressionResolver.Context as a dedicated field that callers to IndexNameExpressionResolver can set. Also alter indices stats api to support data streams. The rollover api uses this api and otherwise rolling over a data stream no longer works. Relates to #53100	2020-05-04 22:38:33 +02:00
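
Note on commit 7aa0daaabd (model snapshot retention): the rules listed in that message can be read as a small selection function. The following is only an illustrative sketch in Python, not the Elasticsearch implementation. The names `Snapshot` and `snapshots_to_delete` are hypothetical, ages are measured from a caller-supplied `now` (the commit message does not say what the reference time is), and retained snapshots are assumed not to count as the "first" snapshot of a day.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a model snapshot document."""
    snapshot_id: str
    timestamp: datetime
    retain: bool = False


def snapshots_to_delete(snapshots, now, retention_days=10, daily_after_days=1):
    """Return ids of snapshots to delete, oldest first.

    Defaults mirror the new-job defaults named in the commit message:
    model_snapshot_retention_days=10 and
    daily_model_snapshot_retention_after_days=1.
    """
    delete_cutoff = now - timedelta(days=retention_days)
    thinning_cutoff = now - timedelta(days=daily_after_days)
    days_with_kept_snapshot = set()
    doomed = []
    for snap in sorted(snapshots, key=lambda s: s.timestamp):
        if snap.retain:
            # The `retain` setting always wins.
            continue
        if snap.timestamp < delete_cutoff:
            # Older than the retention period: delete everything.
            doomed.append(snap.snapshot_id)
        elif snap.timestamp < thinning_cutoff:
            # In the thinning window: keep only the first snapshot of each day.
            day = snap.timestamp.date()
            if day in days_with_kept_snapshot:
                doomed.append(snap.snapshot_id)
            else:
                days_with_kept_snapshot.add(day)
    return doomed
```

With `now` set to 2020-05-05 12:00 UTC, a snapshot from 2020-04-20 is deleted unless it is marked `retain`, the second of two snapshots taken on 2020-05-01 is deleted, and a snapshot from one hour earlier than `now` is left alone.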

865 Commits