OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-15 09:25:40 +00:00

Author	SHA1	Message	Date
Dimitris Athanasiou	e09074d382	[7.x][ML] Fix online updates with custom rules referencing filters (#63057 ) (#63064 ) When an opened anomaly detection job is updated with a detection rule that references a filter, apart from updating the c++ process with the rule, we also need to update it with the referenced filter. This commit fixes a bug which led to the job not applying such updates on-the-fly. Fixes #62948 Backport of #63057	2020-09-30 16:01:06 +03:00
Przemysław Witek	4366d58564	[7.x] [ML] Implement AucRoc metric for classification (#60502 ) (#63051 )	2020-09-30 12:55:52 +02:00
Dimitris Athanasiou	179fe9cc0e	[7.x][ML] Delete dest index and reindex if incompatible (#62960 ) (#63050 ) Data frame analytics results format changed in version `7.10.0`. If existing jobs that were not completed are restarted, it is possible the destination index had already been created. That index's mappings are not suitable for the new results format. This commit checks the version of the destination index and deletes it when the version is outdated. The job will then continue by recreating the destination index and reindexing. Backport of #62960	2020-09-30 12:57:48 +03:00
David Roberts	05427c2bb2	[ML] Add timeouts to named pipe connections (#63022 ) This PR adds timeouts to the named pipe connections of the autodetect, normalize and data_frame_analyzer processes. This argument requires the changes of elastic/ml-cpp#1514 in order to work, so that PR will be merged before this one. (The controller process already had a different mechanism, tied to the ES JVM lifetime.) Backport of #62993	2020-09-29 18:04:02 +01:00
Benjamin Trent	2b9032a07d	[7.x] [ML] fixing testTwoJobsWithSameRandomizeSeedUseSameTrainingSet tests (#62976 ) (#62999 ) * [ML] fixing testTwoJobsWithSameRandomizeSeedUseSameTrainingSet tests (#62976) This fixes the two test failures. The shard failure seems to be due to the .ml-stats index being in the middle of being created.	2020-09-29 08:12:20 -04:00
Dimitris Athanasiou	7f6c1ff5b4	[7.x][ML] Remove top level importance from classification inference results (#62486 ) (#62964 ) As we have decided top level importance for classification is not useful, it has been removed from the results from the training job. This commit also removes them from inference. Backport of #62486	2020-09-29 10:58:48 +03:00
Benjamin Trent	a054e62bc4	[ML] allow datafeeds to run if there are any concrete indices (#62827 ) (#62965 ) This commit allows a datafeed to be assigned to a node if only one index pattern has concrete indices.	2020-09-28 12:58:07 -04:00
Benjamin Trent	c56424f740	[ML] write deprecation warning when include_model_definition parameter is used (#62834 ) (#62885 ) For get trained models, include_model_definition is now deprecated. This commit writes a deprecation warning if that parameter is used and suggests that the caller use the replacement.	2020-09-24 11:38:54 -04:00
Daniel Mitterdorfer	d2166030d1	Mute failing test case in DeleteExpiredDataIT (#62870 ) (#62871 ) Relates #62699	2020-09-24 15:42:52 +02:00
Dimitris Athanasiou	7de5201291	[7.x][ML] Handle data frame analytics state spreading over multiple docs (#62564 ) (#62824 ) When state persistence was first implemented for data frame analytics we had the assumption that state would always fit in a single document. However this is not the case any more. This commit adds handling of state that spreads over multiple documents. Backport of #62564	2020-09-23 16:16:34 +03:00
Dimitris Athanasiou	69e72656fa	[7.x][ML] Reset reindexing progress when DFA job resumes with incomplete reindexing (#62772 ) (#62816 ) This fixes reindexing progress in the scenario when a DFA job that had not finished reindexing is resumed (either because the user called stop and start or because the job was reassigned in the middle of reindexing). Before the fix, reindexing progress stayed at the value it had previously reached until it surpassed that value. When we resume a data frame analytics job we want to preserve reindexing progress and reset all other phases, except when reindexing was not completed. In that case we delete the destination index and start reindexing from scratch, so we need to reset reindexing progress too. Backport of #62772	2020-09-23 14:09:04 +03:00
Benjamin Trent	77bfb32635	[7.x] [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694 ) (#62784 ) * [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694) * [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls Global parameters, outside of the global index, are ignored for internal callers in certain cases. If the internal caller is adding requests via the following methods: ``` - BulkRequest#add(IndexRequest) - BulkRequest#add(UpdateRequest) - BulkRequest#add(DocWriteRequest) - BulkRequest#add(DocWriteRequest[]) ``` it is better to specifically set the desired parameters on the requests before they are added to the bulk request object. This commit addresses this issue for the ML plugin. * unmuting test	2020-09-22 15:07:08 -04:00
Nik Everett	39a617773d	Rename grok's built-in patterns (backport of #62735 ) (#62765 ) This reworks the code around grok's built-in patterns to name things more like the rest of the code. It's not a big deal, but I'm just more used to having `public static final` constants in SHOUTING_SNAKE_CASE.	2020-09-22 13:06:43 -04:00
Andrei Dan	0be89bcd7f	Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet (#62763 )	2020-09-22 13:43:15 +01:00
Benjamin Trent	0f142c6afc	[ML] allow multiple wildcard values for GET Calendars, Events, and DELETE forecasts (#62563 ) (#62629 ) This commit adjusts the following APIs so that they now support not only an `_all` case, but wildcard-patterned IDs as well. - `GET _ml/calendars/<calendar_id>/events` - `GET _ml/calendars/<calendar_id>` - `GET _ml/anomaly_detectors/<job_id>/model_snapshots/<snapshot_id>` - `DELETE _ml/anomaly_detectors/<job_id>/_forecast/<forecast_id>`	2020-09-18 11:06:07 -04:00
Benjamin Trent	e163559e4c	[7.x] [ML] Add new include flag to GET inference/<model_id> API for model training metadata (#61922 ) (#62620 ) * [ML] Add new include flag to GET inference/<model_id> API for model training metadata (#61922) Adds new flag include to the get trained models API The flag initially has two valid values: definition, total_feature_importance. Consequently, the old include_model_definition flag is now deprecated. When total_feature_importance is included, the total_feature_importance field is included in the model metadata object. Including definition is the same as previously setting include_model_definition=true. * fixing test * Update x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/ml/action/GetTrainedModelsRequestTests.java	2020-09-18 10:07:35 -04:00
Ignacio Vera	6a3d731be1	Only call reduce on a single InternalAggregation when needed (#62525 ) (#62594 ) Adds a new abstract method in InternalAggregation that flags the framework if it needs to reduce on a single InternalAggregation.	2020-09-18 08:43:58 +02:00
Jake Landis	5b7246157f	[7.x] Fix projects that failed to build within Intellij (#62258 ) (#62408 ) This commit addresses some build failures from the perspective of Intellij. These changes include: * changing the order of a dependency definition that seems to cause the Intellij build to fail. * introduction of an abstract class out of the test source set (there seems to be an issue sharing classes across projects with non-standard source sets). * a couple of missing dependency definitions (not sure how the command line worked prior to this)	2020-09-17 17:45:12 -05:00
Dimitris Athanasiou	7118ff7976	[7.x][ML] Remove model snapshot legacy doc ids (#62434 ) (#62569 ) Removes methods that were no longer used regarding version 5.4 doc ids of ModelState. Also adds clean up of 5.4 model state and quantile docs in the daily maintenance. Backport of #62434	2020-09-17 23:43:28 +03:00
Dimitris Athanasiou	f5c28e2054	[7.x][ML] Do not start data frame analytics when too many docs are analyzed (#62547 ) (#62558 ) The data frame structure in c++ has a limit of 2^32 documents. This commit adds a check that the number of documents involved in the analysis is less than that and fails to start otherwise. That saves the cost of reindexing when it is unnecessary. Backport of #62547	2020-09-17 19:06:38 +03:00
David Kyle	417ce9396d	[ML] Add datafeed run time fields integration test (#62535 ) (#62538 )	2020-09-17 13:41:07 +01:00
Benjamin Trent	341eeae6e7	[ML] fixes testWatchdog test verifying matcher is interrupted on timeout (#62391 ) (#62447 ) Constructing the timeout checker FIRST and THEN registering the watcher allows the test to have a race condition. The timeout value could be reached BEFORE the matcher is added. To prevent the matcher never being interrupted, a new timedOut value is added to the watcher thread entry. Then when a new matcher is registered, if the thread had previously timed out, we interrupt the matcher immediately. closes #48861	2020-09-16 09:13:22 -04:00
Benjamin Trent	8d89a28126	[ML] unmuting test for testTooManyPartitions memory check on windows (#62393 ) (#62405 ) This commit unmutes the windows check for testTooManyPartitions test. The assertion has since changed to include a soft_limit check. This coupled with changes over the past years means the test should be enabled again. related to: #32033	2020-09-16 07:03:10 -04:00
David Roberts	e4275f3749	[ML] Use utility thread pool for memory estimation (#62314 ) The job comms thread pool is intended for the long-running job processes that do anomaly detection or data frame analytics and count towards job count and memory limits. This commit moves the short-lived memory estimation processes to the ML utility thread pool. Although this doesn't matter in most cases, at the limits of scale it could mean that memory estimations would get in the way of starting jobs, or would queue up for an excessive period of time while waiting for jobs to finish.	2020-09-14 16:47:12 +01:00
David Roberts	d8288526d9	[ML] Add null checks for C++ log handler (#62238 ) It has been observed that if the normalizer process fails to connect to the JVM then this causes a null pointer exception as the JVM tries to close the native process object. The accessors and close methods of the native process class that access the C++ log handler should not assume that it connected correctly.	2020-09-14 11:28:26 +01:00
David Roberts	969a1c558b	[ML] Include the "properties" layer in find_file_structure mappings (#62158 ) Previously the "mappings" field of the response from the find_file_structure endpoint was not a drop-in for the mappings format of the create index endpoint - the "properties" layer was missing. The reason for omitting it initially was that the assumption was that the find_file_structure endpoint would only ever return very simple mappings without any nested objects. However, this will not be true in the future, as we will improve mappings detection for complex JSON objects. As a first step it makes sense to move the returned mappings closer to the standard format. This is a small building block towards fixing #55616	2020-09-10 09:33:42 +01:00
Jake Landis	d8dad9ab2c	[7.x] Remove integTest task from PluginBuildPlugin (#61879 ) (#62135 ) This commit removes `integTest` task from all es-plugins. Most relevant projects have been converted to use yamlRestTest, javaRestTest, or internalClusterTest in prior PRs. A few projects needed to be adjusted to allow complete removal of this task * x-pack/plugin - converted to use yamlRestTest and javaRestTest * plugins/repository-hdfs - kept the integTest task, but use `rest-test` plugin to define the task * qa/die-with-dignity - convert to javaRestTest * x-pack/qa/security-example-spi-extension - convert to javaRestTest * multiple projects - remove the integTest.enabled = false (yay!) related: #61802 related: #60630 related: #59444 related: #59089 related: #56841 related: #59939 related: #55896	2020-09-09 14:25:41 -05:00
Benjamin Trent	e181e24d48	[ML] only persist progress if it has changed (#62123 ) (#62180 ) * [ML] only persist progress if it has changed We already search for the previously stored progress document. For optimization purposes, and to prevent restoring the same progress after a failed analytics job is stopped, this commit does an equality check between the previously stored progress and the current progress. If the progress has changed, persistence continues as normal.	2020-09-09 12:04:09 -04:00
Benjamin Trent	057bf3f7d5	[ML] setting require_alias to previous value on bulk index retry (#62103 ) (#62108 ) Previous work has been done to prevent automatically creating a concrete index when an alias is desired. This commit addresses a path where this check was not being done. relates: #62064	2020-09-08 11:38:32 -04:00
David Roberts	b2636678b2	[ML] Add support for date_nanos fields in find_file_structure (#62048 ) Now that #61324 is merged it is possible for the find_file_structure endpoint to suggest using date_nanos fields for timestamps where the timestamp format provides greater than millisecond accuracy.	2020-09-08 13:05:09 +01:00
David Kyle	a5b24bf44c	Mute ClassificationIT (#62063 ) testWithOnlyTrainingRowsAndTrainingPercentIsFifty_DependentVariableIsBoolean For #60759	2020-09-07 16:10:48 +01:00
Dimitris Athanasiou	d37f197efd	[7.x][ML] Allow training_percent to be any positive double up to hundred (#61977 ) (#61990 ) This changes the valid range of `training_percent` for regression and classification from [1, 100] to (0, 100]. Backport of #61977	2020-09-04 17:34:14 +03:00
Benjamin Trent	cec102a391	[7.x] [ML] adds new n_gram_encoding custom processor (#61578 ) (#61935 ) * [ML] adds new n_gram_encoding custom processor (#61578) This adds a new `n_gram_encoding` feature processor for analytics and inference. The focus of this processor is simple ngram encodings that allow: - multiple ngrams [1..5] - Prefix, infix, suffix	2020-09-04 08:36:50 -04:00
Dimitris Athanasiou	bdccab7c7a	[7.x][ML] Add incremental id during data frame analytics reindexing (#61943 ) (#61971 ) Previously, we added a copy of the `_id` during reindexing and sorted the destination index on that. This allowed us to traverse the docs in the destination index in a stable order multiple times and with efficiency. However, the destination index being sorted means we cannot have `nested` typed fields. This is a problem as it does not allow us to provide a good experience with our evaluate API when it comes to computing metrics for specific classes, features, etc. This commit changes the approach so that the destination index allows nested fields. Instead of adding a copy of the `_id` field, we now add an incremental id that we can use to traverse the docs in a stable order. We also ensure we always assign the same incremental id to the same doc from the source indices by sorting on `_seq_no` during reindexing. That, in combination with the reindexing API using scroll, gives us a stable order as scroll uses the (`_index`, `_doc`, shard_id) tuple to resolve ties. The extractor now does not need to scroll. Instead we sort on the incremental id and we do ranged searches to avoid the sort-all-docs overhead. Finally, the `TestDocsIterator` is simply changed to search_after the incremental id. With these changes data frame analytics jobs do not use scroll at any point. Having all these in place, the commit adds the `nested` types to the necessary fields of `classification` and `regression` analyses results. Backport of #61943	2020-09-04 13:24:42 +03:00
Tanguy Leroux	c90ee32cdc	Mute ClassificationIT.testTooLowConfiguredMemoryStillStarts (#61915 ) Relates #61913	2020-09-03 15:52:01 +02:00
Dimitris Athanasiou	ec405978fc	[7.x][ML] Update reindexing task progress before persisting job progress (#61868 ) (#61875 ) This fixes a bug introduced by #61782. In that PR I thought I could simplify the persistence of progress by using the progress straight from the stats holder in the task instead of calling the get stats action. However, I overlooked that it is then possible to have stale progress for the reindexing task as that is only updated when the get stats API is called. In this commit this is fixed by updating reindexing task progress before persisting the job progress. This seems to be much more lightweight than calling the get stats request. Closes #61852 Backport of #61868	2020-09-02 21:44:18 +03:00
Benjamin Trent	c22415c241	[7.x] [ML] unmute testTooLowConfiguredMemoryStillStarts (#61846 ) (#61869 ) * [ML] unmute testTooLowConfiguredMemoryStillStarts (#61846) Native PR addresses this test failure: https://github.com/elastic/ml-cpp/pull/1465 closes https://github.com/elastic/elasticsearch/issues/61704 closes https://github.com/elastic/elasticsearch/issues/61561	2020-09-02 13:23:23 -04:00
Jake Landis	794aac717d	[7.x] Convert first 1/2 x-pack plugins from integTest to [yaml \| java]RestTest or internalClusterTest (#60630 ) (#61855 ) For 1/2 the plugins in x-pack, the integTest task is now a no-op and all of the tests are now executed via a test, yamlRestTest, javaRestTest, or internalClusterTest. This includes the following projects: async-search, autoscaling, ccr, enrich, eql, frozen-indices, data-streams, graph, ilm, mapper-constant-keyword, mapper-flattened, ml. A few of the more specialized qa projects within these plugins have not been changed with this PR due to additional complexity which should be addressed separately. A follow up PR will address the remaining x-pack plugins (this PR is big enough as-is). related: #61802 related: #56841 related: #59939 related: #55896	2020-09-02 11:19:24 -05:00
Dimitris Athanasiou	07ab0beea0	[7.x][ML] Improve handling of exception while starting DFA process (#61838 ) (#61847 ) While starting the data frame analytics process it is possible to get an exception before the process crash handler is in place. In addition, right after starting the process, we check the process is alive to ensure we capture a failed process. However, those exceptions are unhandled. This commit catches any exception thrown while starting the process and sets the task to failed with the root cause error message. I have also taken the chance to remove some unused parameters in `NativeAnalyticsProcessFactory`. Relates #61704 Backport of #61838	2020-09-02 16:32:45 +03:00
David Kyle	d268540f20	[ML] Check and install the latest template in the DFA executor (#61589 ) (#61842 ) During a rolling upgrade it is possible that a worker node will be upgraded before the master in which case the DFA templates will not have been installed. Before a DFA task starts check that the latest template is installed and install it if necessary.	2020-09-02 12:16:29 +01:00
Nik Everett	f8158bdb2d	Skip failing test Tracked by https://github.com/elastic/elasticsearch/issues/61561	2020-09-01 13:44:31 -04:00
Dimitris Athanasiou	2547cfbe54	[7.x][ML] Persist progress when setting DFA task to failed (#61782 ) (#61792 ) When an error occurs and we set the task to failed via the `DataFrameAnalyticsTask.setFailed` method we do not persist progress. If the job is later restarted, this means we do not correctly restore from where we can but instead we start the job from scratch and have to redo the reindexing phase. This commit solves this bug by persisting the progress before setting the task to failed. Backport of #61782	2020-09-01 18:33:07 +03:00
Benjamin Trent	7dabaad7d9	[ML] refactor ml job node selection into its own class (#61521 ) (#61747 ) This is a minor refactor where the job node load logic (node availability, etc.) is refactored into its own class. This will allow future things (i.e. autoscaling decisions) to use the same node load detection class. backport of #61521	2020-08-31 14:00:23 -04:00
Henning Andersen	4c9fe31da8	Mute testTooLowConfiguredMemoryStillStarts (#61705 ) Related to #61704	2020-08-31 11:19:53 +02:00
David Kyle	49a5afc6c1	[ML] Increase wait for templates timeout in tests (#61623 ) (#61628 )	2020-08-27 12:57:12 +01:00
David Kyle	25e811ced7	Rewrite Inference yml tests for better clean up (#61180 ) (#61555 ) Inference processors asynchronously write usage stats to the .ml-stats index after they are used. In tests the write can leak into the next test, causing failures depending on which test follows. This change waits for the usage stats docs to be written at the end of the test.	2020-08-27 11:16:26 +01:00
Dimitris Athanasiou	3ed65eb418	[7.x][ML] Recover data frame extraction search from latest sort key (#61544 ) (#61572 ) If a search failure occurs during data frame extraction we catch the error and retry once. However, we retry another search that is identical to the first one. This means we will re-fetch any docs that were already processed. This may result either in training a model using duplicate data or, in the case of outlier detection, in an error message that the process received more records than it expected. This commit fixes this issue by tracking the latest doc's sort key and then using that in a range query in case we restart the search due to a failure. Backport of #61544 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-08-26 17:54:00 +03:00
Benjamin Trent	a6e7a3d65f	[7.x] [ML] write warning if configured memory limit is too low for analytics job (#61505 ) (#61528 ) Backports the following commits to 7.x: [ML] write warning if configured memory limit is too low for analytics job (#61505) Having `_start` fail when the configured memory limit is too low can be frustrating. We should instead warn the user that their job might not run properly if their configured limit is too low. It might be that our estimate is too high, and their configured limit works just fine.	2020-08-26 10:35:38 -04:00
Przemyslaw Gomulka	9f566644af	Do not create two loggers for DeprecationLogger backport(#58435 ) (#61530 ) DeprecationLogger's constructor should not create two loggers. It was taking parent logger instance, changing its name with a .deprecation prefix and creating a new logger. Most of the time parent logger was not needed. It was causing Log4j to unnecessarily cache the unused parent logger instance. depends on #61515 backports #58435	2020-08-26 16:04:02 +02:00
Przemysław Witek	11c2710e7f	[7.x] [ML] Do not mark the DFA job as FAILED when a failure occurs after the node is shutdown (#61331 ) (#61526 )	2020-08-26 09:53:13 +02:00
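
Note on 3ed65eb418 (recover data frame extraction search from latest sort key): the idea described in that commit message can be sketched roughly as follows. This is an illustrative Python sketch only, not the Java code from the commit. The names `extract_all` and `make_flaky_search`, the `(sort_key, payload)` document shape, and the single overall retry budget are all assumptions made for the example.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical document shape: (sort key, payload).
Doc = Tuple[int, str]


def extract_all(
    search: Callable[[Optional[int], int], List[Doc]],
    batch_size: int = 2,
    max_retries: int = 1,
) -> Iterator[Doc]:
    """Yield every doc once, resuming after the last seen sort key on failure."""
    last_key: Optional[int] = None  # sort key of the latest doc handed out
    retries = 0
    while True:
        try:
            # Equivalent of a range query: only docs with sort key > last_key.
            batch = search(last_key, batch_size)
        except RuntimeError:
            if retries >= max_retries:
                raise
            retries += 1
            continue  # retry from last_key, not from the beginning
        if not batch:
            return
        for doc in batch:
            last_key = doc[0]
            yield doc


def make_flaky_search(docs: List[Doc], fail_on_call: int):
    """In-memory stand-in for a search that fails on one specific call."""
    calls = {"n": 0}

    def search(after_key: Optional[int], size: int) -> List[Doc]:
        calls["n"] += 1
        if calls["n"] == fail_on_call:
            raise RuntimeError("simulated search failure")
        remaining = [d for d in docs if after_key is None or d[0] > after_key]
        return remaining[:size]

    return search


docs = [(i, f"doc-{i}") for i in range(5)]
result = list(extract_all(make_flaky_search(docs, fail_on_call=2)))
```

Because the retry restarts from the tracked key rather than repeating the original search, docs that were already processed before the failure are not fetched a second time.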


1032 Commits