OpenSearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	2016e23285	[ML] Refactor common utils out of ML plugin to XPack.Core (#39976 ) (#40009 ) * [ML] Refactor common utils out of ML plugin to XPack.Core * implementing GET filters with abstract transport * removing added rest param * adjusting how defaults can be supplied	2019-03-13 17:08:43 -05:00
Dimitris Athanasiou	79e414df86	[ML] Fix datafeed skipping first bucket after lookback when aggs are … (#39859 ) (#39958 ) The problem here was that `DatafeedJob` was updating the last end time searched based on the `now` even though when there are aggregations, the extractor will only search up to the floor of `now` against the histogram interval. This commit fixes the issue by using the end time as calculated by the extractor. It also adds an integration test that uses aggregations. This test would fail before this fix. Unfortunately the test is slow as we need to wait for the datafeed to work in real time. Closes #39842	2019-03-13 09:09:07 +02:00
David Kyle	48788269b0	[ML] Correct small inconsistencies in ml APIs spec and docs (#39907 )	2019-03-11 14:02:50 +00:00
Benjamin Trent	4da04616c9	[ML] refactoring lazy query and agg parsing (#39776 ) (#39881 ) * [ML] refactoring lazy query and agg parsing * Clean up and addressing PR comments * removing unnecessary try/catch block * removing bad call to logger * removing unused import * fixing bwc test failure due to serialization and config migrator test * fixing style issues * Adjusting DafafeedUpdate class serialization * Adding todo for refactor in v8 * Making query non-optional so it does not write a boolean byte	2019-03-10 14:54:02 -05:00
David Roberts	5f8f91c03b	[ML] Use scaling thread pool and xpack.ml.max_open_jobs cluster-wide dynamic (#39736 ) This change does the following: 1. Makes the per-node setting xpack.ml.max_open_jobs into a cluster-wide dynamic setting 2. Changes the job node selection to continue to use the per-node attributes storing the maximum number of open jobs if any node in the cluster is older than 7.1, and use the dynamic cluster-wide setting if all nodes are on 7.1 or later 3. Changes the docs to reflect this 4. Changes the thread pools for native process communication from fixed size to scaling, to support the dynamic nature of xpack.ml.max_open_jobs 5. Renames the autodetect thread pool to the job comms thread pool to make clear that it will be used for other types of ML jobs (data frame analytics in particular) Backport of #39320	2019-03-06 12:29:34 +00:00
Dimitris Athanasiou	5c023770d2	[ML] Disable security audit trail in native integ tests suite (#39683 ) Investigating how to make DeleteExpiredDataIT faster, it was revealed that the security audit trail threads were quite hot. Disabling that seems to be helping quite a bit with making this test faster. This commit also unmutes the test to see how it goes with the audit trail disabled. Relates #39658 Closes #39575	2019-03-05 12:43:15 +02:00
David Kyle	a58145f9e6	[ML] Transition to typeless (mapping) APIs (#39573 ) ML has historically used doc as the single mapping type but reindex in 7.x will change the mapping to _doc. Switching to the typeless APIs handles case where the mapping type is either doc or _doc. This change removes deprecated typed usages.	2019-03-04 13:52:05 +00:00
David Roberts	085ff38122	Mute DeleteExpiredDataIT.testDeleteExpiredData Due to https://github.com/elastic/elasticsearch/issues/39575	2019-03-03 18:34:30 +00:00
Dimitris Athanasiou	8843832039	[ML] Shave off DeleteExpiredDataIT runtime (#39557 ) This commit parallelizes some parts of the test and removes an unnecessary refresh call. On my local machine it shaves off about 15 seconds for a test execution time of ~64s (down from ~80s). This test is still slow but progress over perfection. Relates #37339	2019-03-01 19:10:00 +02:00
Dimitris Athanasiou	8122650a55	[ML] Add integration test for interim results after advancing bucket (#39447 ) This is an integration test that captures the issue described in elastic/ml-cpp#324	2019-02-28 11:12:08 +02:00
Mehran Koushkebaghi	1d0097b5e8	[ML] Refactoring scheduled event to store instant instead of zoned time zone (#39380 ) The ScheduledEvent class has never preserved the time zone so it makes more sense for it to store the start and end time using Instant rather than ZonedDateTime. Closes #38620	2019-02-27 09:27:04 +00:00
David Roberts	4f2bd238d2	[ML] Increase datafeed integration test timeout for slow machines (#39311 ) The assertBusy() that waits the default 10 seconds for a datafeed to complete very occasionally times out on slow machines. This commit increases the timeout to 60 seconds. It will almost never actually take this long, but it's better to have a timeout that will prevent time being wasted looking at spurious test failures.	2019-02-22 15:35:32 +00:00
Dimitris Athanasiou	1c6818fe74	[ML] Improve DeleteExpiredDataIT failure message (#39298 ) (#39310 ) This test failed once in a very long time with the assertion that there is no document for the `non_existing_job` in the state index. I could not see how that is possible and I cannot reproduce. With this commit the failure message will reveal some examples of the left behind docs which might shed light on what could go wrong.	2019-02-22 16:15:11 +02:00
Benjamin Trent	109b6451fd	ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs (#38822 ) (#39119 ) * ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs * Addressing pr feedback	2019-02-19 12:45:56 -06:00
David Roberts	35e30b34f9	[ML] Stop the ML memory tracker before closing node (#39111 ) The ML memory tracker does searches against ML results and config indices. These searches can be asynchronous, and if they are running while the node is closing then they can cause problems for other components. This change adds a stop() method to the MlMemoryTracker that waits for in-flight searches to complete. Once stop() has returned the MlMemoryTracker will not kick off any new searches. The MlLifeCycleService now calls MlMemoryTracker.stop() before stopping the node. Fixes #37117	2019-02-19 15:12:40 +00:00
David Roberts	bbcdea43c5	[ML] Allow stop unassigned datafeed and relax unset upgrade mode wait (#39034 ) These two changes are interlinked. Before this change unsetting ML upgrade mode would wait for all datafeeds to be assigned and not waiting for their corresponding jobs to initialise. However, this could be inappropriate, if there was a reason other than upgrade mode why one job was unable to be assigned or slow to start up. Unsetting of upgrade mode would hang in this case. This change relaxes the condition for considering upgrade mode to be unset to simply that an assignment attempt has been made for each ML persistent task that did not fail because upgrade mode was enabled. Thus after unsetting upgrade mode there is no guarantee that every ML persistent task is assigned, just that each is not unassigned due to upgrade mode. In order to make setting upgrade mode work immediately after unsetting upgrade mode it was then also necessary to make it possible to stop a datafeed that was not assigned. There was no particularly good reason why this was not allowed in the past. It is trivial to stop an unassigned datafeed because it just involves removing the persistent task.	2019-02-19 14:07:10 +00:00
David Roberts	b660d2cac6	[ML] More advanced post-test cleanup of ML indices (#39049 ) The .ml-annotations index is created asynchronously when some other ML index exists. This can interfere with the post-test index deletion, as the .ml-annotations index can be created after all other indices have been deleted. This change adds an ML specific post-test cleanup step that runs before the main cleanup and: 1. Checks if any ML indices exist 2. If so, waits for the .ml-annotations index to exist 3. Deletes the other ML indices found in step 1. 4. Calls the super class cleanup This means that by the time the main post-test index cleanup code runs: 1. The only ML index it has to delete will be the .ml-annotations index 2. No other ML indices will exist that could trigger recreation of the .ml-annotations index Fixes #38952	2019-02-18 14:16:03 +00:00
Martijn Laarman	9b4d96534b	Fix #38623 remove xpack namespace REST API (#38625 ) (#39036 ) * Fix #38623 remove xpack namespace REST API Except for xpack.usage and xpack.info API's, this moves the last remaining API's out of the xpack namespace * rename xpack api's inside the files as well * updated yaml tests references to xpack namespaces api's * update callsApi calls in the IT subclasses * make sure docs testing does not use xpack namespaced api's * fix leftover xpack namespaced method names in docs/build.gradle * found another leftover reference (cherry picked from commit ccb5d934363c37506b76119ac050a254fa80b5e7)	2019-02-18 12:40:07 +01:00
Dimitris Athanasiou	21f76aba28	[ML] Extract base class for integ tests with native processes (#38850 ) (#38860 )	2019-02-14 12:15:00 +02:00
Benjamin Trent	d2ac05e249	ML allow aliased .ml-anomalies* index on PUT Job (#38821 ) (#38847 )	2019-02-13 10:58:55 -06:00
Benjamin Trent	24a8ea06f5	ML: update set_upgrade_mode, add logging (#38372 ) (#38538 ) * ML: update set_upgrade_mode, add logging * Attempt to fix datafeed isolation Also renamed a few methods/variables for clarity and added some comments	2019-02-08 12:56:04 -06:00
Boaz Leskes	033ba725af	Remove support for internal versioning for concurrency control (#38254 ) Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometimes do the wrong thing. Here's the relevant excerpt from the resiliency status page: > When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach. Relates to #1078	2019-02-05 20:53:35 +01:00
David Turner	f2dd5dd6eb	Remove DiscoveryPlugin#getDiscoveryTypes (#38414 ) With this change we no longer support pluggable discovery implementations. No known implementations of `DiscoveryPlugin` actually override this method, so in practice this should have no effect on the wider world. However, we were using this rather extensively in tests to provide the `test-zen` discovery type. We no longer need a separate discovery type for tests as we no longer need to customise its behaviour. Relates #38410	2019-02-05 17:42:24 +00:00
David Roberts	92bc681705	[ML] Report index unavailable instead of waiting for lazy node (#38423 ) If a job cannot be assigned to a node because an index it requires is unavailable and there are lazy ML nodes then index unavailable should be reported as the assignment explanation rather than waiting for a lazy ML node.	2019-02-05 16:10:00 +00:00
Yogesh Gaikwad	fe36861ada	Add support for API keys to access Elasticsearch (#38291 ) X-Pack security supports built-in authentication service `token-service` that allows access tokens to be used to access Elasticsearch without using Basic authentication. The tokens are generated by `token-service` based on OAuth2 spec. The access token is a short-lived token (defaults to 20m) and refresh token with a lifetime of 24 hours, making them unsuitable for long-lived or recurring tasks where the system might go offline thereby failing refresh of tokens. This commit introduces a built-in authentication service `api-key-service` that adds support for long-lived tokens aka API keys to access Elasticsearch. The `api-key-service` is consulted after `token-service` in the authentication chain. By default, if TLS is enabled then `api-key-service` is also enabled. The service can be disabled using the configuration setting. The API keys:- - by default do not have an expiration but expiration can be configured where the API keys need to be expired after a certain amount of time. - when generated will keep authentication information of the user that generated them. - can be defined with a role describing the privileges for accessing Elasticsearch and will be limited by the role of the user that generated them - can be invalidated via invalidation API - information can be retrieved via a get API - that have been expired or invalidated will be retained for 1 week before being deleted. The expired API keys remover task handles this. Following are the API key management APIs:- 1. Create API Key - `PUT/POST /_security/api_key` 2. Get API key(s) - `GET /_security/api_key` 3. Invalidate API Key(s) `DELETE /_security/api_key` The API keys can be used to access Elasticsearch using `Authorization` header, where the auth scheme is `ApiKey` and the credentials are the base64 encoding of API key Id and API key separated by a colon.
Example:- ``` curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health ``` Closes #34383	2019-02-05 14:21:57 +11:00
David Roberts	fb6a176caf	[ML] Add explanation so far to file structure finder exceptions (#38191 ) The explanation so far can be invaluable for troubleshooting as incorrect decisions made early on in the structure analysis can result in seemingly crazy decisions or timeouts later on. Relates elastic/kibana#29821	2019-02-04 14:32:35 +00:00
Boaz Leskes	ff13a43144	Move ML Optimistic Concurrency Control to Seq No (#38278 ) This commit moves the usage of internal versioning for CAS operations to use sequence numbers and primary terms Relates to #36148 Relates to #10708	2019-02-04 10:41:08 +01:00
David Turner	1d82a6d9f9	Deprecate unused Zen1 settings (#38289 ) Today the following settings in the `discovery.zen` namespace are still used: - `discovery.zen.no_master_block` - `discovery.zen.hosts_provider` - `discovery.zen.ping.unicast.concurrent_connects` - `discovery.zen.ping.unicast.hosts.resolve_timeout` - `discovery.zen.ping.unicast.hosts` This commit deprecates all other settings in this namespace so that they can be removed in the next major version.	2019-02-04 08:52:08 +00:00
Benjamin Trent	5db305023d	ML: Fix error race condition on stop _all datafeeds and close _all jobs (#38113 ) * ML: Ignore when task is not found for _all * Addressing PR comments * Update TransportStopDatafeedAction.java	2019-02-01 11:16:35 -06:00
David Roberts	1fa413a16d	[ML] Remove "8" prefixes from file structure finder timestamp formats (#38016 ) In 7.x Java timestamp formats are the default timestamp format and there is no need to prefix them with "8". (The "8" prefix was used in 6.7 to distinguish Java timestamp formats from Joda timestamp formats.) This change removes the "8" prefixes from timestamp formats in the output of the ML file structure finder.	2019-02-01 15:36:04 +00:00
Benjamin Trent	be381b4525	ML: better handle task state race condition (#38040 )	2019-01-31 11:07:54 -06:00
Henning Andersen	68ed72b923	Handle scheduler exceptions (#38014 ) Scheduler.schedule(...) would previously assume that caller handles exception by calling get() on the returned ScheduledFuture. schedule() now returns a ScheduledCancellable that no longer gives access to the exception. Instead, any exception thrown out of a scheduled Runnable is logged as a warning. This is a continuation of #28667, #36137 and also fixes #37708.	2019-01-31 17:51:45 +01:00
Benjamin Trent	9782aaa1b8	ML: Add reason field in JobTaskState (#38029 ) * ML: adding reason to job failure status * marking reason as nullable * Update AutodetectProcessManager.java	2019-01-30 11:56:24 -06:00
Benjamin Trent	8280a20664	ML: Add upgrade mode docs, hlrc, and fix bug (#37942 ) * ML: Add upgrade mode docs, hlrc, and fix bug * [DOCS] Fixes build error and edits text * adjusting docs * Update docs/reference/ml/apis/set-upgrade-mode.asciidoc Co-Authored-By: benwtrent <ben.w.trent@gmail.com> * Update set-upgrade-mode.asciidoc * Update set-upgrade-mode.asciidoc	2019-01-30 06:51:11 -06:00
Adrien Grand	c8af0f4bfa	Use mappings to format doc-value fields by default. (#30831 ) Doc-value fields now return a value that is based on the mappings rather than the script implementation by default. This deprecates the special `use_field_mapping` docvalue format which was added in #29639 only to ease the transition to 7.x and it is not necessary anymore in 7.0.	2019-01-30 10:31:51 +01:00
Benjamin Trent	34d61d3231	ML: ignore unknown fields for JobTaskState (#37982 )	2019-01-29 12:51:34 -06:00
David Kyle	6d1693ff49	[ML] Prevent submit after autodetect worker is stopped (#37700 ) Runnables can be submitted to AutodetectProcessManager.AutodetectWorkerExecutorService without error after it has been shut down which can lead to requests timing out as their handlers are never called by the terminated executor. This change throws an EsRejectedExecutionException if a runnable is submitted after the shutdown and calls AbstractRunnable.onRejection on any tasks not run. Closes #37108	2019-01-29 15:09:40 +00:00
Henrique Gonçalves	eceb3185c7	[ML] Make GetJobStats work with arbitrary wildcards and groups (#36683 ) The /_ml/anomaly_detectors/{job}/_stats endpoint now works correctly when {job} is a wildcard or job group. Closes #34745	2019-01-29 09:06:50 +00:00
Dimitris Athanasiou	ebe9c95230	[ML] Audit all errors during job deletion (#37933 ) This commit moves the auditing of job deletion related errors to the final listener in the job delete action. This ensures any error that occurs during job deletion is audited.	2019-01-29 10:23:50 +02:00
Benjamin Trent	7e4c0e6991	ML: Adds set_upgrade_mode API endpoint (#37837 ) * ML: Add MlMetadata.upgrade_mode and API * Adding tests * Adding wait conditionals for the upgrade_mode call to return * Adding tests * adjusting format and tests * Adjusting wait conditions for api return and msgs * adjusting doc tests * adding upgrade mode tests to black list	2019-01-28 09:07:30 -06:00
David Kyle	c0409fb9f0	[ML] Marginal gains in slow multi node QA tests (#37825 ) Move 2 tests that are simple REST tests out of the QA suite and cut the number of post data calls in ForecastIT	2019-01-28 10:00:59 +00:00
David Roberts	57d321ed5f	[ML] Tighten up use of aliases rather than concrete indices (#37874 ) We have read and write aliases for the ML results indices. However, the job still had methods that purported to reliably return the name of the concrete results index being used by the job. After reindexing prior to upgrade to 7.x this will be wrong, so the method has been renamed and the comments made more explicit to say the returned index name may not be the actual concrete index name for the lifetime of the job. Additionally, the selection of indices when deleting the job has been changed so that it works regardless of concrete index names. All these changes are nice-to-have for 6.7 and 7.0, but will become critical if we add rolling results indices in the 7.x release stream as 6.7 and 7.0 nodes may have to operate in a mixed version cluster that includes a version that can roll results indices.	2019-01-28 09:38:46 +00:00
David Roberts	f2c0c26d15	[ML] Adjust structure finder for Joda to Java time migration (#37306 ) The ML file structure finder has always reported both Joda and Java time format strings. This change makes the Java time format strings the ones that are incorporated into mappings and ingest pipeline definitions. The BWC syntax of prepending "8" to these formats is used. This will need to be removed once Java time format strings become the default in Elasticsearch. This commit also removes direct imports of Joda classes in the structure finder unit tests. Instead the core Joda BWC class is used.	2019-01-26 20:19:57 +00:00
Benjamin Trent	9e932f4869	ML: removing unnecessary upgrade code (#37879 )	2019-01-25 13:57:41 -06:00
Christoph Büscher	b4b4cd6ebd	Clean codebase from empty statements (#37822 ) * Remove empty statements There are a couple of instances of undocumented empty statements all across the code base. While they are mostly harmless, they make the code hard to read and are potentially error-prone. Removing most of these instances and marking blocks that look empty by intention as such. * Change test, slightly more verbose but less confusing	2019-01-25 14:23:02 +01:00
David Roberts	deafce1acd	[ML] No need to add state doc mapping on job open in 7.x (#37759 ) When upgrading from 5.4 to 5.5 to 6.7 (inclusive) it was necessary to ensure there was a mapping for type "doc" on the ML state index before opening a job. This was because 5.4 created a multi-type ML state index. In version 7.x we can be sure that any such 5.4 index is no longer in use. It would have had to be reindexed into the 6.x index format prior to the upgrade to version 7.x.	2019-01-25 13:15:35 +00:00
Jim Ferenczi	787acb14b9	Track total hits up to 10,000 by default (#37466 ) This commit changes the default for the `track_total_hits` option of the search request to `10,000`. This means that by default search requests will accurately track the total hit count up to `10,000` documents, requests that match more than this value will set the `"total.relation"` to `"gte"` (i.e. greater than or equal to) and the `"total.value"` to `10,000` in the search response. Scroll queries are not impacted, they will continue to count the total hits accurately. The default is set back to `true` (accurate hit count) if `rest_total_hits_as_int` is set in the search request. I chose `10,000` as the default because that's also the number we use to limit pagination. This means that users will be able to know how far they can jump (up to 10,000) even if the total number of hits is not accurate. Closes #33028	2019-01-25 13:45:39 +01:00
David Kyle	e1226f69b7	[ML] Increase close job timeout and lower the max number (#37770 )	2019-01-24 09:18:48 +00:00
Lee Hinman	427bc7f940	Use ILM for Watcher history deletion (#37443 ) * Use ILM for Watcher history deletion This commit adds an index lifecycle policy for the `.watch-history-*` indices. This policy is automatically used for all new watch history indices. This does not yet remove the automatic cleanup that the monitoring plugin does for the .watch-history indices, and it does not touch the `xpack.watcher.history.cleaner_service.enabled` setting. Relates to #32041	2019-01-23 10:18:08 -07:00
Alexander Reelsen	daa2ec8a60	Switch mapping/aggregations over to java time (#36363 ) This commit moves the aggregation and mapping code from joda time to java time. This includes field mappers, root object mappers, aggregations with date histograms, query builders and a lot of changes within tests. The cut-over to java time is a requirement so that we can support nanoseconds properly in a future field mapper. Relates #27330	2019-01-23 10:40:05 +01:00
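
The API key commit in the log above (fe36861ada) states that the `Authorization` header uses the `ApiKey` scheme, with credentials formed by base64-encoding the API key ID and the API key joined by a colon. A minimal sketch of that encoding in Python follows. The helper name `build_api_key_header` is hypothetical and not part of any Elasticsearch client, and the ID `api-key-id` and key `api-key` are placeholder values chosen because they reproduce the string in the commit's curl example.

```python
import base64


def build_api_key_header(key_id: str, api_key: str) -> str:
    """Return an Authorization header value in the ApiKey scheme.

    Hypothetical helper: joins the ID and key with a colon and
    base64-encodes the result, as the commit message describes.
    """
    raw = f"{key_id}:{api_key}".encode("utf-8")
    return "ApiKey " + base64.b64encode(raw).decode("ascii")


# Placeholder credentials (assumed for illustration only).
header = build_api_key_header("api-key-id", "api-key")
print(header)  # ApiKey YXBpLWtleS1pZDphcGkta2V5
```

The printed value matches the header in the commit's curl example, which suggests that example was generated from these same placeholder strings.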

300 Commits