OpenSearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	3e014d39c2	[Transform] fail to start/put on missing pipeline (#50701 ) (#50795 ) If a pipeline referenced by a transform does not exist, we should not allow the transform to be created. We do allow the pipeline existence check to be skipped with defer_validations, but if the pipeline still does not exist on `_start`, the pipeline will fail to start. relates: #50135	2020-01-09 10:33:22 -05:00
Christoph Büscher	b1b4282273	Make Multiplexer inherit filter chains analysis mode (#50662 ) Currently, if an updateable synonym filter is included in a multiplexer filter, it is not reloaded via the _reload_search_analyzers because the multiplexer itself doesn't pass on the analysis mode of the filters it contains, so its not recognized as "updateable" in itself. Instead we can check and merge the AnalysisMode settings of all filters in the multiplexer and use the resulting mode (e.g. search-time only) for the multiplexer itself, thus making any synonym filters contained in it reloadable. This, of course, will also make the analyzers using the multiplexer be usable at search-time only. Closes #50554	2020-01-08 22:12:01 +01:00
Lee Hinman	8dc6e98819	[7.x] Make InitializePolicyContextStep retryable (#50685 ) (#50760 ) This commits makes the "init" ILM step retryable. It also adds a test where an index is created with a non-parsable index name and then fails. Related to #48183	2020-01-08 13:13:57 -07:00
Andrei Dan	3915d4c055	Make the UpdateRolloverLifecycleDateStep retryable (#50702 ) (#50730 ) This makes the "update-rollover-lifecycle-date" step, which is part of the rollover action, retryable. It also adds an integration test to check the step is retried and it eventually succeeds. (cherry picked from commit 5bf068522deb2b6cd2563bcf80f34fdbf459c9f2) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-08 11:45:26 +01:00
Benjamin Trent	060e0a6277	[ML][Inference] Add support for models shipped as resources (#50680 ) (#50700 ) This adds support for models that are shipped as resources in the ML plugin. The first of which is the `lang_ident` model.	2020-01-07 09:21:59 -05:00
Hendrik Muhs	98ca9500e8	implement a workaround for remote cluster validation (#50460 ) In 7.x an internal API used for validating remote cluster does not throw, see #50420 for the details. This change implements a workaround for remote cluster validation, only for 7.x branches. fixes #50420	2020-01-07 13:51:51 +01:00
David Roberts	35453e2b0e	[ML] Improve uniqueness of result document IDs (#50644 ) Switch from a 32 bit Java hash to a 128 bit Murmur hash for creating document IDs from by/over/partition field values. The 32 bit Java hash was not sufficiently unique, and could produce identical numbers for relatively common combinations of by/partition field values such as L018/128 and L017/228. Fixes #50613	2020-01-07 10:24:45 +00:00
Benjamin Trent	5ab9e75e28	[7.x] [ML][Inference] lang_ident model (#50292 ) (#50675 ) * [ML][Inference] lang_ident model (#50292) This PR contains a java port of Google's CLD3 compact NN model https://github.com/google/cld3 The ported model is formatted to fit within our inference model formatting and stored as a resource in the `:xpack:ml:` plugin and is under basic license. The model is broken up into two major parts: - Preprocessing through the custom embedding (based on CLD3's embedding layer) - Pushing the embedded text through the two layers of fully connected shallow NN. Main differences between this port and CLD3: - We take advantage of Java's internal Unicode handling where possible (i.e. codepoints, characters, decoders, etc.) - We do not trim down input text by removing duplicated tokens - We do not encode doubles/floats as longs/integers.	2020-01-06 16:24:03 -05:00
Benjamin Trent	f52af7977d	[ML][Inference] minor cleanup for inference (#50444 ) (#50676 )	2020-01-06 14:05:04 -05:00
Nik Everett	1b28af489f	Fix bare warnings on RollupJobTests (#50633 ) (#50677 ) Silences some ugly warnings.	2020-01-06 14:03:30 -05:00
Albert Zaharovits	9ae3cd2a78	Add 'monitor_snapshot' cluster privilege (#50489 ) (#50647 ) This adds a new cluster privilege `monitor_snapshot` which is a restricted version of `create_snapshot`, granting the same privileges to view snapshot and repository info and status but not granting the actual privilege to create a snapshot. Co-authored-by: j-bean <anton.shuvaev91@gmail.com>	2020-01-06 13:15:55 +02:00
Andrei Dan	3c971f2911	ILM retryable async action steps (#50522 ) (#50591 ) This adds support for retrying AsyncActionSteps by triggering the async step after ILM was moved back on the failed step (the async step we'll be attempting to run after the cluster state reflects ILM being moved back on the failed step). This also marks the RolloverStep as retryable and adds an integration test where the RolloverStep is failing to execute as the rolled over index already exists to test that the async action RolloverStep is retried until the rolled over index is deleted. (cherry picked from commit 8bee5f4cb58a1242cc2ef4bc0317dae6c8be49d3) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-03 16:19:58 +02:00
Dimitris Athanasiou	ca0828ba07	[7.x][ML] Implement force deleting a data frame analytics job (#50553 ) (#50589 ) Adds a `force` parameter to the delete data frame analytics request. When `force` is `true`, the action force-stops the jobs and then proceeds to the deletion. This can be used in order to delete a non-stopped job with a single request. Closes #48124 Backport of #50553	2020-01-03 13:46:02 +02:00
Nik Everett	b36a8ab141	Make some ObjectParsers final (#50471 ) (#50556 ) We have about 800 `ObjectParsers` in Elasticsearch, about 700 of which are final. This is probably the right way to declare them because in practice we never mutate them after they are built. And we certainly don't change the static reference. Anyway, this adds `final` to a bunch of these parsers, mostly the ones in xpack and their "paired" parsers in the high level rest client. I picked these just to have somewhere to break the up the change so it wouldn't be huge. I found the non-final parsers with this: ``` diff \ <(find . -type f -name '.java' -exec grep -iHe 'static.PARSER\s=' {} \+ \| sort) \ <(find . -type f -name '.java' -exec grep -iHe 'static.final.PARSER\s*=' {} \+ \| sort) \ 2>&1 \| grep '^<' ```	2020-01-02 10:47:38 -05:00
Tim Vernum	cad0f6bf28	Do not load SSLService in plugin contructor (#50519 ) XPackPlugin created an SSLService within the plugin contructor. This has 2 negative consequences: 1. The service may be constructed based on a partial view of settings. Other plugins are free to add setting values via the additionalSettings() method, but this (necessarily) happens after plugins have been constructed. 2. Any exceptions thrown during the plugin construction are handled differently than exceptions thrown during "createComponents". Since SSL configurations exceptions are relatively common, it is far preferable for them to be thrown and handled as part of the createComponents flow. This commit moves the creation of the SSLService to XPackPlugin.createComponents, and alters the sequence of some other steps to accommodate this change. Backport of: #49667	2019-12-30 14:42:32 +11:00
Lee Hinman	c3c9ccf61f	[7.x] Add ILM histore store index (#50287 ) (#50345 ) * Add ILM histore store index (#50287) * Add ILM histore store index This commit adds an ILM history store that tracks the lifecycle execution state as an index progresses through its ILM policy. ILM history documents store output similar to what the ILM explain API returns. An example document with ALL fields (not all documents will have all fields) would look like: ```json { "@timestamp": 1203012389, "policy": "my-ilm-policy", "index": "index-2019.1.1-000023", "index_age":123120, "success": true, "state": { "phase": "warm", "action": "allocate", "step": "ERROR", "failed_step": "update-settings", "is_auto-retryable_error": true, "creation_date": 12389012039, "phase_time": 12908389120, "action_time": 1283901209, "step_time": 123904107140, "phase_definition": "{\"policy\":\"ilm-history-ilm-policy\",\"phase_definition\":{\"min_age\":\"0ms\",\"actions\":{\"rollover\":{\"max_size\":\"50gb\",\"max_age\":\"30d\"}}},\"version\":1,\"modified_date_in_millis\":1576517253463}", "step_info": "{... etc step info here as json ...}" }, "error_details": "java.lang.RuntimeException: etc\n\tcaused by:etc etc etc full stacktrace" } ``` These documents go into the `ilm-history-1-00000N` index to provide an audit trail of the operations ILM has performed. This history storage is enabled by default but can be disabled by setting `index.lifecycle.history_index_enabled` to `false.` Resolves #49180 * Make ILMHistoryStore.putAsync truly async (#50403) This moves the `putAsync` method in `ILMHistoryStore` never to block. Previously due to the way that the `BulkProcessor` works, it was possible for `BulkProcessor#add` to block executing a bulk request. This was bad as we may be adding things to the history store in cluster state update threads. This also moves the index creation to be done prior to the bulk request execution, rather than being checked every time an operation was added to the queue. This lessens the chance of the index being created, then deleted (by some external force), and then recreated via a bulk indexing request. Resolves #50353	2019-12-20 12:33:36 -07:00
Przemysław Witek	3e3a93002f	[7.x] Fix accuracy metric (#50310 ) (#50433 )	2019-12-20 15:34:38 +01:00
Przemysław Witek	14d95aae46	[7.x] Make each analysis report desired field mappings to be copied (#50219 ) (#50428 )	2019-12-20 15:10:33 +01:00
Przemysław Witek	5bb668b866	[7.x] Get rid of maxClassesCardinality internal parameter (#50418 ) (#50423 )	2019-12-20 14:24:23 +01:00
Hendrik Muhs	40bce49a7f	mute SourceDestValidatorTests.testRemoteSourceDoesNotExist	2019-12-20 11:25:43 +01:00
Hendrik Muhs	7c10e9b0e7	[Transform] improve checkpoint reporting (#50369 ) fixes empty checkpoints, re-factors checkpoint info creation (moves builder) and always reports last change detection relates #43201 relates #50018	2019-12-20 10:49:53 +01:00
Hendrik Muhs	de14092ad2	[Transform] refactor source and dest validation to support CCS (#50018 ) refactors source and dest validation, adds support for CCS, makes resolve work like reindex/search, allow aliased dest index with a single write index. fixes #49988 fixes #49851 relates #43201	2019-12-20 10:49:53 +01:00
Stuart Tettemer	689df1f28f	Scripting: ScriptFactory not required by compile (#50344 ) (#50392 ) Avoid backwards incompatible changes for 8.x and 7.6 by removing type restriction on compile and Factory. Factories may optionally implement ScriptFactory. If so, then they can indicate determinism and thus cacheability. Backport Relates: #49466	2019-12-19 12:50:25 -07:00
Przemysław Witek	cc4bc797f9	[7.x] Implement `precision` and `recall` metrics for classification evaluation (#49671 ) (#50378 )	2019-12-19 18:55:05 +01:00
Benjamin Trent	4396a1f78b	[ML][Inference] fix support for nested fields (#50258 ) (#50335 ) This fixes support for nested fields We now support fully nested, fully collapsed, or a mix of both on inference docs. ES mappings allow the `_source` to be any combination of nested objects + dot delimited fields. So, we should do our best to find the best path down the Map for the desired field.	2019-12-18 15:47:06 -05:00
Dimitris Athanasiou	447bac27d2	[7.x][ML] Delete unused data frame analytics state (#50243 ) (#50280 ) This commit adds removal of unused data frame analytics state from the _delete_expired_data API (and in extend th ML daily maintenance task). At the moment the potential state docs include the progress document and state for regression and classification analyses. Backport of #50243	2019-12-18 12:30:11 +00:00
Armin Braun	2e7b1ab375	Use ClusterState as Consistency Source for Snapshot Repositories (#49060 ) (#50267 ) Follow up to #49729 This change removes falling back to listing out the repository contents to find the latest `index-N` in write-mounted blob store repositories. This saves 2-3 list operations on each snapshot create and delete operation. Also it makes all the snapshot status APIs cheaper (and faster) by saving one list operation there as well in many cases. This removes the resiliency to concurrent modifications of the repository as a result and puts a repository in a `corrupted` state in case loading `RepositoryData` failed from the assumed generation.	2019-12-17 10:55:13 +01:00
Tim Vernum	ce2aab3f2f	Add setting to restrict license types (#50252 ) This adds a new "xpack.license.upload.types" setting that restricts which license types may be uploaded to a cluster. By default all types are allowed (excluding basic, which can only be generated and never uploaded). This setting does not restrict APIs that generate licenses such as the start trial API. This setting is not documented as it is intended to be set by orchestrators and not end users. Backport of: #49418	2019-12-17 14:58:58 +11:00
Benjamin Trent	4805d8ac7d	[ML][Inference] Adding a warning_field for warning msgs. (#49838 ) (#50183 ) This adds a new field for the inference processor. `warning_field` is a place for us to write warnings provided from the inference call. When there are warnings we are not going to write an inference result. The goal of this is to indicate that the data provided was too poor or too different for the model to make an accurate prediction. The user could optionally include the `warning_field`. When it is not provided, it is assumed no warnings were desired to be written. The first of these warnings is when ALL of the input fields are missing. If none of the trained fields are present, we don't bother inferencing against the model and instead provide a warning stating that the fields were missing. Also, this adds checks to not allow duplicated fields during processor creation.	2019-12-13 10:39:51 -05:00
Dimitris Athanasiou	e6cbcf7f7c	[7.x] [ML] Persist/restore state for DFA classification (#50040 ) (#50147 ) This commit adds state persist/restore for data frame analytics classification jobs. Backport of #50040	2019-12-13 10:33:19 +02:00
Tim Vernum	2811b97b76	Remove reserved roles for code search (#50115 ) The "code_user" and "code_admin" reserved roles existed to support code search which is no longer included in Kibana. The "kibana_system" role included privileges to read/write from the code search indices, but no longer needs that access. Backport of: #50068	2019-12-13 10:22:55 +11:00
Benjamin Trent	c043aa887f	[ML][Inference] Simplify inference processor options (#50105 ) (#50146 ) * [ML][Inference] Simplify inference processor options * addressing pr comments	2019-12-12 11:13:55 -05:00
Tim Vernum	47e5e34f42	Support "enterprise" license types (#49474 ) This adds "enterprise" as an acceptable type for a license loaded through the PUT _license API. Internally an enterprise license is treated as having a "platinum" operating mode. The handling of License types was refactored to have a new explicit "LicenseType" enum in addition to the existing "OperatingMode" enum. By default (in 7.x) the GET license API will return "platinum" when an enterprise license is active in order to be compatible with existing consumers of that API. A new "accept_enterprise" flag has been introduced to allow clients to opt-in to receive the correct "enterprise" type. Backport of: #49223	2019-12-12 14:37:44 +11:00
Dimitris Athanasiou	8891f4db88	[7.x][ML] Introduce randomize_seed setting for regression and classification (#49990 ) (#50023 ) This adds a new `randomize_seed` for regression and classification. When not explicitly set, the seed is randomly generated. One can reuse the seed in a similar job in order to ensure the same docs are picked for training. Backport of #49990	2019-12-10 15:29:19 +02:00
Yannick Welsch	a16abf921f	Make elasticsearch-node tools custom metadata-aware (#48390 ) The elasticsearch-node tools allow manipulating the on-disk cluster state. The tool is currently unaware of plugins and will therefore drop custom metadata from the cluster state once the state is written out again (as it skips over the custom metadata that it can't read). This commit preserves unknown customs when editing on-disk metadata through the elasticsearch-node command-line tools.	2019-12-10 09:58:11 +01:00
Jason Tedor	bfb2dc1353	Enable dependent settings values to be validated (#49942 ) Today settings can declare dependencies on another setting. This declaration is implemented so that if the declared setting is not set when the declaring setting is, settings validation fails. Yet, in some cases we want not only that the setting is set, but that it also has a specific value. For example, with the monitoring exporter settings, if xpack.monitoring.exporters.my_exporter.host is set, we not only want that xpack.monitoring.exporters.my_exporter.type is set, but that it is also set to local. This commit extends the settings infrastructure so that this declaration is possible. The use of this in the monitoring exporter settings will be implemented in a follow-up.	2019-12-09 12:45:50 -05:00
Przemysław Witek	0965a10468	[7.x] Pass `prediction_field_type` to C++ analytics process (#49861 ) (#49981 )	2019-12-09 14:43:01 +01:00
Benjamin Trent	049d854360	[ML][Inference] adjust so target_field always has inference result and optionally allow new top classes field in the classification config (#49923 ) (#49982 )	2019-12-09 08:29:45 -05:00
cachedout	549b103458	[7.x] APM system_user (#47668 ) (#49912 ) * Add test for APM beats index perms * Grant monitoring index privs to apm_system user * Review feedback * Fix compilation problem	2019-12-09 08:25:03 +00:00
Armin Braun	ac2774c9fa	Use Cluster State to Track Repository Generation (#49729 ) (#49976 ) Step on the road to #49060. This commit adds the logic to keep track of a repository's generation across repository operations. See changes to package level Javadoc for the concrete changes in the distributed state machine. It updates the write side of new repository generations to be fully consistent via the cluster state. With this change, no `index-N` will be overwritten for the same repository ever. So eventual consistency issues around conflicting updates to the same `index-N` are not a possibility any longer. With this change the read side will still use listing of repository contents instead of relying solely on the cluster state contents. The logic for that will be introduced in #49060. This retains the ability to externally delete the contents of a repository and continue using it afterwards for the time being. In #49060 the use of listing to determine the repository generation will be removed in all cases (except for full-cluster restart) as the last step in this effort.	2019-12-09 09:02:57 +01:00
Stuart Tettemer	17cda5b2c0	Scripting: Groundwork for caching script results (#49895 ) (#49944 ) In order to cache script results in the query shard cache, we need to check if scripts are deterministic. This change adds a default method to the script factories, `isResultDeterministic() -> false` which is used by the `QueryShardContext`. Script results were never cached and that does not change here. Future changes will implement this method based on whether the results of the scripts are deterministic or not and therefore cacheable. Refs: #49466 Backport	2019-12-06 15:08:05 -07:00
Lee Hinman	8205cdd423	[7.x] Refactor IndexLifecycleRunner to split state modificatio… (#49936 ) This commit refactors the `IndexLifecycleRunner` to split out and consolidate the number of methods that change state from within ILM. It adds a new class `IndexLifecycleTransition` that contains a number of static methods used to modify ILM's state. These methods all return new cluster states rather than making changes themselves (they can be thought of as helpers for modifying ILM state). Rather than having multiple ways to move an index to a particular step (like `moveClusterStateToStep`, `moveClusterStateToNextStep`, `moveClusterStateToPreviouslyFailedStep`, etc (there are others)) this now consolidates those into three with (hopefully) useful names: - `moveClusterStateToStep` - `moveClusterStateToErrorStep` - `moveClusterStateToPreviouslyFailedStep` In the move, I was also able to consolidate duplicate or redundant arguments to these functions. Prior to this commit there were many calls that provided duplicate information (both `IndexMetaData` and `LifecycleExecutionState` for example) where the duplicate argument could be derived from a previous argument with no problems. With this split, `IndexLifecycleRunner` now contains the methods used to actually run steps as well as the methods that kick off cluster state updates for state transitions. `IndexLifecycleTransition` contains only the helpers for constructing new states from given scenarios. This also adds Javadocs to all methods in both `IndexLifecycleRunner` and `IndexLifecycleTransition` (this accounts for almost all of the increase in code lines for this commit). It also makes all methods be as restrictive in visibility, to limit the scope of where they are used. This refactoring is part of work towards capturing actions and transitions that ILM makes, by consolidating and simplifying the places we make state changes, it will make adding operation auditing easier.	2019-12-06 12:55:16 -07:00
Hendrik Muhs	c33be29dc7	[Transform] automatic deletion of old checkpoints (#49496 ) add automatic deletion of old checkpoints based on count and time	2019-12-04 07:55:57 +01:00
Hendrik Muhs	7aae212287	[Transform] Fix possible audit logging disappearance after rolling upgrade (#49731 ) (#49767 ) ensure audit index template is available during a rolling upgrade before a transform task can write to it. fixes #49730	2019-12-03 18:05:06 +01:00
Przemysław Witek	a3f88595d7	A few cleanups in evaluation tests (#49791 ) (#49794 )	2019-12-03 15:48:39 +01:00
Dimitris Athanasiou	4edb2e7bb6	[7.x][ML] Add optional source filtering during data frame reindexing (#49690 ) (#49718 ) This adds a `_source` setting under the `source` setting of a data frame analytics config. The new `_source` is reusing the structure of a `FetchSourceContext` like `analyzed_fields` does. Specifying includes and excludes for source allows selecting which fields will get reindexed and will be available in the destination index. Closes #49531 Backport of #49690	2019-11-29 16:10:44 +02:00
Armin Braun	813b49adb4	Make BlobStoreRepository Aware of ClusterState (#49639 ) (#49711 ) * Make BlobStoreRepository Aware of ClusterState (#49639) This is a preliminary to #49060. It does not introduce any substantial behavior change to how the blob store repository operates. What it does is to add all the infrastructure changes around passing the cluster service to the blob store, associated test changes and a best effort approach to tracking the latest repository generation on all nodes from cluster state updates. This brings a slight improvement to the consistency by which non-master nodes (or master directly after a failover) will be able to determine the latest repository generation. It does not however do any tricky checks for the situation after a repository operation (create, delete or cleanup) that could theoretically be used to get even greater accuracy to keep this change simple. This change does not in any way alter the behavior of the blobstore repository other than adding a better "guess" for the value of the latest repo generation and is mainly intended to isolate the actual logical change to how the repository operates in #49060	2019-11-29 14:57:47 +01:00
Ioannis Kakavas	a59b7e07f1	Use PEM files instead of a JKS for key material (#49625 ) (#49701 ) So that the tests can also run in a FIPS 140 JVM, where using a JKS keystore is not allowed. Resolves: #49261	2019-11-29 09:43:55 +02:00
Tim Vernum	e6f530c167	Improved diagnostics for TLS trust failures (#49669 ) - Improves HTTP client hostname verification failure messages - Adds "DiagnosticTrustManager" which logs certificate information when trust cannot be established (hostname failure, CA path failure, etc) These diagnostic messages are designed so that many common TLS problems can be diagnosed based solely (or primarily) on the elasticsearch logs. These diagnostics can be disabled by setting xpack.security.ssl.diagnose.trust: false Backport of: #48911	2019-11-29 15:01:20 +11:00
Przemysław Witek	1425e30b1e	[7.x] Remove ClassInfo interface and BinaryClassInfo class. (#49649 ) (#49681 )	2019-11-28 21:46:46 +01:00
Przemyslaw Gomulka	e528b41cf2	Enable LicenceServiceTests for all jdks (#49440 ) backport(#49682 ) This test no longer relies on jdk version, so the assume should be removed relates #48209	2019-11-28 15:26:54 +01:00
Tim Vernum	901c64ebbf	Add Debug/Trace logging for authentication (#49619 ) Authentication has grown more complex with the addition of new realm types and authentication methods. When user authentication does not behave as expected it can be difficult to determine where and why it failed. This commit adds DEBUG and TRACE logging at key points in the authentication flow so that it is possible to gain addition insight into the operation of the system. Backport of: #49575	2019-11-27 16:39:07 +11:00
j-bean	048b9dbb14	Fix expired job results deletion audit message (#49560 ) The PR fixes #49549	2019-11-26 10:48:12 +00:00
Tim Brooks	416178c7c8	Enable simple remote connection strategy (#49561 ) This commit back ports three commits related to enabling the simple connection strategy. Allow simple connection strategy to be configured (#49066) Currently the simple connection strategy only exists in the code. It cannot be configured. This commit moves in the direction of allowing it to be configured. It introduces settings for the addresses and socket count. Additionally it introduces new settings for the sniff strategy so that the more generic number of connections and seed node settings can be deprecated. The simple settings are not yet registered as the registration is dependent on follow-up work to validate the settings. Ensure at least 1 seed configured in remote test (#49389) This fixes #49384. Currently when we select a random subset of seed nodes from a list, it is possible for 0 seeds to be selected. This test depends on at least 1 seed being selected. Add the simple strategy to cluster settings (#49414) This is related to #49067. This commit adds the simple connection strategy settings and strategy mode setting to the cluster settings registry. With these changes, the simple connection mode can be used. Additionally, it adds validation to ensure that settings cannot be misconfigured.	2019-11-25 16:53:07 -07:00
David Roberts	62811c2272	[ML] Add default categorization analyzer definition to ML info (#49545 ) The categorization job wizard in the ML UI will use this information when showing the effect of the chosen categorization analyzer on a sample of input.	2019-11-25 13:39:16 +00:00
Dimitris Athanasiou	8eaee7cbdc	[7.x][ML] Explain data frame analytics API (#49455 ) (#49504 ) This commit replaces the _estimate_memory_usage API with a new API, the _explain API. The API consolidates information that is useful before creating a data frame analytics job. It includes: - memory estimation - field selection explanation Memory estimation is moved here from what was previously calculated in the _estimate_memory_usage API. Field selection is a new feature that explains to the user whether each available field was selected to be included or not in the analysis. In the case it was not included, it also explains the reason why. Backport of #49455	2019-11-22 22:06:10 +02:00
Benjamin Trent	276b6c67f4	[ML][Inference] Fixing pre-processor value handling and size estimate (#49270 ) (#49489 ) * [ML][Inference] Fixing pre-processor value handling and size estimate * fixing npe	2019-11-22 08:14:33 -05:00
Tim Vernum	2e5f2dd1e1	Deprecate misconfigured SSL server config (#49280 ) This commit adds a deprecation warning when starting a node where either of the server contexts (xpack.security.transport.ssl and xpack.security.http.ssl) meet either of these conditions: 1. The server lacks a certificate/key pair (i.e. neither ssl.keystore.path not ssl.certificate are configured) 2. The server has some ssl configuration, but ssl.enabled is not specified. This new validation does not care whether ssl.enabled is true or false (though other validation might), it simply makes it an error to configure server SSL without being explicit about whether to enable that configuration. Backport of: #45892	2019-11-22 12:14:55 +11:00
Benjamin Trent	a7477ad7c3	[7.x] [ML][Inference] compressing model definition and lazy parsing (#49269 ) (#49446 ) * [ML][Inference] compressing model definition and lazy parsing (#49269) * [ML][Inference] compressing model definition and lazy parsing * addressing PR comments * adding commons io * implementing simplified bounded stream * adjusting for type inclusion	2019-11-21 15:32:32 -05:00
Benjamin Trent	d9835f7fb4	[ML] Fix r_squared eval when variance is 0 (#49439 ) (#49445 )	2019-11-21 11:22:16 -05:00
Benjamin Trent	d41b2e3f38	[ML][Inference] allowing per-model licensing (#49398 ) (#49435 ) * [ML][Inference] allowing per-model licensing * changing to internal action + removing pre-mature opt	2019-11-21 09:46:34 -05:00
Przemysław Witek	c7ac2011eb	[7.x] Implement accuracy metric for multiclass classification (#47772 ) (#49430 )	2019-11-21 15:01:18 +01:00
Mark Tozzi	17358b5af7	(refactor) Extract Empty/Script/Missing ValuesSource behavior to an interface (#48320 ) (#49330 ) This is a pure code rearrangement refactor. Logic for what specific ValuesSource instance to use for a given type (e.g. script or field) moved out of ValuesSourceConfig and into CoreValuesSourceType (previously just ValueSourceType; we extract an interface for future extensibility). ValueSourceConfig still selects which case to use, and then the ValuesSourceType instance knows how to construct the ValuesSource for that case.	2019-11-19 16:44:29 -05:00
Armin Braun	0acba44a2e	Make Repository.getRepositoryData an Async API (#49299 ) (#49312 ) This API call in most implementations is fairly IO heavy and slow so it is more natural to be async in the first place. Concretely though, this change is a prerequisite of #49060 since determining the repository generation from the cluster state introduces situations where this call would have to wait for other operations to finish. Doing so in a blocking manner would break `SnapshotResiliencyTests` and waste a thread. Also, this sets up the possibility to in the future make use of async IO where provided by the underlying Repository implementation. In a follow-up `SnapshotsService#getRepositoryData` will be made async as well (did not do it here, since it's another huge change to do so). Note: This change for now does not alter the threading behaviour in any way (since `Repository#getRepositoryData` isn't forking) and is purely mechanical.	2019-11-19 16:49:12 +01:00
David Roberts	a5204c1c80	[ML] Fixes for stop datafeed edge cases (#49284 ) The following edge cases were fixed: 1. A request to force-stop a stopping datafeed is no longer ignored. Force-stop is an important recovery mechanism if normal stop doesn't work for some reason, and needs to operate on a datafeed in any state other than stopped. 2. If the node that a datafeed is running on is removed from the cluster during a normal stop then the stop request is retried (and will likely succeed on this retry by simply cancelling the persistent task for the affected datafeed). 3. If there are multiple simultaneous force-stop requests for the same datafeed we no longer fail the one that is processed second. The previous behaviour was wrong as stopping a stopped datafeed is not an error, so stopping a datafeed twice simultaneously should not be either. Backport of #49191	2019-11-19 10:51:46 +00:00
Julie Tibshirani	a0ee6c8f7e	Add telemetry for flattened fields. (#48972 ) (#49125 ) Currently we just record the number of flattened fields defined in the mappings.	2019-11-18 12:29:42 -08:00
Benjamin Trent	eefe7688ce	[7.x][ML] ML Model Inference Ingest Processor (#49052 ) (#49257 ) * [ML] ML Model Inference Ingest Processor (#49052) * [ML][Inference] adds lazy model loader and inference (#47410) This adds a couple of things: - A model loader service that is accessible via transport calls. This service will load in models and cache them. They will stay loaded until a processor no longer references them - A Model class and its first sub-class LocalModel. Used to cache model information and run inference. - Transport action and handler for requests to infer against a local model Related Feature PRs: * [ML][Inference] Adjust inference configuration option API (#47812) * [ML][Inference] adds logistic_regression output aggregator (#48075) * [ML][Inference] Adding read/del trained models (#47882) * [ML][Inference] Adding inference ingest processor (#47859) * [ML][Inference] fixing classification inference for ensemble (#48463) * [ML][Inference] Adding model memory estimations (#48323) * [ML][Inference] adding more options to inference processor (#48545) * [ML][Inference] handle string values better in feature extraction (#48584) * [ML][Inference] Adding _stats endpoint for inference (#48492) * [ML][Inference] add inference processors and trained models to usage (#47869) * [ML][Inference] add new flag for optionally including model definition (#48718) * [ML][Inference] adding license checks (#49056) * [ML][Inference] Adding memory and compute estimates to inference (#48955) * fixing version of indexed docs for model inference	2019-11-18 13:19:17 -05:00
Przemysław Witek	5f9965e4b8	Lower minimum model memory limit value from 1MB to 1kB. (#49227 ) (#49242 )	2019-11-18 14:58:20 +01:00
Hendrik Muhs	ca912624ec	[Transform] improve error handling of script errors (#48887 ) improve error handling for script errors, treating it as irrecoverable errors which puts the task immediately into failed state, also improves the error extraction to properly report the script error. fixes #48467	2019-11-18 10:24:39 +01:00
Dimitris Athanasiou	805c31e19e	[7.x][ML] Avoid NPE when node load is calculated on job assignment (#49186 ) (#49214 ) This commit fixes a NPE problem as reported in #49150. But this problem uncovered that we never added proper handling of state for data frame analytics tasks. In this commit we improve the `MlTasks.getDataFrameAnalyticsState` method to handle null tasks and state tasks properly. Closes #49150 Backport of #49186	2019-11-18 10:33:07 +02:00
Rory Hunter	c46a0e8708	Apply 2-space indent to all gradle scripts (#49071 ) Backport of #48849. Update `.editorconfig` to make the Java settings the default for all files, and then apply a 2-space indent to all `*.gradle` files. Then reformat all the files.	2019-11-14 11:01:23 +00:00
Armin Braun	25e05b0013	Fix X-Pack SchedulerEngine Shutdown (#48951 ) (#49054 ) We can have a race here where `scheduleNextRun` executes concurrently to `stop` and so we run into a `RejectedExecutionException` that we don't catch and thus it fails tests. => Fixed by ignoring these so long as they coincide with a scheduler shutdown	2019-11-13 22:06:55 +01:00
Lee Hinman	5eb37c29fe	[7.x] Re-read policy phase JSON when using ILM's move-to-step… (#49011 ) When using the move-to-step API, we should reread the phase JSON from the latest version of the ILM policy. This allows a user to move to the same step while re-reading the policy's latest version. For example, when changing rollover criteria. While manually messing around with some other things I discovered that we only reread the policy when using the retry API, not the move-to-step API. This commit changes the move-to-step API to always read the latest version of the policy.	2019-11-12 19:41:06 -07:00
Benjamin Trent	46ab1db54f	[7.x] [ML] Add new geo_results.(actual_point\|typical_point) fields for `lat_long` results (#47050 ) (#48958 ) * [ML] Add new geo_results.(actual_point\|typical_point) fields for `lat_long` results (#47050) [ML] Add new geo_results.(actual_point\|typical_point) fields for `lat_long` results (#47050) Related PR: https://github.com/elastic/ml-cpp/pull/809 * adjusting bwc version	2019-11-11 15:43:03 -05:00
Hendrik Muhs	5ecde37a68	[7.x][Transform] decouple task and indexer (#48812 ) decouple TransformTask and ClientTransformIndexer. Interaction between the 2 classes are now moved into a context class which holds shared information. relates #45369	2019-11-01 19:39:35 +01:00
Ioannis Kakavas	99aedc844d	Copy http headers to ThreadContext strictly (#45945 ) (#48675 ) Previous behavior while copying HTTP headers to the ThreadContext, would allow multiple HTTP headers with the same name, handling only the first occurrence and disregarding the rest of the values. This can be confusing when dealing with multiple Headers as it is not obvious which value is read and which ones are silently dropped. According to RFC-7230, a client must not send multiple header fields with the same field name in a HTTP message, unless the entire field value for this header is defined as a comma separated list or this specific header is a well-known exception. This commits changes the behavior in order to be more compliant to the aforementioned RFC by requiring the classes that implement ActionPlugin to declare if a header can be multi-valued or not when registering this header to be copied over to the ThreadContext in ActionPlugin#getRestHeaders. If the header is allowed to be multivalued, then all such headers are read from the HTTP request and their values get concatenated in a comma-separated string. If the header is not allowed to be multivalued, and the HTTP request contains multiple such Headers with different values, the request is rejected with a 400 status.	2019-10-31 23:05:12 +02:00
Andrei Dan	ffe5d5417f	ILM Make the `check-rollover-ready` step retryable (#48256 ) (#48740 ) This adds the infrastructure to be able to retry the execution of retryable steps and makes the `check-rollover-ready` retryable as an initial step to make the rollover action more resilient to transient errors. (cherry picked from commit 454020ac8acb147eae97acb4ccd6fb470d1e5f48) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-10-31 11:28:55 +00:00
Lee Hinman	2d5291cf3b	Un-AwaitsFix and enhance logging for testPolicyCRUD (#48719 ) * Un-AwaitsFix and enhance logging for testPolicyCRUD This removes the `AwaitsFix` and increases the test logging for `SnapshotLifecycleServiceTests.testPolicyCRUD` in an effort to track down the cause of #44997. * Remove unused import	2019-10-30 17:02:57 -06:00
Benjamin Trent	c9ead80c31	[7.x] [ML][Inference] separating definition and config object storage (#48651 ) (#48695 ) * [ML][Inference] separating definition and config object storage (#48651) This separates out the `definition` object from being stored within the configuration object in the index. This allows us to gather the config object without decompressing a potentially large definition. Additionally, `input` is moved to the TrainedModelConfig object and out of the definition. This is so the trained input fields are accessible outside the potentially large model definition.	2019-10-30 13:27:29 -04:00
Armin Braun	52e5ceb321	Restore from Individual Shard Snapshot Files in Parallel (#48110 ) (#48686 ) Make restoring shard snapshots run in parallel on the `SNAPSHOT` thread-pool.	2019-10-30 14:36:30 +01:00
Andrei Dan	8b22e297ed	ILM open/close steps are noop if idx is open/close (#48614 ) (#48640 ) The open and close follower steps didn't check if the index is open, closed respectively, before executing the open/close request. This changes the steps to check the index state and only perform the open/close operation if the index is not already open/closed.	2019-10-29 17:43:56 +00:00
Przemysław Witek	7c944d26c5	[7.x] Assert that the results of classification analysis can be evaluated using _evaluate API. (#48626 ) (#48634 )	2019-10-29 16:20:56 +01:00
Benjamin Trent	6ea59dd428	[ML][Transforms] add wait_for_checkpoint flag to stop (#47935 ) (#48591 ) Adds `wait_for_checkpoint` for `_stop` API.	2019-10-28 13:02:57 -04:00
Jim Ferenczi	7fc413c22c	Resolve the role query and the number of docs lazily (#48036 ) This commit ensures that the creation of a DocumentSubsetReader does not eagerly resolve the role query and the number of docs that match. We want to delay this expensive operation in order to ensure that we really need this information when we build it. For this reason the role query and the number of docs are now resolved on demand. This commit also depends on https://issues.apache.org/jira/browse/LUCENE-9003 that will also compute the global number of docs lazily.	2019-10-25 18:12:29 +02:00
Tim Brooks	c0b545f325	Make BytesReference an interface (#48486 ) BytesReference is currently an abstract class which is extended by various implementations. This makes it very difficult to use the delegation pattern. The implication of this is that our releasable BytesReference is a PagedBytesReference type and cannot be used as a generic releasable bytes reference that delegates to any reference type. This commit makes BytesReference an interface and introduces an AbstractBytesReference for common functionality.	2019-10-24 15:39:30 -06:00
Michael Basnight	d49958cef3	Remove deprecated test from the HLRC tests (#48424 ) The AbstractHlrcWriteableXContentTestCase was replaced by a better test case a while ago, and this is the last two instances using it. They have been converted and the test is now deleted. Ref #39745	2019-10-24 14:02:04 -05:00
Ioannis Kakavas	c6b733f1b4	Add populate_user_metadata in OIDC realm (#48357 ) (#48438 ) Make populate_user_metadata configuration parameter available in the OpenID Connect authentication realm Resolves: #48217	2019-10-24 09:51:08 +03:00
Jake Landis	cf175da5a9	Ensure SLM stats does not block an in-place upgrade from 7.4 (… (#48411 ) 7.5+ for SLM requires [stats] object to exist in the cluster state. When doing an in-place upgrade from 7.4 to 7.5+ [stats] does not exist in cluster state, result in an exception on startup [1]. This commit moves the [stats] to be an optional object in the parser and if not found will default to an empty stats object. [1] Caused by: java.lang.IllegalArgumentException: Required [stats]	2019-10-23 11:21:39 -05:00
Przemyslaw Gomulka	aaa6209be6	[7.x] [Java.time] Calculate week of a year with ISO rules BACKPORT(#48209 ) (#48349 ) Reverting the change introducing IsoLocal.ROOT and introducing IsoCalendarDataProvider that defaults start of the week to Monday and requires minimum 4 days in first week of a year. This extension is using java SPI mechanism and defaults for Locale.ROOT only. It require jvm property java.locale.providers to be set with SPI,COMPAT closes #41670 backport #48209	2019-10-23 17:39:38 +02:00
Armin Braun	7215201406	Track Shard-Snapshot Index Generation at Repository Root (#48371 ) This change adds a new field `"shards"` to `RepositoryData` that contains a mapping of `IndexId` to a `String[]`. This string array can be accessed by shard id to get the generation of a shard's shard folder (i.e. the `N` in the name of the currently valid `/indices/${indexId}/${shardId}/index-${N}` for the shard in question). This allows for creating a new snapshot in the shard without doing any LIST operations on the shard's folder. In the case of AWS S3, this saves about 1/3 of the cost for updating an empty shard (see #45736) and removes one out of two remaining potential issues with eventually consistent blob stores (see #38941 ... now only the root `index-${N}` is determined by listing). Also and equally if not more important, a number of possible failure modes on eventually consistent blob stores like AWS S3 are eliminated by moving all delete operations to the `master` node and moving from incremental naming of shard level index-N to uuid suffixes for these blobs. This change moves the deleting of the previous shard level `index-${uuid}` blob to the master node instead of the data node allowing for a safe and consistent update of the shard's generation in the `RepositoryData` by first updating `RepositoryData` and then deleting the now unreferenced `index-${newUUID}` blob. __No deletes are executed on the data nodes at all for any operation with this change.__ Note also: Previous issues with hanging data nodes interfering with master nodes are completely impossible, even on S3 (see next section for details). This change changes the naming of the shard level `index-${N}` blobs to a uuid suffix `index-${UUID}`. The reason for this is the fact that writing a new shard-level `index-` generation blob is not atomic anymore in its effect. Not only does the blob have to be written to have an effect, it must also be referenced by the root level `index-N` (`RepositoryData`) to become an effective part of the snapshot repository. This leads to a problem if we were to use incrementing names like we did before. If a blob `index-${N+1}` is written but due to the node/network/cluster/... crashes the root level `RepositoryData` has not been updated then a future operation will determine the shard's generation to be `N` and try to write a new `index-${N+1}` to the already existing path. Updates like that are problematic on S3 for consistency reasons, but also create numerous issues when thinking about stuck data nodes. Previously stuck data nodes that were tasked to write `index-${N+1}` but got stuck and tried to do so after some other node had already written `index-${N+1}` were prevented form doing so (except for on S3) by us not allowing overwrites for that blob and thus no corruption could occur. Were we to continue using incrementing names, we could not do this. The stuck node scenario would either allow for overwriting the `N+1` generation or force us to continue using a `LIST` operation to figure out the next `N` (which would make this change pointless). With uuid naming and moving all deletes to `master` this becomes a non-issue. Data nodes write updated shard generation `index-${uuid}` and `master` makes those `index-${uuid}` part of the `RepositoryData` that it deems correct and cleans up all those `index-` that are unused. Co-authored-by: Yannick Welsch <yannick@welsch.lu> Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>	2019-10-23 10:58:26 +01:00
Ioannis Kakavas	24e43dfa34	[7.x] Refactor FIPS BootstrapChecks to simple checks (#47499 ) (#48333 ) FIPS 140 bootstrap checks should not be bootstrap checks as they are always enforced. This commit moves the validation logic within the security plugin. The FIPS140SecureSettingsBootstrapCheck was not applicable as the keystore was being loaded on init, before the Bootstrap checks were checked, so an elasticsearch keystore of version < 3 would cause the node to fail in a FIPS 140 JVM before the bootstrap check kicked in, and as such hasn't been migrated. Resolves: #34772	2019-10-22 12:49:01 +03:00
Martijn van Groningen	0ec0ab64c9	Fix executing enrich policies stats (#48132 ) The enrich stats api picked the wrong task to be displayed in the executing stats section. In case `wait_for_completion` was set to `false` then no task was being displayed and if that param was set to `true` then the wrong task was being displayed (transport action task instead of enrich policy executor task). Testing executing policies in enrich stats api is tricky. I have verified locally that this commit fixes the bug.	2019-10-22 07:41:56 +02:00
James Baiera	0d12ef8958	Add Enrich Origin (#48098 ) (#48312 ) This PR adds an origin for the Enrich feature, and modifies the background maintenance task to use the origin when executing client operations. Without this fix, the maintenance task fails to execute when security is enabled.	2019-10-21 16:40:49 -04:00
Przemysław Witek	2db2b945ec	[7.x] Change format of MulticlassConfusionMatrix result to be more self-explanatory (#48174 ) (#48294 )	2019-10-21 22:07:19 +02:00
Przemysław Witek	1a42e37070	[7.x] Default "prediction_field_name" to (dependent_variable + "_prediction") (#48232 ) (#48279 )	2019-10-21 13:18:08 +02:00
Benjamin Trent	876f4aafac	[ML] Add logistic_regression output aggregator (#48238 ) (#48244 )	2019-10-18 10:08:17 -04:00
rsarawgi	5e4dd0fd2e	[ML] Removing usages of ToXContentParams.INCLUDE_TYPE (#48165 ) Removing the option of ToXContentParams.INCLUDE_TYPE and replacing them with ToXContentParams.FOR_INTERNAL_STORAGE Closes #48057	2019-10-18 14:49:26 +01:00
Armin Braun	04e3316408	Stop Resolving Fallback IndexId (#48141 ) (#48204 ) There is no reason to still resolve the fallback `IndexId` here. It only applies to `2.x` repos and those we can't read anymore anyway because they use an `/index` instead of an `/index-N` blob at the repo root for which at least 7.x+ does not contain the logic to find it.	2019-10-17 19:27:49 +02:00
Przemysław Witek	28f68fa221	Make num_top_classes parameter's default value equal to 2 (#48119 ) (#48201 )	2019-10-17 18:43:15 +02:00
Martijn van Groningen	a5fe69c344	Include enrich into the info api as feature (#48157 ) This commit also fixes a bug, the enrich enabled setting was not included in the list of settings. Backport of #48109	2019-10-17 09:51:32 +02:00
Lee Hinman	5af66d79ef	Add SLM support to xpack usage and info APIs (#48149 ) * Add SLM support to xpack usage and info APIs This is a backport of #48096 This adds the missing xpack usage and info information into the `/_xpack` and `/_xpack/usage` APIs. The output now looks like: ``` GET /_xpack/usage { ... "slm" : { "available" : true, "enabled" : true, "policy_count" : 1, "policy_stats" : { "retention_runs" : 0, ... } } ``` and ``` GET /_xpack { ... "features" : { ... "slm" : { "available" : true, "enabled" : true }, ... } } ``` Relates to #43663 * Fix missing license	2019-10-16 21:06:27 -06:00
Benjamin Trent	0dddbb5b42	[ML] Parse and index inference model (#48016 ) (#48152 ) This adds parsing an inference model as a possible result of the analytics process. When we do parse such a model we persist a `TrainedModelConfig` into the inference index that contains additional metadata derived from the running job.	2019-10-16 15:46:20 -04:00
Przemysław Witek	8f815240b3	[7.x] Allow integer types for classification's dependent variable (#47902 ) (#48080 )	2019-10-16 11:09:56 +02:00
Przemysław Witek	eaa56344b5	Verify that the failure reason of analytics process is empty (#48042 ) (#48071 )	2019-10-15 18:33:20 +02:00
Martijn van Groningen	aff0c9babc	This commits merges (#48040 ) the enrich-7.x feature branch, which is backport merge and adds a new ingest processor, named enrich processor, that allows document being ingested to be enriched with data from other indices. Besides a new enrich processor, this PR adds several APIs to manage an enrich policy. An enrich policy is in charge of making the data from other indices available to the enrich processor in an efficient manner. Related to #32789	2019-10-15 17:31:45 +02:00
Benjamin Trent	361e7ad0ef	[ML][Transforms] fix bwc serialization with 7.3 (#48021 ) (#48048 )	2019-10-15 07:52:13 -04:00
David Roberts	83321b0e5e	[ML] Fix isNoop() for datafeed update (#48046 ) max_empty_searches = -1 in a datafeed update implies max_empty_searches will be unset on the datafeed when the update is applied. The isNoop() method needs to take this -1 to null equivalence into account.	2019-10-15 12:28:53 +01:00
David Roberts	984323783e	[ML][7.x] Add lazy assignment job config option (#47993 ) This change adds: - A new option, allow_lazy_open, to anomaly detection jobs - A new option, allow_lazy_start, to data frame analytics jobs Both work in the same way: they allow a job to be opened/started even if no ML node exists that can accommodate the job immediately. In this situation the job waits in the opening/starting state until ML node capacity is available. (The starting state for data frame analytics jobs is new in this change.) Additionally, the ML nightly maintenance tasks now creates audit warnings for ML jobs that are unassigned. This means that jobs that cannot be assigned to an ML node for a very long time will show a yellow warning triangle in the UI. A final change is that it is now possible to close a job that is not assigned to a node without using force. This is because previously jobs that were open but not assigned to a node were an aberration, whereas after this change they'll be relatively common.	2019-10-15 06:55:11 +01:00
Martijn van Groningen	cc4b6c43b3	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-15 07:23:47 +02:00
James Baiera	18d7e32b7d	Add wait for completion for Enrich policy execution (#47886 ) This PR adds the ability to run the enrich policy execution task in the background, returning a task id instead of waiting for the completed operation.	2019-10-14 16:05:28 -04:00
Gordon Brown	699d4d4c6f	Manage retention of partial snapshots in SLM (#47833 ) Currently, partial snapshots will eventually build up unless they are manually deleted. Partial snapshots may be useful if there is not a more recent successful snapshot, but should eventually be deleted if they are no longer useful. With this change, partial snapshots are deleted using the following strategy: PARTIAL snapshots will be kept until the configured expire_after period has passed, if present, and then be deleted. If there is no configured expire_after in the retention policy, then they will be deleted if there is at least one more recent successful snapshot from this policy (as they may otherwise be useful for troubleshooting purposes). Partial snapshots are not counted towards either min_count or max_count.	2019-10-14 10:19:57 -06:00
David Roberts	1ca25bed38	[ML][7.x] Add option to stop datafeed that finds no data (#47995 ) Adds a new datafeed config option, max_empty_searches, that tells a datafeed that has never found any data to stop itself and close its associated job after a certain number of real-time searches have returned no data. Backport of #47922	2019-10-14 17:19:13 +01:00
Martijn van Groningen	d4901a71d7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-14 10:27:17 +02:00
Ioannis Kakavas	9ee7b3743e	Add FIPS 140 mode to XPack Usage API (#47278 ) (#47976 ) This change adds support for the FIPS 140 mode feature to be retrieved via the XPack Usage API.	2019-10-14 10:40:24 +03:00
Tanguy Leroux	742fa818b8	Add Pause/Resume Auto Follower APIs (#47510 ) (#47904 ) This commit adds two APIs that allow to pause and resume CCR auto-follower patterns: // pause auto-follower POST /_ccr/auto_follow/my_pattern/pause // resume auto-follower POST /_ccr/auto_follow/my_pattern/resume The ability to pause and resume auto-follow patterns can be useful in some situations, including the rolling upgrades of cluster using a bi-directional cross-cluster replication scheme (see #46665). This commit adds a new active flag to the AutoFollowPattern and adapts the AutoCoordinator and AutoFollower classes so that it stops to fetch remote's cluster state when all auto-follow patterns associate to the remote cluster are paused. When an auto-follower is paused, remote indices that match the pattern are just ignored: they are not added to the pattern's followed indices uids list that is maintained in the local cluster state. This way, when the auto-follow pattern is resumed the indices created in the remote cluster in the meantime will be picked up again and added as new following indices. Indices created and then deleted in the remote cluster will be ignored as they won't be seen at all by the auto-follower pattern at resume time. Backport of #47510 for 7.x	2019-10-13 09:22:51 +02:00
Yogesh Gaikwad	ac209c142c	Remove uniqueness constraint for API key name and make it optional (#47549 ) (#47959 ) Since we cannot guarantee the uniqueness of the API key `name` this commit removes the constraint and makes this field optional. Closes #46646	2019-10-12 22:22:16 +11:00
Przemyslaw Gomulka	6ab58de7ef	[7.x] Enable ResolverStyle.STRICT for java formatters backport(#46675 ) (#47913 ) Joda was using ResolverStyle.STRICT when parsing. This means that date will be validated to be a correct year, year-of-month, day-of-month However, we also want to make it works with Year-Of-Era as Joda used to, hence custom temporalquery.localdate in DateFormatters.from Within DateFormatters we use the correct uuuu year instead of yyyy year of era worth noting: if yyyy(without an era) is used in code, the parsing result will be a TemporalAccessor which will fail to be converted into LocalDate. We mostly use DateFormatters.from so this takes care of this. If possible the uuuu format should be used.	2019-10-11 21:19:56 +02:00
Chris Roberson	c57191b163	[Monitoring] Add new cluster privilege now necessary for the stack monitoring ui (#47871 ) (#47915 ) * Add new cluster privilege now necessary for the stack monitoring ui * PR feedback, and add test	2019-10-11 14:54:59 -04:00
James Baiera	73263c654a	Add basic task support for executing enrich policies (#47523 ) Changes the execution logic to create a new task using the execute request, and attaches the new task to the policy runner to be updated. Also, a new response is now returned from the execute api, which contains either the task id of the execution, or the completed status of the run. The fields are mutually exclusive to make it easier to discern what type of response it is.	2019-10-11 13:32:06 -04:00
Przemysław Witek	c62fe8c344	Require that the dependent variable column has at most 2 distinct values in classfication analysis. (#47858 ) (#47906 )	2019-10-11 14:57:08 +02:00
Hendrik Muhs	3da91d5f7a	[Transform] Rename internal indexes for transform plugin (#47788 ) (#47900 ) rename internal indexes of transform plugin - rename audit index and create an alias for accessing it, BWC: add an alias for old indexes to keep them working, kibana UI will switch to use the read alias - rename config index and provide BWC to read from old and new ones	2019-10-11 14:16:17 +02:00
Martijn van Groningen	102016d571	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-10 14:44:05 +02:00
Hendrik Muhs	0e7869128a	[7.5][Transform] introduce new roles and deprecate old ones (#47780 ) (#47819 ) deprecate data_frame_transforms_{user,admin} roles and introduce transform_{user,admin} roles as replacement	2019-10-10 10:31:24 +02:00
Martijn van Groningen	aace42d38d	Add HLRC support for enrich stats API (#47306 ) This PR also includes HLRC docs for the enrich stats api. Relates to #32789	2019-10-10 09:08:29 +02:00
Martijn van Groningen	da1e2ea461	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-09 09:06:13 +02:00
Lee Hinman	fb7abe9fa4	Separate SLM stop/start/status API from ILM (#47710 ) * Separate SLM stop/start/status API from ILM This separates a start/stop/status API for SLM from being tied to ILM's operation mode. These APIs look like: ``` POST /_slm/stop POST /_slm/start GET /_slm/status ``` This allows administrators to have fine-grained control over preventing periodic snapshots and deletions while performing cluster maintenance. Relates to #43663 * Allow going from RUNNING to STOPPED * Align with the OperationMode rules * Fix slmStopping method * Make OperationModeUpdateTask constructor private * Wipe snapshots better in test	2019-10-08 17:21:38 -06:00
Gordon Brown	a492864a9d	Manage retention of failed snapshots in SLM (#47617 ) Failed snapshots will eventually build up unless they are deleted. While failures may not take up much space, they add noise to the list of snapshots and it's desirable to remove them when they are no longer useful. With this change, failed snapshots are deleted using the following strategy: `FAILED` snapshots will be kept until the configured `expire_after` period has passed, if present, and then be deleted. If there is no configured `expire_after` in the retention policy, then they will be deleted if there is at least one more recent successful snapshot from this policy (as they may otherwise be useful for troubleshooting purposes). Failed snapshots are not counted towards either `min_count` or `max_count`.	2019-10-08 17:07:08 -06:00
Dimitris Athanasiou	c1b0bfd74a	[7.x][ML] Unwrap exception causes before calling instanceof (#47676 ) (#47724 ) When exceptions could be returned from another node, the exception might be wrapped in a `RemoteTransportException`. In places where we handled specific exceptions using `instanceof` we ought to unwrap the cause first. This commit attempts to fix this issue after searching code in the ML plugin. Backport of #47676	2019-10-08 16:02:47 +03:00
Benjamin Trent	d33dbf82d4	[7.x] [ML][Inference] adjusting definition object schema and validation (#47447 ) (#47673 ) * [ML][Inference] adjusting definition object schema and validation (#47447) * [ML][Inference] adjusting definition object schema and validation * finalizing schema and fixing inference npe * addressing PR comments * fixing for backport	2019-10-08 07:11:05 -04:00
Hendrik Muhs	5e0e54f455	[Transform] move root endpoint to _transform with BWC layer (#47127 ) (#47682 ) move the main endpoint to /_transform/ from /_data_frame/transforms/ with providing backwards compatibility and deprecation warnings	2019-10-08 08:59:01 +02:00
Tal Levy	a17f394e27	Geo-Match Enrich Processor (#47243 ) (#47701 ) this commit introduces a geo-match enrich processor that looks up a specific `geo_point` field in the enrich-index for all entries that have a geo_shape match field that meets some specific relation criteria with the input field. For example, the enrich index may contain documents with zipcodes and their respective geo_shape. Ingesting documents with a geo_point field can be enriched with which zipcode they associate according to which shape they are contained within. this commit also refactors some of the MatchProcessor by moving a lot of the shared code to AbstractEnrichProcessor. Closes #42639.	2019-10-07 15:03:46 -07:00
Dimitris Athanasiou	7667ea5f6f	[7.x][ML] Additional outlier detection parameters (#47600 ) (#47669 ) Adds the following parameters to `outlier_detection`: - `compute_feature_influence` (boolean): whether to compute or not feature influence scores - `outlier_fraction` (double): the proportion of the data set assumed to be outlying prior to running outlier detection - `standardization_enabled` (boolean): whether to apply standardization to the feature values Backport of #47600	2019-10-07 18:21:33 +03:00
Yogesh Gaikwad	b6d1d2e6ec	Add 'create_doc' index privilege (#45806 ) (#47645 ) Use case: User with `create_doc` index privilege will be allowed to only index new documents either via Index API or Bulk API. There are two cases that we need to think: - User indexing a new document without specifying an Id. For this ES auto generates an Id and now ES version 7.5.0 onwards defaults to `op_type` `create` we just need to authorize on the `op_type`. - User indexing a new document with an Id. This is problematic as we do not know whether a document with Id exists or not. If the `op_type` is `create` then we can assume the user is trying to add a document, if it exists it is going to throw an error from the index engine. Given these both cases, we can safely authorize based on the `op_type` value. If the value is `create` then the user with `create_doc` privilege is authorized to index new documents. In the `AuthorizationService` when authorizing a bulk request, we check the implied action. This code changes that to append the `:op_type/index` or `:op_type/create` to indicate the implied index action.	2019-10-07 23:58:44 +11:00
Yogesh Gaikwad	7c862fe71f	Add support to retrieve all API keys if user has privilege (#47274 ) (#47641 ) This commit adds support to retrieve all API keys if the authenticated user is authorized to do so. This removes the restriction of specifying one of the parameters (like id, name, username and/or realm name) when the `owner` is set to `false`. Closes #46887	2019-10-07 23:58:21 +11:00
Martijn van Groningen	f2f2304c75	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-07 10:07:56 +02:00
Andrei Dan	4506b37ed5	ILM: Skip rolling indexes that are already rolled (#47324 ) (#47592 ) An index with an ILM policy that has a rollover action in one of the phases was rolled over when the ILM conditions dictated regardless if it was already rolled over (eg. manually after modifying an index template in order to force the creation of a new index that uses the new mappings). This changes this behaviour and has ILM check if the index it's about to roll has not been rolled over in the meantime. (cherry picked from commit 37d6106feeb9f9369519117c88a9e7e30f3ac797) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-10-07 07:47:47 +01:00
Lee Hinman	79376b7219	Set default SLM retention invocation time (#47604 ) This adds a default for the `slm.retention_schedule` setting, setting it to `0 30 1 * * ?` which is 1:30am every day. Having retention unset meant that it would never be invoked and clean up snapshots. We determined it would be better to have a default than never to be run. When coming to a decision, we weighed the option of an absolute time (such as 1:30am) versus a periodic invocation (like every 12 hours). In the end we decided on the absolute time because it has better predictability and consistency than a periodic invocation, which would rely on when the master node were elected or restarted. Relates to #43663	2019-10-04 15:00:20 -06:00
Przemysław Witek	ee952da2e2	[7.x] Implement evaluation API for multiclass classification problem (#47126 ) (#47343 )	2019-10-04 17:54:51 +02:00
Przemysław Witek	8c180a77f0	[7.x] Fix serialization of evaluation response. (#47557 ) (#47566 )	2019-10-04 15:12:18 +02:00
Przemysław Witek	ec9b77deaa	[7.x] Implement new analysis type: classification (#46537 ) (#47559 )	2019-10-04 13:47:19 +02:00
David Roberts	31a5e1c7ee	[ML] More accurate job memory overhead (#47516 ) When an ML job runs the memory required can be broken down into: 1. Memory required to load the executable code 2. Instrumented model memory 3. Other memory used by the job's main process or ancilliary processes that is not instrumented Previously we added a simple fixed overhead to account for 1 and 3. This was 100MB for anomaly detection jobs (large because of the completely uninstrumented categorization function and normalize process), and 20MB for data frame analytics jobs. However, this was an oversimplification because the executable code only needs to be loaded once per machine. Also the 100MB overhead for anomaly detection jobs was probably too high in most cases because categorization and normalization don't use _that_ much memory. This PR therefore changes the calculation of memory requirements as follows: 1. A per-node overhead of 30MB for _only_ the first job of any type to be run on a given node - this is to account for loading the executable code 2. The established model memory (if applicable) or model memory limit of the job 3. A per-job overhead of 10MB for anomaly detection jobs and 5MB for data frame analytics jobs, to account for the uninstrumented memory usage This change will enable more jobs to be run on the same node. It will be particularly beneficial when there are a large number of small jobs. It will have less of an effect when there are a small number of large jobs.	2019-10-04 09:57:31 +01:00
Alpar Torok	0a14bb174f	Remove eclipse conditionals (#44075 ) * Remove eclipse conditionals We used to have some meta projects with a `-test` prefix because historically eclipse could not distinguish between test and main source-sets and could only use a single classpath. This is no longer the case for the past few Eclipse versions. This PR adds the necessary configuration to correctly categorize source folders and libraries. With this change eclipse can import projects, and the visibility rules are correct e.x. auto compete doesn't offer classes from test code or `testCompile` dependencies when editing classes in `main`. Unfortunately the cyclic dependency detection in Eclipse doesn't seem to take the difference between test and non test source sets into account, but since we are checking this in Gradle anyhow, it's safe to set to `warning` in the settings. Unfortunately there is no setting to ignore it. This might cause problems when building since Eclipse will probably not know the right order to build things in so more wirk might be necesarry.	2019-10-03 11:55:00 +03:00
Lee Hinman	2e3eb4b24e	Add API to execute SLM retention on-demand (#47405 ) (#47463 ) * Add API to execute SLM retention on-demand (#47405) This is a backport of #47405 This commit adds the `/_slm/_execute_retention` API endpoint. This endpoint kicks off SLM retention and then returns immediately. This in particular allows us to run retention without scheduling it (for entirely manual invocation) or perform a one-off cleanup. This commit also includes HLRC for the new API, and fixes an issue in SLMSnapshotBlockingIntegTests where retention invoked prior to the test completing could resurrect an index the internal test cluster cleanup had already deleted. Resolves #46508 Relates to #43663	2019-10-02 12:29:04 -06:00
Lee Hinman	013d87d716	Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAlloc… (#47313 ) * Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAllocated These tests were using randomly generated includes/excludes/requires for routing, however, it was possible to generate mutually exclusive allocation settings (about 1 out of 50,000 times for my runs). This splits the test into three different tests, and removes the randomization (it doesn't add anything to the testing here) to fix the issue. Resolves #47142	2019-10-02 10:01:23 -06:00
Benjamin Trent	2228a7dd8d	[ML][Inference] adding ensemble model objects (#47241 ) (#47438 ) * [ML][Inference] adding ensemble model objects * addressing PR comments * Update TreeTests.java * addressing PR comments * fixing test	2019-10-02 09:49:46 -04:00
Dimitris Athanasiou	b9541eb3af	[7.x][ML] Make PUT data frame analytics action a master node action (… (#47433 ) While it seemed like the PUT data frame analytics action did not have to be a master node action as the config is stored in an index rather than the cluster state, there are other subtle nuances which make it worthwhile to convert it. In particular, it helps maintain order of execution for put actions which are anyhow user driven and are expected to have low volume. This commit converts `TransportPutDataFrameAnalyticsAction` from a handled transport action to a master node action. Note this means that the action might fail in a mixed cluster but as the API is still experimental and not widely used there will be few moments more suitable to make this change than now.	2019-10-02 16:24:21 +03:00
Yannick Welsch	7b2613db55	Allow optype CREATE for append-only indexing operations (#47169 ) Bulk requests currently do not allow adding "create" actions with auto-generated IDs. This commit allows using the optype CREATE for append-only indexing operations. This is mainly the user facing aspect of it.	2019-10-02 14:16:52 +02:00
David Roberts	4379a3c52b	[ML] Throttle the delete-by-query of expired results (#47177 ) Due to #47003 many clusters will have built up a large backlog of expired results. On upgrading to a version where that bug is fixed users could find that the first ML daily maintenance task deletes a very large amount of documents. This change introduces throttling to the delete-by-query that the ML daily maintenance uses to delete expired results to limit it to deleting an average 200 documents per second. (There is no throttling for state/forecast documents as these are expected to be lower volume.) Additionally a rough time limit of 8 hours is applied to the whole delete expired data action. (This is only rough as it won't stop part way through a single operation - it only checks the timeout between operations.) Relates #47103	2019-10-02 11:16:34 +01:00
Dimitris Athanasiou	36884a3c32	[7.x][ML] Restore analytics state if available (#47128 ) (#47393 ) This commit restores the model state if available in data frame analytics jobs. In addition, this changes the start API so that a stopped job can be restarted. As we now store the progress in the state index when the task is stopped, we can use it to determine what state the job was in when it got stopped. Note that in order to be able to distinguish between a job that runs for the first time and another that is restarting, we ensure reindexing progress is reported to be at least 1 for a running task.	2019-10-02 10:24:05 +03:00
Benjamin Trent	f5fe5e7cd6	[7.x] [ML][Inference] Adding preprocessors to definition object (#47320 ) (#47370 ) * [ML][Inference] Adding preprocessors to definition object (#47320) * [ML][Inference] Adding preprocessors to definition object * Update TrainedModelConfig.java * adjusting for backport	2019-10-01 13:31:25 -04:00
Albert Zaharovits	78558a7b2f	Fix AD realm additional metadata (#47179 ) Due to a regression bug the metadata Active Directory realm setting is ignored (it works correctly for the LDAP realm type). This commit redresses it. Closes #45848	2019-10-01 17:05:25 +03:00
Benjamin Trent	4335e07716	[7.x] [ML][Inference] adding .ml-inference* index and storage (#47267 ) (#47310 ) * [ML][Inference] adding .ml-inference* index and storage (#47267) * [ML][Inference] adding .ml-inference* index and storage * Addressing PR comments * Allowing null definition, adding validation tests for model config * fixing line length * adjusting for backport	2019-10-01 08:20:33 -04:00
Armin Braun	3d23cb44a3	Speed up Snapshot Finalization (#47283 ) (#47309 ) As a result of #45689 snapshot finalization started to take significantly longer than before. This may be a little unfortunate since it increases the likelihood of failing to finalize after having written out all the segment blobs. This change parallelizes all the metadata writes that can safely run in parallel in the finalization step to speed the finalization step up again. Also, this will generally speed up the snapshot process overall in case of large number of indices. This is also a nice to have for #46250 since we add yet another step (deleting of old index- blobs in the shards to the finalization.	2019-09-30 23:28:59 +02:00
Martijn van Groningen	fe937ea4b8	Add config namespace in get policy api response (#47162 ) Currently the policy config is placed directly in the json object of the toplevel `policies` array field. For example: ``` { "policies": [ { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } ] } ``` This change adds a `config` field in each policy json object: ``` { "policies": [ { "config": { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } } ] } ``` This allows us in the future to add other information about policies in the get policy api response. The UI will consume this API to build an overview of all policies. The UI may in the future include additional information about a policy and the plan is to include that in the get policy api, so that this information can be gathered in a single api call. An example of the information that is likely to be added is: * Last policy execution time * The status of a policy (executing, executed, unexecuted) * Information about the last failure if exists	2019-09-30 14:37:23 +02:00
Jason Tedor	2cba323b4e	Remove use of get raw in token/API key settings (#47260 ) These settings were using get raw to fallback to whether or not SSL is enabled. Yet, we have a formal mechanism for falling back to a setting. This commit cuts over to that formal mechanism.	2019-09-30 06:35:58 -04:00
Martijn van Groningen	66f72bcdbc	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-30 08:12:28 +02:00
Rory Hunter	53a4d2176f	Convert most awaitBusy calls to assertBusy (#45794 ) (#47112 ) Backport of #45794 to 7.x. Convert most `awaitBusy` calls to `assertBusy`, and use asserts where possible. Follows on from #28548 by @liketic. There were a small number of places where it didn't make sense to me to call `assertBusy`, so I kept the existing calls but renamed the method to `waitUntil`. This was partly to better reflect its usage, and partly so that anyone trying to add a new call to awaitBusy wouldn't be able to find it. I also didn't change the usage in `TransportStopRollupAction` as the comments state that the local awaitBusy method is a temporary copy-and-paste. Other changes: * Rework `waitForDocs` to scale its timeout. Instead of calling `assertBusy` in a loop, work out a reasonable overall timeout and await just once. * Some tests failed after switching to `assertBusy` and had to be fixed. * Correct the expect templates in AbstractUpgradeTestCase. The ES Security team confirmed that they don't use templates any more, so remove this from the expected templates. Also rewrite how the setup code checks for templates, in order to give more information. * Remove an expected ML template from XPackRestTestConstants The ML team advised that the ML tests shouldn't be waiting for any `.ml-notifications` templates, since such checks should happen in the production code instead. Also rework the template checking code in `XPackRestTestHelper` to give more helpful failure messages. * Fix issue in `DataFrameSurvivesUpgradeIT` when upgrading from < 7.4	2019-09-29 12:21:46 +01:00
Martijn van Groningen	7ffe2e7e63	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-27 14:42:11 +02:00
Andrei Dan	4c909438dd	Fix OriginationDate parsing tests. (#47170 ) (#47200 ) Drop the usage of `SimpleDateFormat` and use the `DateFormatter` instead (cherry picked from commit 7cf509a7a11ecf6c40c44c18e8f03b8e81fcd1c2) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-09-27 13:16:45 +01:00
Przemysław Witek	3fbd58d156	[7.x] Allow evaluation to consist of multiple steps. (#46653 ) (#47194 )	2019-09-27 13:01:51 +02:00
Martijn van Groningen	8a4eefdd83	Expose enrich stats api to monitoring. (#46708 ) This change also slightly modifies the stats response, so that is can easier consumer by monitoring and other users. (coordinators stats are now in a list instead of a map and has an additional field for the node id) Relates to #32789	2019-09-26 11:04:33 +02:00
Yogesh Gaikwad	9a64b7a888	[Backport] Validate `query` field when creating roles (#46275 ) (#47094 ) In the current implementation, the validation of the role query occurs at runtime when the query is being executed. This commit adds validation for the role query when creating a role but not for the template query as we do not have the runtime information required for evaluating the template query (eg. authenticated user's information). This is similar to the scripts that we store but do not evaluate or parse if they are valid queries or not. For validation, the query is evaluated (if not a template), parsed to build the QueryBuilder and verify if the query type is allowed. Closes #34252	2019-09-26 17:57:36 +10:00
Benjamin Trent	fcddaa90de	[7.x] [ML][Inference] adding tree model (#47044 ) (#47141 ) * [ML][Inference] adding tree model (#47044) * [ML][Inference] adding tree model * renaming features for updated schema * fixing 7.x compilation	2019-09-25 19:11:15 -04:00
Andrei Dan	27520cac3b	ILM: parse origination date from index name (#46755 ) (#47124 ) * ILM: parse origination date from index name (#46755) Introduce the `index.lifecycle.parse_origination_date` setting that indicates if the origination date should be parsed from the index name. If set to true an index which doesn't match the expected format (namely `indexName-{dateFormat}-optional_digits` will fail before being created. The origination date will be parsed when initialising a lifecycle for an index and it will be set as the `index.lifecycle.origination_date` for that index. A user set value for `index.lifecycle.origination_date` will always override a possible parsable date from the index name. (cherry picked from commit c363d27f0210733dad0c307d54fa224a92ddb569) Signed-off-by: Andrei Dan <andrei.dan@elastic.co> * Drop usage of Map.of to be java 8 compliant	2019-09-25 21:44:16 +01:00
Lee Hinman	a267df30fa	Wait for snapshot completion in SLM snapshot invocation (#47051 ) * Wait for snapshot completion in SLM snapshot invocation This changes the snapshots internally invoked by SLM to wait for completion. This allows us to capture more snapshotting failure scenarios. For example, previously a snapshot would be created and then registered as a "success", however, the snapshot may have been aborted, or it may have had a subset of its shards fail. These cases are now handled by inspecting the response to the `CreateSnapshotRequest` and ensuring that there are no failures. If any failures are present, the history store now stores the action as a failure instead of a success. Relates to #38461 and #43663	2019-09-25 14:25:22 -06:00
Gordon Brown	a46eef9634	Change SLM stats format (#46991 ) Using arrays of objects with embedded IDs is preferred for new APIs over using entity IDs as JSON keys. This commit changes the SLM stats API to use the preferred format.	2019-09-25 11:32:08 -06:00
Benjamin Trent	05fb7be571	[7.x] [ML][Inference] Feature pre-processing objects and functions (#46777 ) (#47040 ) * [ML][Inference] Feature pre-processing objects and functions (#46777) To support inference on pre-trained machine learning models, some basic feature encoding will be necessary. I am using a named object serialization approach so new encodings/pre-processing steps could be added in the future. This PR lays down the ground work for 3 basic encodings: * HotOne * Target Mean * Frequency More feature encodings or pre-processings could be added in the future: * Handling missing columns * Standardization * Label encoding * etc.... * fixing compilation for namedxcontent tests	2019-09-25 08:16:24 -04:00
Ioannis Kakavas	23bceaadf8	Handle RelayState in preparing a SAMLAuthN Request (#46534 ) (#47092 ) This change allows for the caller of the `saml/prepare` API to pass a `relay_state` parameter that will then be part of the redirect URL in the response as the `RelayState` query parameter. The SAML IdP is required to reflect back the value of that relay state when sending a SAML Response. The caller of the APIs can then, when receiving the SAML Response, read and consume the value as it see fit.	2019-09-25 13:23:46 +03:00
Yogesh Gaikwad	6f453aa6b2	Validate index and cluster privilege names when creating a role (#46361 ) (#47063 ) This commit adds validation so a role cannot be created with invalid index or cluster privilege name. Closes #29703	2019-09-25 18:57:11 +10:00
Hendrik Muhs	7377ac4637	[Transform] Replace transforms with transform, index constants (#47023 ) - replace "transforms" with "transform" for consistency - use constants for internal index naming wherever possible and document required changes	2019-09-25 08:31:43 +02:00
James Baiera	a349b22273	Add the cluster version to enrich policies (#45021 ) Adds the Elasticsearch version as a field on the EnrichPolicy object	2019-09-23 18:44:45 -04:00
Julie Tibshirani	9124c94a6c	Add support for aliases in queries on _index. (#46944 ) Previously, queries on the _index field were not able to specify index aliases. This was a regression in functionality compared to the 'indices' query that was deprecated and removed in 6.0. Now queries on _index can specify an alias, which is resolved to the concrete index names when we check whether an index matches. To match a remote shard target, the pattern needs to be of the form 'cluster:index' to match the fully-qualified index name. Index aliases can be specified in the following query types: term, terms, prefix, and wildcard.	2019-09-23 13:21:37 -07:00
Martijn van Groningen	0cfddca61d	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-23 09:46:05 +02:00
Michael Basnight	f1c7ed647b	Allow comma separated ids in get enrich policy API (#46351 ) This commit changes the GET REST api so it will accept an optional comma separated list of enrich policy ids. This change also modifies the behavior of the GET API in that it will not error if it is passed a bad enrich id anymore, but will instead just return an empty list.	2019-09-20 10:06:58 -05:00
Hendrik Muhs	4a2cb05162	add message about transform disabled if license is missing (#46901 ) adds a message for transform about what happens if no license has been activated	2019-09-20 13:47:40 +02:00
Hendrik Muhs	abe889af75	[7.5][Transform] rename classes in transform plugin (#46867 ) rename classes and settings in transform plugin, provide BWC for old settings	2019-09-20 10:43:00 +02:00
Dimitris Athanasiou	02a5e153dc	[7.x][ML] Parse and index data frame analytics state (#46804 ) (#46820 ) This commit reuses the same state processor that is used for autodetect to parse state output from data frame analytics jobs. We then index the state document into the state index. Backport of #46804	2019-09-18 20:37:40 +03:00
Benjamin Trent	9cf9c64ec2	[7.x] [ML][Transforms] remove `force` flag from _start (#46414 ) (#46748 ) * [ML][Transforms] remove `force` flag from _start (#46414) * [ML][Transforms] remove `force` flag from _start * fixing expected error message * adjusting bwc version	2019-09-18 10:06:05 -04:00
Lee Hinman	b85468d6ea	Add node setting for disabling SLM (#46794 ) (#46796 ) This adds the `xpack.slm.enabled` setting to allow disabling of SLM functionality as well as its HTTP API endpoints. Relates to #38461	2019-09-17 17:39:41 -06:00
Oliver Gupte	cbd58d3b78	Give kibana user privileges to create APM agent config index (#46765 ) (#46792 ) * Give kibana user reserved role privileges on .apm-* to create APM agent configuration index. * fixed test to include checking all .apm-* permissions * changed pattern from ".apm-*" to the more specific ".apm-agent-configuration"	2019-09-17 15:01:42 -07:00
Armin Braun	b0f09b279f	Make Snapshot Logic Write Metadata after Segments (#45689 ) (#46764 ) * Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581 * Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons * Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization * Fixes #41581	2019-09-17 13:09:39 +02:00
Przemysław Witek	e49be611ad	[7.x] Add audit messages for Data Frame Analytics (#46521 ) (#46738 )	2019-09-16 21:21:38 +02:00
Hendrik Muhs	c8f52ec4ff	[Transform] Rename data frame plugin to transform: classes in xpack.core (#46644 ) (#46734 ) rename classes in xpack.core of transform plugin from "data frame transform" to "transform"	2019-09-16 13:39:22 +02:00
Andrei Dan	c57cca98b2	[ILM] Add date setting to calculate index age (#46561 ) (#46697 ) * [ILM] Add date setting to calculate index age Add the `index.lifecycle.origination_date` to allow users to configure a custom date that'll be used to calculate the index age for the phase transmissions (as opposed to the default index creation date). This could be useful for users to create an index with an "older" origination date when indexing old data. Relates to #42449. * [ILM] Don't override creation date on policy init The initial approach we took was to override the lifecycle creation date if the `index.lifecycle.origination_date` setting was set. This had the disadvantage of the user not being able to update the `origination_date` anymore once set. This commit changes the way we makes use of the `index.lifecycle.origination_date` setting by checking its value when we calculate the index age (ie. at "read time") and, in case it's not set, default to the index creation date. * Make origination date setting index scope dynamic * Document orignation date setting in ilm settings (cherry picked from commit d5bd2bb77ee28c1978ab6679f941d7c02e389d32) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-09-16 08:50:28 +01:00
Luca Cavanna	e57756492a	Update http-core and http-client dependencies (#46549 ) Relates to #45808 Closes #45577	2019-09-12 09:45:29 +02:00
James Rodewig	f9bf10f2b6	[DOCS] Change "a SSL" to "an SSL" in the Java docs (#46524 ) (#46618 )	2019-09-11 15:55:57 -04:00
Lee Hinman	09a9cefaa0	Handle partial failure retrieving segments in SegmentCountStep (#46556 ) Since the `IndicesSegmentsRequest` scatters to all shards for the index, it's possible that some of the shards may fail. This adds failure handling and logging (since this is a best-effort step in the first place) for this case.	2019-09-11 10:29:31 -06:00
Jim Ferenczi	23bf310c84	Replace the SearchContext with QueryShardContext when building aggregator factories (#46527 ) This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them. The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`. Relates #46523	2019-09-11 16:43:30 +02:00
Hendrik Muhs	efea581dcc	[7.x][Transform]Rename data frame plugin to transform: plugin and package names (#46583 ) rename data frame transform plugin to transform: - rename plugin data-frame to transform - change all package names from o.e..dataframe. to o.e..transform. - necessary changes to fix loading/testing	2019-09-11 14:50:08 +02:00
Armin Braun	41633cb9b5	More Efficient Ordering of Shard Upload Execution (#42791 ) (#46588 ) * More Efficient Ordering of Shard Upload Execution (#42791) * Change the upload order of of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard * Inspired by #39657 * Cleanup BlobStoreRepository Abort and Failure Handling (#46208)	2019-09-11 13:59:20 +02:00
Martijn van Groningen	a4b0f66919	Add enrich stats api (#46462 ) The enrich api returns enrich coordinator stats and information about currently executing enrich policies. The coordinator stats include per ingest node: * The current number of search requests in the queue. * The total number of outstanding remote requests that have been executed since node startup. Each remote request is likely to include multiple search requests. This depends on how much search requests are in the queue at the time when the remote request is performed. * The number of current outstanding remote requests. * The total number of search requests that `enrich` processors have executed since node startup. The current execution policies stats include: * The name of policy that is executing * A full blow task info object that is executing the policy. Relates to #32789	2019-09-11 13:40:24 +02:00
Jim Ferenczi	425b1a77e8	Add more context to QueryShardContext (#46584 ) This change adds an IndexSearcher and the node's BigArrays in the QueryShardContext. It's a spin off of #46527 as this change is required to allow aggregation builder to solely use the query shard context. Relates #46523	2019-09-11 12:24:51 +02:00
Przemysław Witek	e38e631dac	[7.x] Implement DataFrameAnalyticsAuditMessage and DataFrameAnalyticsAuditor (#45967 ) (#46519 )	2019-09-11 12:17:26 +02:00
Lee Hinman	cdc3a260af	Add retention to Snapshot Lifecycle Management (backport of #4… (#46506 ) * Add retention to Snapshot Lifecycle Management (#46407) This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria. An example policy would look like: ``` PUT /_slm/policy/snapshot-every-day { "schedule": "0 30 2 * * ?", "name": "<production-snap-{now/d}>", "repository": "my-s3-repository", "config": { "indices": ["foo-", "important"] }, // Newly configured retention options "retention": { // Snapshots should be deleted after 14 days "expire_after": "14d", // Keep a maximum of thirty snapshots "max_count": 30, // Keep a minimum of the four most recent snapshots "min_count": 4 } } ``` SLM Retention is run on a scheduled configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour. Included in this work is a new SLM stats API endpoint available through ``` json GET /_slm/stats ``` That returns statistics about snapshot taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API. Add base framework for snapshot retention (#43605) * Add base framework for snapshot retention This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask` to start as the basis for SLM's retention implementation. Relates to #38461 * Remove extraneous 'public' * Use a local var instead of reading class var repeatedly * Add SnapshotRetentionConfiguration for retention configuration (#43777) * Add SnapshotRetentionConfiguration for retention configuration This commit adds the `SnapshotRetentionConfiguration` class and its HLRC counterpart to encapsulate the configuration for SLM retention. Currently only a single parameter is supported as an example (we still need to discuss the different options we want to support and their names) to keep the size of the PR down. It also does not yet include version serialization checks since the original SLM branch has not yet been merged. Relates to #43663 * Fix REST tests * Fix more documentation * Use Objects.equals to avoid NPE * Put `randomSnapshotLifecyclePolicy` in only one place * Occasionally return retention with no configuration * Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764) * Implement SnapshotRetentionTask's snapshot filtering and deletion This commit implements the snapshot filtering and deletion for `SnapshotRetentionTask`. Currently only the expire-after age is used for determining whether a snapshot is eligible for deletion. Relates to #43663 * Fix deletes running on the wrong thread * Handle missing or null policy in snap metadata differently * Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>> * Use the `OriginSettingClient` to work with security, enhance logging * Prevent NPE in test by mocking Client * Allow empty/missing SLM retention configuration (#45018) Semi-related to #44465, this allows the `"retention"` configuration map to be missing. Relates to #43663 * Add min_count and max_count as SLM retention predicates (#44926) This adds the configuration options for `min_count` and `max_count` as well as the logic for determining whether a snapshot meets this criteria to SLM's retention feature. These options are optional and one, two, or all three can be specified in an SLM policy. Relates to #43663 * Time-bound deletion of snapshots in retention delete function (#45065) * Time-bound deletion of snapshots in retention delete function With a cluster that has a large number of snapshots, it's possible that snapshot deletion can take a very long time (especially since deletes currently have to happen in a serial fashion). To prevent snapshot deletion from taking forever in a cluster and blocking other operations, this commit adds a setting to allow configuring a maximum time to spend deletion snapshots during retention. This dynamic setting defaults to 1 hour and is best-effort, meaning that it doesn't hard stop a deletion at an hour mark, but ensures that once the time has passed, all subsequent deletions are deferred until the next retention cycle. Relates to #43663 * Wow snapshots suuuure can take a long time. * Use a LongSupplier instead of actually sleeping * Remove TestLogging annotation * Remove rate limiting * Add SLM metrics gathering and endpoint (#45362) * Add SLM metrics gathering and endpoint This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The stats stored include the number of snapshots taken, failed, deleted, the number of retention runs, as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount of time spent deleting snapshots from SLM retention. This commit also adds an endpoint for retrieving all stats (further commits will expose this in the SLM get-policy API) that looks like: ``` GET /_slm/stats { "retention_runs" : 13, "retention_failed" : 0, "retention_timed_out" : 0, "retention_deletion_time" : "1.4s", "retention_deletion_time_millis" : 1404, "policy_metrics" : { "daily-snapshots2" : { "snapshots_taken" : 7, "snapshots_failed" : 0, "snapshots_deleted" : 6, "snapshot_deletion_failures" : 0 }, "daily-snapshots" : { "snapshots_taken" : 12, "snapshots_failed" : 0, "snapshots_deleted" : 12, "snapshot_deletion_failures" : 6 } }, "total_snapshots_taken" : 19, "total_snapshots_failed" : 0, "total_snapshots_deleted" : 18, "total_snapshot_deletion_failures" : 6 } ``` This does not yet include HLRC for this, as this commit is quite large on its own. That will be added in a subsequent commit. Relates to #43663 * Version qualify serialization * Initialize counters outside constructor * Use computeIfAbsent instead of being too verbose * Move part of XContent generation into subclass * Fix REST action for master merge * Unused import * Record history of SLM retention actions (#45513) This commit records the deletion of snapshots by the retention component of SLM into the SLM history index for the purposes of reviewing operations taken by SLM and alerting. * Retry SLM retention after currently running snapshot completes (#45802) * Retry SLM retention after currently running snapshot completes This commit adds a ClusterStateObserver to wait until the currently running snapshot is complete before proceeding with snapshot deletion. SLM retention waits for the maximum allowed deletion time for the snapshot to complete, however, the waiting time is not factored into the limit on actual deletions. Relates to #43663 * Increase timeout waiting for snapshot completion * Apply patch From `2374316f0d`.patch * Rename test variables * [TEST] Be less strict for stats checking * Skip SLM retention if ILM is STOPPING or STOPPED (#45869) This adds a check to ensure we take no action during SLM retention if ILM is currently stopped or in the process of stopping. Relates to #43663 * Check all actions preventing snapshot delete during retention (#45992) * Check all actions preventing snapshot delete during retention run Previously we only checked to see if a snapshot was currently running, but it turns out that more things can block snapshot deletion. This changes the check to be a check for: - a snapshot currently running - a deletion already in progress - a repo cleanup in progress - a restore currently running This was found by CI where a third party delete in a test caused SLM retention deletion to throw an exception. Relates to #43663 * Add unit test for okayToDeleteSnapshots * Fix bug where SLM retention task would be scheduled on every node * Enhance test logging * Ignore if snapshot is already deleted * Missing import * Fix SnapshotRetentionServiceTests * Expose SLM policy stats in get SLM policy API (#45989) This also adds support for the SLM stats endpoint to the high level rest client. Retrieving a policy now looks like: ```json { "daily-snapshots" : { "version": 1, "modified_date": "2019-04-23T01:30:00.000Z", "modified_date_millis": 1556048137314, "policy" : { "schedule": "0 30 1 * * ?", "name": "<daily-snap-{now/d}>", "repository": "my_repository", "config": { "indices": ["data-", "important"], "ignore_unavailable": false, "include_global_state": false }, "retention": {} }, "stats": { "snapshots_taken": 0, "snapshots_failed": 0, "snapshots_deleted": 0, "snapshot_deletion_failures": 0 }, "next_execution": "2019-04-24T01:30:00.000Z", "next_execution_millis": 1556048160000 } } ``` Relates to #43663 Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356) * Rewrite SnapshotLifecycleIT as as ESIntegTestCase This commit splits `SnapshotLifecycleIT` into two different tests. `SnapshotLifecycleRestIT` which includes the tests that do not require slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an integration test using `MockRepository` to simulate a snapshot being in progress. Relates to #43663 Resolves #46205 * Add error logging when exceptions are thrown * Update serialization versions * Fix type inference * Use non-Cancellable HLRC return value * Fix Client mocking in test * Fix SLMSnapshotBlockingIntegTests for 7.x branch * Update SnapshotRetentionTask for non-multi-repo snapshot retrieval * Add serialization guards for SnapshotLifecyclePolicy	2019-09-10 09:08:09 -06:00
Michael Basnight	9304f5c889	Ensure enrich executes on master node only (#46448 ) The previous transport action was a read action, which under the right set of circumstances can execute on a coordinating node. This commit ensures that cannot happen.	2019-09-10 09:59:36 -05:00
Martijn van Groningen	c057fce978	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-09 08:40:54 +02:00
Benjamin Trent	457ff3e2fb	7.x/ml fix instance serialization bwc (#46404 ) * [ML] Fixing instance serialization version for bwc * fixing CppLogMessage	2019-09-05 13:23:26 -05:00
Martijn van Groningen	ded98e50b7	Change exact match processor to match processor. (#46041 ) Besides a rename, this changes allows to processor to attach multiple enrich docs to the document being ingested. Also in order to control the maximum number of enrich docs to be included in the document being ingested, the `max_matches` setting is added to the enrich processor. Relates #32789	2019-09-04 18:05:12 +02:00
Andrey Ershov	ece9eb4acd	Remove stack trace logging in Security(Transport\|Http)ExceptionHandler (#45966 ) As per #45852 comment we no longer need to log stack-traces in SecurityTransportExceptionHandler and SecurityHttpExceptionHandler even if trace logging is enabled. (cherry picked from commit c99224a32d26db985053b7b36e2049036e438f97)	2019-09-04 11:50:35 +03:00
Benjamin Trent	53df54c703	[ML][Transforms] fixing stop on changes check bug (#46162 ) (#46273 ) * [ML][Transforms] fixing stop on changes check bug * Adding new method finishAndCheckState to cover race conditions in early terminations * changing stopping conditions in `onStart` * allow indexer to finish when exiting early	2019-09-03 11:04:18 -05:00
Lee Hinman	3d4b8e01c7	Validate SLM policy ids strictly (#45998 ) (#46145 ) This uses strict validation for SLM policy ids, similar to what we use for index names. Resolves #45997	2019-09-03 09:20:02 -06:00
David Roberts	ab045744ac	[ML-DataFrame] Fix off-by-one error in checkpoint operations_behind (#46235 ) Fixes a problem where operations_behind would be one less than expected per shard in a new index matched by the data frame transform source pattern. For example, if a data frame transform had a source of foo* and a new index foo-new was created with 2 shards and 7 documents indexed in it then operations_behind would be 5 prior to this change. The problem was that an empty index has a global checkpoint number of -1 and the sequence number of the first document that is indexed into an index is 0, not 1. This doesn't matter for indices included in both the last and next checkpoints, as the off-by-one errors cancelled, but for a new index it affected the observed result.	2019-09-03 12:45:02 +01:00
Martijn van Groningen	555b630160	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-02 09:16:55 +02:00
Benjamin Trent	d0c5573a51	[ML] Throw an error when a datafeed needs CCS but it is not enabled for the node (#46044 ) (#46096 ) Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors and difficult to troubleshoot. This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern.	2019-08-30 09:27:07 -05:00
Dimitris Athanasiou	5921ae53d8	[7.x][ML] Regression dependent variable must be numeric (#46072 ) (#46136 ) * [ML] Regression dependent variable must be numeric This adds a validation that the dependent variable of a regression analysis must be numeric. * Address review comments and fix some problems In addition to addressing the review comments, this commit fixes a few issues I found during testing. In particular: - if there were mappings for required fields but they were not included we were not reporting the error - if explicitly included fields had unsupported types we were not reporting the error Unfortunately, I couldn't get those fixed without refactoring the code in `ExtractedFieldsDetector`.	2019-08-30 09:57:43 +03:00
Zachary Tong	cf8a4171e1	Rename `data-science` plugin to `analytics` (#46133 ) Rename `data-science` plugin to `analytics`. Also removes enabled flag. Backport of #46092	2019-08-29 12:45:39 -04:00
Michael Basnight	51a703da29	Add enrich transport client support (#46002 ) This commit adds an enrich client, as well as a smoke test to validate the client works.	2019-08-29 09:10:07 -05:00
Przemysław Witek	b8a0379057	Refactor auditor-related classes (#45893 ) (#46120 )	2019-08-29 14:21:03 +02:00
Gordon Brown	47bbd9d9a9	[7.x] Fix rollover alias in SLM history index template (#46001 ) This commit adds the `rollover_alias` setting required for ILM to work correctly to the SLM history index template and adds assertions to the SLM integration tests to ensure that it works correctly.	2019-08-28 14:50:22 -07:00
Martijn van Groningen	1157224a6b	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-28 10:14:07 +02:00
Jake Landis	154d1dd962	Watcher max_iterations with foreach action execution (#45715 ) (#46039 ) Prior to this commit the foreach action execution had a hard coded limit to 100 iterations. This commit allows the max number of iterations to be a configuration ('max_iterations') on the foreach action. The default remains 100.	2019-08-27 16:57:20 -05:00
Armin Braun	fdef293c81	Fix RegressionTests#fromXContent (#46029 ) * The `trainingPercent` must be between `1` and `100`, not `0` and `100` which is causing test failures	2019-08-27 18:24:26 +03:00
Dimitris Athanasiou	873ad3f942	[7.x][ML] Add option to regression to randomize training set (#45969 ) (#46017 ) Adds a parameter `training_percent` to regression. The default value is `100`. When the parameter is set to a value less than `100`, from the rows that can be used for training (ie. those that have a value for the dependent variable) we randomly choose whether to actually use for training. This enables splitting the data into a training set and the rest, usually called testing, validation or holdout set, which allows for validating the model on data that have not been used for training. Technically, the analytics process considers as training the data that have a value for the dependent variable. Thus, when we decide a training row is not going to be used for training, we simply clear the row's dependent variable.	2019-08-27 17:53:11 +03:00
Yogesh Gaikwad	7b6246ec67	Add `manage_own_api_key` cluster privilege (#45897 ) (#46023 ) The existing privilege model for API keys with privileges like `manage_api_key`, `manage_security` etc. are too permissive and we would want finer-grained control over the cluster privileges for API keys. Previously APIs created would also need these privileges to get its own information. This commit adds support for `manage_own_api_key` cluster privilege which only allows api key cluster actions on API keys owned by the currently authenticated user. Also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes, documentation for the API changes. Closes #40031	2019-08-28 00:44:23 +10:00
Dimitris Athanasiou	dd6c13fdf9	[ML] Add description to DF analytics (#45774 ) (#46019 )	2019-08-27 15:48:59 +03:00
Albert Zaharovits	1ebee5bf9b	PKI realm authentication delegation (#45906 ) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396	2019-08-27 14:42:46 +03:00
Zachary Tong	943a016bb2	Add Cumulative Cardinality agg (and Data Science plugin) (#45990 ) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations.	2019-08-26 16:19:55 -04:00
Andrey Ershov	479ab9b8db	Fix plaintext on TLS port logging (#45852 ) Today if non-TLS record is received on TLS port generic exception will be logged with the stack-trace. SSLExceptionHelper.isNotSslRecordException method does not work because it's assuming that NonSslRecordException would be top-level. This commit addresses the issue and the log would be more concise. (cherry picked from commit 6b83527bf0c23d4d5b97fab7f290c43432945d4f)	2019-08-26 12:32:35 +02:00
Ioannis Kakavas	2bee27dd54	Allow Transport Actions to indicate authN realm (#45946 ) This commit allows the Transport Actions for the SSO realms to indicate the realm that should be used to authenticate the constructed AuthenticationToken. This is useful in the case that many authentication realms of the same type have been configured and where the caller of the API(Kibana or a custom web app) already know which realm should be used so there is no need to iterate all the realms of the same type. The realm parameter is added in the relevant REST APIs as optional so as not to introduce any breaking change.	2019-08-25 19:36:41 +03:00
Dimitris Athanasiou	be554fe5f0	[7.x][ML] Improve progress reportings for DF analytics (#45856 ) (#45910 ) Previously, the stats API reports a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. Thus, one cannot distinguish between a task that never run to one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one).	2019-08-23 23:04:39 +03:00
Martijn van Groningen	cb42e19a32	Change how type is stored in an enrich policy. (#45789 ) A policy type controls how the enrich index is created and the query executed against the match field. Currently there is a single policy type (`exact_match`). In the near future more policy types will be added and different policy may have different configuration options. For this reason type should be a json object instead of a string field: ``` { "exact_match": { ... } } ``` instead of: ``` { "type": "exact_match", ... } ``` This will make streaming parsing of enrich policies easier as in the new format, the parsing code can know ahead what configuration fields to expect. In the latter format that is not possible if the type field appears not as the first field. Relates to #32789	2019-08-23 13:43:38 +02:00
Martijn van Groningen	837cfa2640	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-23 11:22:27 +02:00
markharwood	217e41ab6c	Search - added HLRC support for PinnedQueryBuilder (#45779 ) (#45853 ) Added HLRC support for PinnedQueryBuilder Related #44074	2019-08-23 09:22:17 +01:00
Tim Vernum	f94e4a9151	Set security index refresh interval to 1s (#45888 ) The security indices were being created without specifying the refresh interval, which means it would inherit a value from any templates that exists. However, certain security functionality depends on being able to wait_for refresh, and causes errors (e.g. in Kibana) if that time exceeds 30s. This commit changes the security indices configuration to always be created with a 1s refresh interval. This prevents any templates from inadvertantly interfering with the proper functioning of security. It is possible for an administrator to explicitly change the refresh interval after the indices have been created. Backport of: #45434	2019-08-23 12:41:37 +10:00
Tim Vernum	029725fc35	Add SSL/TLS settings for watcher email (#45836 ) This change adds a new SSL context xpack.notification.email.ssl.* that supports the standard SSL configuration settings (truststore, verification_mode, etc). This SSL context is used when configuring outbound SMTP properties for watcher email notifications. Backport of: #45272	2019-08-23 10:13:51 +10:00
Benjamin Trent	8e3c54fff7	[7.x] [ML] Adding data frame analytics stats to _usage API (#45820 ) (#45872 ) * [ML] Adding data frame analytics stats to _usage API (#45820) * [ML] Adding data frame analytics stats to _usage API * making the size of analytics stats 10k * adjusting backport	2019-08-22 15:15:41 -05:00
Benjamin Trent	e50a78cf50	[ML-DataFrame] version data frame transform internal index (#45375 ) (#45837 ) Adds index versioning for the internal data frame transform index. Allows for new indices to be created and referenced, `GET` requests now query over the index pattern and takes the latest doc (based on INDEX name).	2019-08-22 11:46:30 -05:00
Przemysław Witek	7512337922	[7.x] Allow the user to specify 'query' in Evaluate Data Frame request (#45775 ) (#45825 )	2019-08-22 11:14:26 +02:00
Gordon Brown	47b1e2b3d0	[7.x] Use rollover for SLM's history indices (#45686 ) Following our own guidelines, SLM should use rollover instead of purely time-based indices to keep shard counts low. This commit implements lazy index creation for SLM's history indices, indexing via an alias, and rollover in the built-in ILM policy.	2019-08-21 13:42:11 -06:00
Martijn van Groningen	2677ac14d2	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-21 14:28:17 +02:00
Przemysław Witek	bf701b83d2	Shorten field names in EstimateMemoryUsageResponse (#45719 ) (#45772 )	2019-08-21 12:45:09 +02:00
Dimitris Athanasiou	d5c3d9b50f	[7.x][ML] Do not skip rows with missing values for regression (#45751 ) (#45754 ) Regression analysis support missing fields. Even more, it is expected that the dependent variable has missing fields to the part of the data frame that is not for training. This commit allows to declare that an analysis supports missing values. For such analysis, rows with missing values are not skipped. Instead, they are written as normal with empty strings used for the missing values. This also contains a fix to the integration test. Closes #45425	2019-08-21 08:15:38 +03:00
Benjamin Trent	ba7b677618	[ML] better handle empty results when evaluating regression (#45745 ) (#45759 ) * [ML] better handle empty results when evaluating regression * adding new failure test to ml_security black list * fixing equality check for regression results	2019-08-20 17:37:04 -05:00
Michael Basnight	e3373d349b	Consolidate enrich list all and get by name APIs (#45705 ) The get and list APIs are a single API in this commit. Whether requesting one named policy or all policies, a list of policies is returened. The list API code has all been removed and the GET api is what remains, which contains much of the list response code.	2019-08-20 10:29:59 -05:00
Przemysław Witek	b37ebd1adf	Prepare the codebase for new Auditor subclasses (#45716 ) (#45731 )	2019-08-20 16:03:50 +02:00
Przemysław Witek	80dd0a0948	Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718 ) (#45725 )	2019-08-20 15:49:17 +02:00
Benjamin Trent	88641a08af	[ML][Data frame] fixing failure state transitions and race condition (#45627 ) (#45656 ) * [ML][Data frame] fixing failure state transitions and race condition (#45627) There is a small window for a race condition while we are flagging a task as failed. Here are the steps where the race condition occurs: 1. A failure occurs 2. Before `AsyncTwoPhaseIndexer` calls the `onFailure` handler it does the following: a. `finishAndSetState()` which sets the IndexerState to STARTED b. `doSaveState(...)` which attempts to save the current state of the indexer 3. Another trigger is fired BEFORE `onFailure` can fire, but AFTER `finishAndSetState()` occurs. The trick here is that we will eventually set the indexer to failed, but possibly not before another trigger had the opportunity to fire. This could obviously cause some weird state interactions. To combat this, I have put in some predicates to verify the state before taking actions. This is so if state is indeed marked failed, the "second trigger" stops ASAP. Additionally, I move the task state checks INTO the `start` and `stop` methods, which will now require a `force` parameter. `start`, `stop`, `trigger` and `markAsFailed` are all `synchronized`. This should gives us some guarantees that one will not switch states out from underneath another. I also flag the task as `failed` BEFORE we successfully write it to cluster state, this is to allow us to make the task fail more quickly. But, this does add the behavior where the task is "failed" but the cluster state does not indicate as much. Adding the checks in `start` and `stop` will handle this "real state vs cluster state" race condition. This has always been a problem for `_stop` as it is not a master node action and doesn’t always have the latest cluster state. closes #45609 Relates to #45562 * [ML][Data Frame] moves failure state transition for MT safety (#45676) * [ML][Data Frame] moves failure state transition for MT safety * removing unused imports	2019-08-20 07:30:17 -05:00
Martijn van Groningen	5ea0985711	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-16 09:47:11 +02:00
Benjamin Trent	fde5dae387	[ML][Data Frame] adjusting change detection workflow (#45511 ) (#45580 ) * [ML][Data Frame] adjusting change detection workflow * adjusting for PR comment * disallowing null as an argument value	2019-08-14 17:26:24 -05:00
Nick Knize	647a8308c3	[SPATIAL] Backport new ShapeFieldMapper and ShapeQueryBuilder to 7x (#45363 ) * Introduce Spatial Plugin (#44389) Introduce a skeleton Spatial plugin that holds new licensed features coming to Geo/Spatial land! * [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923) Refactor DeprecatedParameters specific to legacy geo_shape out of AbstractGeometryFieldMapper.TypeParser#parse. * [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980) Add a new ShapeFieldMapper to the xpack spatial module for indexing arbitrary cartesian geometries using a new field type called shape. The indexing approach leverages lucene's new XYShape field type which is backed by BKD in the same manner as LatLonShape but without the WGS84 latitude longitude restrictions. The new field mapper builds on and extends the refactoring effort in AbstractGeometryFieldMapper and accepts shapes in either GeoJSON or WKT format (both of which support non geospatial geometries). Tests are provided in the ShapeFieldMapperTest class in the same manner as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests. Documentation for how to use the new field type and what parameters are accepted is included. The QueryBuilder for searching indexed shapes is provided in a separate commit. * [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108) Add a new ShapeQueryBuilder to the xpack spatial module for querying arbitrary Cartesian geometries indexed using the new shape field type. The query builder extends AbstractGeometryQueryBuilder and leverages the ShapeQueryProcessor added in the previous field mapper commit. Tests are provided in ShapeQueryTests in the same manner as GeoShapeQueryTests and docs are updated to explain how the query works.	2019-08-14 16:35:10 -05:00
Benjamin Trent	0c343d8443	[7.x] [ML][Transforms] adjusting stats.progress for cont. transforms (#45361 ) (#45551 ) * [ML][Transforms] adjusting stats.progress for cont. transforms (#45361) * [ML][Transforms] adjusting stats.progress for cont. transforms * addressing PR comments * rename fix * Adjusting bwc serialization versions	2019-08-14 13:08:27 -05:00
Martijn van Groningen	43b8ab607d	Improve naming of enrich policy fields. (#45494 ) Renamed `enrich_key` to `match_field` and renamed `enrich_values` to `enrich_fields`. Relates #32789	2019-08-14 11:45:22 +02:00
Przemysław Witek	df574e5168	[7.x] Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188 ) (#45510 )	2019-08-14 08:26:03 +02:00
Martijn van Groningen	0353eb9291	required changes after merging in upstream branch	2019-08-13 09:17:57 +02:00
Martijn van Groningen	1951cdf1cb	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-13 09:12:31 +02:00
Yogesh Gaikwad	471d940c44	Refactor cluster privileges and cluster permission (#45265 ) (#45442 ) The current implementations make it difficult for adding new privileges (example: a cluster privilege which is more than cluster action-based and not exposed to the security administrator). On the high level, we would like our cluster privilege either: - a named cluster privilege This corresponds to `cluster` field from the role descriptor - or a configurable cluster privilege This corresponds to the `global` field from the role-descriptor and allows a security administrator to configure them. Some of the responsibilities like the merging of action based cluster privileges are now pushed at cluster permission level. How to implement the predicate (using Automaton) is being now enforced by cluster permission. `ClusterPermission` helps in enforcing the cluster level access either by performing checks against cluster action and optionally against a request. It is a collection of one or more permission checks where if any of the checks allow access then the permission allows access to a cluster action. Implementations of cluster privilege must be able to provide information regarding the predicates to the cluster permission so that can be enforced. This is enforced by making implementations of cluster privilege aware of cluster permission builder and provide a way to specify how the permission is to be built for a given privilege. This commit renames `ConditionalClusterPrivilege` to `ConfigurableClusterPrivilege`. `ConfigurableClusterPrivilege` is a renderable cluster privilege exposed as a `global` field in role descriptor. Other than this there is a requirement where we would want to know if a cluster permission is implied by another cluster-permission (`has-privileges`). This is helpful in addressing queries related to privileges for a user. This is not just simply checking of cluster permissions since we do not have access to runtime information (like request object). This refactoring does not try to address those scenarios. Relates #44048	2019-08-13 09:06:18 +10:00
Armin Braun	a9e1402189	Remove Settings from BaseRestRequest Constructor (#45418 ) (#45429 ) * Resolving the todo, cleaning up the unused `settings` parameter * Cleaning up some other minor dead code in affected classes	2019-08-12 05:14:45 +02:00
Benjamin Trent	fac1a6f8e8	[ML][Data Frame] have DataFrameTransformConfigUpdate#apply set Version (#45391 ) (#45400 )	2019-08-09 14:32:49 -05:00
Hendrik Muhs	bf4da6c6ad	[ML-DataFrame] fix starting a batch data frame after stopping at runtime (#45340 ) (#45381 ) fix loading of next checkpoint after data frame transform has been stopped/started within one run closes #45339	2019-08-09 20:30:11 +02:00
Dimitris Athanasiou	27497ff75f	[7.x][ML] Add regression analysis to DF analytics (#45292 ) (#45388 ) This commit adds a first draft of a regression analysis to data frame analytics. There is high probability that the exact syntax might change. This commit adds the new analysis type and its parameters as well as appropriate validation. It also modifies the extractor and the fields detector to be able to handle categorical fields as regression analysis supports them.	2019-08-09 19:31:13 +03:00
David Roberts	14545f8958	[ML-DataFrame] Combine task_state and indexer_state in _stats (#45324 ) This commit replaces task_state and indexer_state in the data frame _stats output with a single top level state that combines the two. It is defined as: - failed if what's currently reported as task_state is failed - stopped if there is no persistent task - Otherwise what's currently reported as indexer_state Backport of #45276	2019-08-08 16:24:26 +01:00
Martijn van Groningen	708f856940	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-08 16:52:45 +02:00
Benjamin Trent	5db9982f71	[7.x] [ML][Data Frame] Add update transform api endpoint (#45154 ) (#45279 ) * [ML][Data Frame] Add update transform api endpoint (#45154) This adds the ability to `_update` stored data frame transforms. All mutable fields are applied when the next checkpoint starts. The exception being `description`. This PR contains all that is necessary for this addition: * HLRC * Docs * Server side	2019-08-07 10:37:35 -05:00
Lee Hinman	c7ec0b8431	Include in-progress snapshot for a policy with get SLM policy… (#45245 ) This commit adds the "in_progress" key to the SLM get policy API, returning a policy that looks like: ```json { "daily-snapshots" : { "version" : 1, "modified_date" : "2019-08-05T18:41:48.778Z", "modified_date_millis" : 1565030508778, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "0 30 1 * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false }, "retention" : { "expire_after" : "10m" } }, "last_success" : { "snapshot_name" : "production-snap-2019.08.05-oxctmnobqye3luim4uejhg", "time_string" : "2019-08-05T18:42:23.257Z", "time" : 1565030543257 }, "next_execution" : "2019-08-06T01:30:00.000Z", "next_execution_millis" : 1565055000000, "in_progress" : { "name" : "production-snap-2019.08.05-oxctmnobqye3luim4uejhg", "uuid" : "t8Idqt6JQxiZrzp0Vt7z6g", "state" : "STARTED", "start_time" : "2019-08-05T18:42:22.998Z", "start_time_millis" : 1565030542998 } } } ``` These are only visible while the snapshot is being taken (or failed), since it reads from the cluster state rather than from the repository itself.	2019-08-07 08:29:49 -06:00
Tom Callahan	a7a419bee8	Change Ldap SDK License to LGPL-2.1 (#45116 ) We currently use the unboundid ldap SDK, which is triply licensed under GPL-2.0, LGPL-2.1, and the "UnboundID LDAP SDK Free Use License". We currently identify the license as the latter, but LGPL-2.1 is the one we should be using per our policy.	2019-08-06 16:48:09 -04:00
Zachary Tong	422aca9a5d	Fix Rollup job creation to work with templates (#43943 ) The PutJob API accidentally used an "expert" API of CreateIndexRequest. That API is semi-lenient to syntax; a type could be omitted and the request would work as expected. But if a type was omitted it would not merge with templates correctly, leading to index creation that only has the template and not the requested mappings in the request. This commit refactors the PutJob API to: - Include the type name - Use a less "expert" API in an attempt to future proof against errors - Uses an XContentBuilder instead of string replacing, removes json template	2019-08-06 10:53:44 -04:00
Christoph Büscher	3366726ad1	Enable reloading of synonym_graph filters (#45135 ) Reloading of synonym_graph filter doesn't work currently because the search time AnalysisMode doesn't get propagated to the TokenFilterFactory emitted by the graph filters getChainAwareTokenFilterFactory() method. This change fixes that. Closes #45127	2019-08-02 15:33:42 +02:00
Tim Vernum	e21d58541a	Improve errors when TLS files cannot be read (#45122 ) This change improves the exception messages that are thrown when the system cannot read TLS resources such as keystores, truststores, certificates, keys or certificate-chains (CAs). This change specifically handles: - Files that do not exist - Files that cannot be read due to file-system permissions - Files that cannot be read due to the ES security-manager Backport of: #44787	2019-08-02 12:29:43 +10:00
Tim Vernum	590777150f	Explicitly fail if a realm only exists in keystore (#45091 ) There are no realms that can be configured exclusively with secure settings. Every realm that supports secure settings also requires one or more non-secure settings. However, sometimes a node will be configured with entries in the keystore for which there is nothing in elasticsearch.yml - this may be because the realm we removed from the yml, but not deleted from the keystore, or it could be because there was a typo in the realm name which has accidentially orphaned the keystore entry. In these cases the realm building would fail, but the error would not always be clear or point to the root cause (orphaned keystore entries). RealmSettings would act as though the realm existed, but then fail because an incorrect combination of settings was provided. This change causes realm building to fail early, with an explicit message about incorrect keystore entries. Backport of: #44471	2019-08-02 12:28:59 +10:00
Martijn van Groningen	aae2f0cff2	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-01 13:38:03 +07:00
Mayya Sharipova	0c68765088	Adds usage stats for vectors (#45023 ) Example of usage: _xpack/usage "vectors": { "available": true, "enabled": true, "dense_vector_fields_count" : 1, "sparse_vector_fields_count" : 1, "dense_vector_dims_avg_count" : 100 } Backport for #44512	2019-07-31 12:32:41 -04:00
Lee Hinman	598c4e72f9	[7.x] Rename indexlifecycle to ilm and snapshotlifecycle to sl… (#44977 ) * Rename indexlifecycle to ilm and snapshotlifecycle to slm (#44917) As a followup to #44725 and #44608, which renamed the packages within the x-pack project, this renames the packages within the core x-pack project. It also renames 'snapshotlifecycle' within the HLRC to slm. * Fix one more import	2019-07-29 15:51:14 -06:00
Benjamin Trent	3b514f0dae	[ML] update Instant serialization (#44765 ) (#44954 ) * [ML] update Instant serialization * addressing PR comments * removing unused import	2019-07-29 13:06:56 -05:00
Martijn van Groningen	db49cb505e	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-07-29 14:45:10 +07:00
Gordon Brown	d4b2d21339	Add option to filter ILM explain response (#44777 ) In order to make it easier to interpret the output of the ILM Explain API, this commit adds two request parameters to that API: - `only_managed`, which causes the response to only contain indices which have `index.lifecycle.name` set - `only_errors`, which causes the response to contain only indices in an ILM error state "Error state" is defined as either being in the `ERROR` step or having `index.lifecycle.name` set to a policy that does not exist.	2019-07-26 11:57:38 -04:00
Przemysław Witek	79121ea127	[7.x] Implement exponential average search time per hour statistics. (#44683 ) (#44897 )	2019-07-26 15:56:34 +02:00
Przemysław Witek	8bb8543fdf	Treat PostDataActionResponse.DataCounts.bucketCount as incremental rather than absolute (total). (#44803 ) (#44856 )	2019-07-25 20:46:56 +02:00
James Baiera	c5528a25e6	Merge branch '7.x' into enrich-7.x	2019-07-25 13:12:56 -04:00
Hendrik Muhs	2ca6306452	do not assert on indexer state (#44854 ) remove the unreliable check for the state change fixes #44813	2019-07-25 16:39:24 +02:00
David Roberts	b2e969f4ba	[ML-DataFrame] Remove ID field from data frame indexer stats (#44848 ) This is a followup to #44350. The indexer stats used to be persisted standalone, but now are only persisted as part of a state-and-stats document. During the review of #44350 it was decided that we'll stick with this design, so there will never be a need for an indexer stats object to store its transform ID as it is stored on the enclosing document. This PR removes the indexer stats document ID. Backport of #44768	2019-07-25 15:19:32 +01:00
Przemysław Witek	53f409e5ae	Add result_type field to TimingStats and DatafeedTimingStats documents (#44812 ) (#44841 )	2019-07-25 10:11:55 +02:00
Gordon Brown	2ac54da60a	Fix swapped variables in error message (#44300 ) The alias name and index were in the incorrect order in this error message. This commit corrects the order.	2019-07-24 11:38:15 -04:00
Przemysław Witek	26da573e94	[ML] [7.x] Only emit deprecation warning if there was actual change of a datafeed's job_id. (#44755 ) * Only emit deprecation warning if there was actual change of a datafeed's job_id. * Add @Deprecated annotation to DatafeedUpdate.Builder#setJobId method	2019-07-24 10:03:25 +02:00
David Roberts	caf9411a72	[ML] Improve response format of data frame stats endpoint (#44743 ) This change adjusts the data frame transforms stats endpoint to return a structure that is easier to understand. This is a breaking change for clients of the data frame transforms stats endpoint, but the feature is in beta so stability is not guaranteed. Backport of #44350	2019-07-23 18:00:50 +01:00
Przemysław Witek	16c8e18013	Deprecate the ability to update datafeed's job_id. (#44691 ) (#44742 )	2019-07-23 14:48:56 +02:00
Jason Tedor	5878bde8dc	Rename SLM package to slm (#44608 ) This commit renames the SLM package from snapshotlifecycle to slm. We have all come to know index lifecycle management as ILM, the APIs and settings use ilm, and it would be nice of the package did too. For SLM, let's use slm for all of these including the package name from the beginning.	2019-07-23 07:35:06 +09:00
Benjamin Trent	4456850a8e	[7.x] [ML][Data Frame] Add optional defer_validation param to PUT (#44455 ) (#44697 ) * [ML][Data Frame] Add optional defer_validation param to PUT (#44455) * [ML][Data Frame] Add optional defer_validation param to PUT * addressing PR comments * reverting bad replace * addressing pr comments * Update put-transform.asciidoc * Update put-transform.asciidoc * Update put-transform.asciidoc * adjusting for backport * fixing imports * [DOCS] Fixes formatting in create data frame transform API	2019-07-22 15:12:55 -05:00
James Baiera	fc20264b99	Add Enrich index background task to cleanup old indices (#43746 ) This PR adds a background maintenance task that is scheduled on the master node only. The deletion of an index is based on if it is not linked to a policy or if the enrich alias is not currently pointing at it. Synchronization has been added to make sure that no policy executions are running at the time of cleanup, and if any executions do occur, the marking process delays cleanup until next run.	2019-07-22 14:41:22 -04:00
Benjamin Trent	06e21f7902	[7.x] [ML][Data Frame] adding force delete (#44590 ) (#44696 ) * [ML][Data Frame] adding force delete (#44590) * [ML][Data Frame] adding force delete * Update delete-transform.asciidoc * adjusting for backport	2019-07-22 13:13:25 -05:00
Hendrik Muhs	4387d81e5b	add a test to check excecution flow (#44481 ) add a test for the execution flow of a2p indexer	2019-07-22 17:07:11 +02:00
Tal Levy	7c84636029	Remove StreamOutput #writeOptionalStreamable and #writeStreamableList (#44602 ) (#44643 ) remove usages of writeOptionalStreamable and writeStreambaleList relates #34389.	2019-07-19 15:55:53 -07:00
Benjamin Trent	2e303fc5f7	[ML][Data Frame] adding dynamic cluster setting for failure retries (#44577 ) (#44639 ) This adds a new dynamic cluster setting `xpack.data_frame.num_transform_failure_retries`. This setting indicates how many times non-critical failures should be retried before a data frame transform is marked as failed and should stop executing. At the time of this commit; Min: 0, Max: 100, Default: 10	2019-07-19 16:17:39 -05:00
Ryan Ernst	f193d14764	Convert remaining Action Response/Request to writeable.reader (#44528 ) (#44607 ) This commit converts readFrom to ctor with StreamInput on the remaining ActionResponse and ActionRequest classes. relates #34389	2019-07-19 13:33:38 -07:00
Christoph Büscher	eafe54c81c	Fix AnalysisMode propagation in NamedAnalyzer (#44626 ) NamedAnalyzer should return the same AnalysisMode than any custom analyzer it wraps, otherwise AnalysisMode.ALL. This used to be only CustomAnalyzer in the past, but with the introduction of the ReloadableCustomAnalyzer this needs to be added as an option where the analysis mode gets propagated. Closes #44625	2019-07-19 18:18:43 +02:00
Ryan Ernst	60785a9fa8	Convert several direct uses of Streamable to Writeable (#44586 ) (#44604 ) This commit converts several utility classes that implement Streamable to have StreamInput constructors. It also adds a default version of readFrom to Streamable so that overriding to throw UOE is not necessary. relates #34389	2019-07-18 21:25:44 -07:00
Ryan Ernst	13f46aa801	Convert index and persistent actions/response to writeable (#44582 ) (#44601 ) This commit converts several more classes from streamable to writeable in server, mostly within the o.e.index and o.e.persistent packages. relates #34389	2019-07-18 18:32:09 -07:00
Tal Levy	03f5084ac7	remove usages of #readOptionalStreamable, #readStreamableList. (#44578 ) (#44598 ) This commit removes references to Streamable from StreamInput. This is all a part of the effort to remove Streamable usage. relates #34389.	2019-07-18 16:19:02 -07:00
Lee Hinman	3001f7941f	Allow empty configuration for SLM policies (#44465 ) * Allow empty configuration for SLM policies When putting or updating a snapshot lifecycle policy it was not possible to elide the `config` map. This commit makes the configuration optional, the same way that it is when taking a snapshot. Relates to #38461 * Add Objects.requireNonNull for required parts of the policy	2019-07-18 16:20:31 -06:00
Lee Hinman	fe2ef66e45	Expose index age in ILM explain output (#44457 ) * Expose index age in ILM explain output This adds the index's age to the ILM explain output, for example: ``` { "indices" : { "ilm-000001" : { "index" : "ilm-000001", "managed" : true, "policy" : "full-lifecycle", "lifecycle_date" : "2019-07-16T19:48:22.294Z", "lifecycle_date_millis" : 1563306502294, "age" : "1.34m", "phase" : "hot", "phase_time" : "2019-07-16T19:48:22.487Z", ... etc ... } } } ``` This age can be used to tell when ILM will transition the index to the next phase, based on that phase's `min_age`. Resolves #38988 * Expose age in getters and in HLRC	2019-07-18 15:33:45 -06:00
Tal Levy	c8a8915b27	migrate rollup/monitoring/graph/watcher actions to Writeable (#44464 ) (#44538 ) this commit migrates leftover actions from a few x-pack plugins to the new Writeable.Reader infrastructure. relates #34389.	2019-07-18 08:42:56 -07:00
Christoph Büscher	e9f257b4d9	Fix ReloadDetailsTests compilation	2019-07-18 11:55:22 +02:00
Christoph Büscher	eed2db8947	Add stream serialization to ReloadAnalyzersResponse (#44420 ) This change adds Writeable support to ReloadAnalyzersResponse that is required when using the transport client in 7.x. Closes #44383	2019-07-18 11:11:54 +02:00
David Kyle	0fc091f166	Enable XLint warnings for ML (#44346 ) Removes the warning suppression -Xlint:-deprecation,-rawtypes,-serial,-try,-unchecked. Many warnings were unchecked warnings in the test code often because of the use of mocks. These are suppressed with @SuppressWarning	2019-07-18 09:33:37 +01:00
Ryan Ernst	edd26339c5	Convert remaining request classes in xpack core to writeable.reader (#44524 ) (#44534 ) This commit converts all remaining classes extending ActionRequest in xpack core to have a StreamInput constructor. relates #34389	2019-07-18 01:11:45 -07:00
Tal Levy	a5ad59451c	migrate more ML actions off of using Request suppliers (#44462 ) (#44529 ) many classes still use the Streamable constructors of HandledTransportAction, this commit moves more of those classes to the new Writeable constructors. relates #34389.	2019-07-17 20:28:29 -07:00
Tal Levy	075a3f0e99	remove usage of ActionType#(String) (#44459 ) (#44526 ) this commit removes usage of the deprecated constructor with a single argument and no Writeable.Reader. The purpose of this is to reduce the boilerplate necessary for properly implementing a new action, as well as reducing the chances of using the incorrect super constructor while classes are being migrated to Writeable relates #34389.	2019-07-17 20:28:11 -07:00
Ryan Ernst	2a2686e6e7	Convert remaining ActionTypes to writeable in xpack core (#44467 ) (#44525 ) This commit converts all remaining ActionType response classes to writeable in xpack core. It also converts a few from server which were used by xpack core. relates #34389	2019-07-17 18:01:45 -07:00
Ryan Ernst	17c4b2b839	Convert MasterNodeRequest to implement Writeable.Reader (#44452 ) (#44513 ) This commit converts all MasterNodeRequest subclasses to fullfill Writeable.Reader constructors. relates #34389	2019-07-17 18:01:29 -07:00
Ryan Ernst	0755a13c9f	Convert AcknowledgedRequest to Writeable.Reader (#44412 ) (#44454 ) This commit adds constructors to AcknolwedgedRequest subclasses to implement Writeable.Reader, and ensures all future subclasses implement the same. relates #34389	2019-07-17 11:17:36 -07:00
Yannick Welsch	d98b3e4760	Move frozen indices to x-pack module (#44490 ) Backport of #44408 and #44286.	2019-07-17 16:53:10 +02:00

... 4 5 6 7 8 ...

1765 Commits