OpenSearch

Commit Graph

Author	SHA1	Message	Date
Lee Hinman	5af66d79ef	Add SLM support to xpack usage and info APIs (#48149 ) * Add SLM support to xpack usage and info APIs This is a backport of #48096 This adds the missing xpack usage and info information into the `/_xpack` and `/_xpack/usage` APIs. The output now looks like: ``` GET /_xpack/usage { ... "slm" : { "available" : true, "enabled" : true, "policy_count" : 1, "policy_stats" : { "retention_runs" : 0, ... } } ``` and ``` GET /_xpack { ... "features" : { ... "slm" : { "available" : true, "enabled" : true }, ... } } ``` Relates to #43663 * Fix missing license	2019-10-16 21:06:27 -06:00
Benjamin Trent	0dddbb5b42	[ML] Parse and index inference model (#48016 ) (#48152 ) This adds parsing an inference model as a possible result of the analytics process. When we do parse such a model we persist a `TrainedModelConfig` into the inference index that contains additional metadata derived from the running job.	2019-10-16 15:46:20 -04:00
Przemysław Witek	8f815240b3	[7.x] Allow integer types for classification's dependent variable (#47902 ) (#48080 )	2019-10-16 11:09:56 +02:00
Przemysław Witek	eaa56344b5	Verify that the failure reason of analytics process is empty (#48042 ) (#48071 )	2019-10-15 18:33:20 +02:00
Martijn van Groningen	aff0c9babc	This commits merges (#48040 ) the enrich-7.x feature branch, which is backport merge and adds a new ingest processor, named enrich processor, that allows document being ingested to be enriched with data from other indices. Besides a new enrich processor, this PR adds several APIs to manage an enrich policy. An enrich policy is in charge of making the data from other indices available to the enrich processor in an efficient manner. Related to #32789	2019-10-15 17:31:45 +02:00
Benjamin Trent	361e7ad0ef	[ML][Transforms] fix bwc serialization with 7.3 (#48021 ) (#48048 )	2019-10-15 07:52:13 -04:00
David Roberts	83321b0e5e	[ML] Fix isNoop() for datafeed update (#48046 ) max_empty_searches = -1 in a datafeed update implies max_empty_searches will be unset on the datafeed when the update is applied. The isNoop() method needs to take this -1 to null equivalence into account.	2019-10-15 12:28:53 +01:00
David Roberts	984323783e	[ML][7.x] Add lazy assignment job config option (#47993 ) This change adds: - A new option, allow_lazy_open, to anomaly detection jobs - A new option, allow_lazy_start, to data frame analytics jobs Both work in the same way: they allow a job to be opened/started even if no ML node exists that can accommodate the job immediately. In this situation the job waits in the opening/starting state until ML node capacity is available. (The starting state for data frame analytics jobs is new in this change.) Additionally, the ML nightly maintenance tasks now creates audit warnings for ML jobs that are unassigned. This means that jobs that cannot be assigned to an ML node for a very long time will show a yellow warning triangle in the UI. A final change is that it is now possible to close a job that is not assigned to a node without using force. This is because previously jobs that were open but not assigned to a node were an aberration, whereas after this change they'll be relatively common.	2019-10-15 06:55:11 +01:00
Martijn van Groningen	cc4b6c43b3	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-15 07:23:47 +02:00
James Baiera	18d7e32b7d	Add wait for completion for Enrich policy execution (#47886 ) This PR adds the ability to run the enrich policy execution task in the background, returning a task id instead of waiting for the completed operation.	2019-10-14 16:05:28 -04:00
Gordon Brown	699d4d4c6f	Manage retention of partial snapshots in SLM (#47833 ) Currently, partial snapshots will eventually build up unless they are manually deleted. Partial snapshots may be useful if there is not a more recent successful snapshot, but should eventually be deleted if they are no longer useful. With this change, partial snapshots are deleted using the following strategy: PARTIAL snapshots will be kept until the configured expire_after period has passed, if present, and then be deleted. If there is no configured expire_after in the retention policy, then they will be deleted if there is at least one more recent successful snapshot from this policy (as they may otherwise be useful for troubleshooting purposes). Partial snapshots are not counted towards either min_count or max_count.	2019-10-14 10:19:57 -06:00
David Roberts	1ca25bed38	[ML][7.x] Add option to stop datafeed that finds no data (#47995 ) Adds a new datafeed config option, max_empty_searches, that tells a datafeed that has never found any data to stop itself and close its associated job after a certain number of real-time searches have returned no data. Backport of #47922	2019-10-14 17:19:13 +01:00
Martijn van Groningen	d4901a71d7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-14 10:27:17 +02:00
Ioannis Kakavas	9ee7b3743e	Add FIPS 140 mode to XPack Usage API (#47278 ) (#47976 ) This change adds support for the FIPS 140 mode feature to be retrieved via the XPack Usage API.	2019-10-14 10:40:24 +03:00
Tanguy Leroux	742fa818b8	Add Pause/Resume Auto Follower APIs (#47510 ) (#47904 ) This commit adds two APIs that allow to pause and resume CCR auto-follower patterns: // pause auto-follower POST /_ccr/auto_follow/my_pattern/pause // resume auto-follower POST /_ccr/auto_follow/my_pattern/resume The ability to pause and resume auto-follow patterns can be useful in some situations, including the rolling upgrades of cluster using a bi-directional cross-cluster replication scheme (see #46665). This commit adds a new active flag to the AutoFollowPattern and adapts the AutoCoordinator and AutoFollower classes so that it stops to fetch remote's cluster state when all auto-follow patterns associate to the remote cluster are paused. When an auto-follower is paused, remote indices that match the pattern are just ignored: they are not added to the pattern's followed indices uids list that is maintained in the local cluster state. This way, when the auto-follow pattern is resumed the indices created in the remote cluster in the meantime will be picked up again and added as new following indices. Indices created and then deleted in the remote cluster will be ignored as they won't be seen at all by the auto-follower pattern at resume time. Backport of #47510 for 7.x	2019-10-13 09:22:51 +02:00
Yogesh Gaikwad	ac209c142c	Remove uniqueness constraint for API key name and make it optional (#47549 ) (#47959 ) Since we cannot guarantee the uniqueness of the API key `name` this commit removes the constraint and makes this field optional. Closes #46646	2019-10-12 22:22:16 +11:00
Przemyslaw Gomulka	6ab58de7ef	[7.x] Enable ResolverStyle.STRICT for java formatters backport(#46675 ) (#47913 ) Joda was using ResolverStyle.STRICT when parsing. This means that date will be validated to be a correct year, year-of-month, day-of-month However, we also want to make it works with Year-Of-Era as Joda used to, hence custom temporalquery.localdate in DateFormatters.from Within DateFormatters we use the correct uuuu year instead of yyyy year of era worth noting: if yyyy(without an era) is used in code, the parsing result will be a TemporalAccessor which will fail to be converted into LocalDate. We mostly use DateFormatters.from so this takes care of this. If possible the uuuu format should be used.	2019-10-11 21:19:56 +02:00
Chris Roberson	c57191b163	[Monitoring] Add new cluster privilege now necessary for the stack monitoring ui (#47871 ) (#47915 ) * Add new cluster privilege now necessary for the stack monitoring ui * PR feedback, and add test	2019-10-11 14:54:59 -04:00
James Baiera	73263c654a	Add basic task support for executing enrich policies (#47523 ) Changes the execution logic to create a new task using the execute request, and attaches the new task to the policy runner to be updated. Also, a new response is now returned from the execute api, which contains either the task id of the execution, or the completed status of the run. The fields are mutually exclusive to make it easier to discern what type of response it is.	2019-10-11 13:32:06 -04:00
Przemysław Witek	c62fe8c344	Require that the dependent variable column has at most 2 distinct values in classfication analysis. (#47858 ) (#47906 )	2019-10-11 14:57:08 +02:00
Hendrik Muhs	3da91d5f7a	[Transform] Rename internal indexes for transform plugin (#47788 ) (#47900 ) rename internal indexes of transform plugin - rename audit index and create an alias for accessing it, BWC: add an alias for old indexes to keep them working, kibana UI will switch to use the read alias - rename config index and provide BWC to read from old and new ones	2019-10-11 14:16:17 +02:00
Martijn van Groningen	102016d571	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-10 14:44:05 +02:00
Hendrik Muhs	0e7869128a	[7.5][Transform] introduce new roles and deprecate old ones (#47780 ) (#47819 ) deprecate data_frame_transforms_{user,admin} roles and introduce transform_{user,admin} roles as replacement	2019-10-10 10:31:24 +02:00
Martijn van Groningen	aace42d38d	Add HLRC support for enrich stats API (#47306 ) This PR also includes HLRC docs for the enrich stats api. Relates to #32789	2019-10-10 09:08:29 +02:00
Martijn van Groningen	da1e2ea461	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-09 09:06:13 +02:00
Lee Hinman	fb7abe9fa4	Separate SLM stop/start/status API from ILM (#47710 ) * Separate SLM stop/start/status API from ILM This separates a start/stop/status API for SLM from being tied to ILM's operation mode. These APIs look like: ``` POST /_slm/stop POST /_slm/start GET /_slm/status ``` This allows administrators to have fine-grained control over preventing periodic snapshots and deletions while performing cluster maintenance. Relates to #43663 * Allow going from RUNNING to STOPPED * Align with the OperationMode rules * Fix slmStopping method * Make OperationModeUpdateTask constructor private * Wipe snapshots better in test	2019-10-08 17:21:38 -06:00
Gordon Brown	a492864a9d	Manage retention of failed snapshots in SLM (#47617 ) Failed snapshots will eventually build up unless they are deleted. While failures may not take up much space, they add noise to the list of snapshots and it's desirable to remove them when they are no longer useful. With this change, failed snapshots are deleted using the following strategy: `FAILED` snapshots will be kept until the configured `expire_after` period has passed, if present, and then be deleted. If there is no configured `expire_after` in the retention policy, then they will be deleted if there is at least one more recent successful snapshot from this policy (as they may otherwise be useful for troubleshooting purposes). Failed snapshots are not counted towards either `min_count` or `max_count`.	2019-10-08 17:07:08 -06:00
Dimitris Athanasiou	c1b0bfd74a	[7.x][ML] Unwrap exception causes before calling instanceof (#47676 ) (#47724 ) When exceptions could be returned from another node, the exception might be wrapped in a `RemoteTransportException`. In places where we handled specific exceptions using `instanceof` we ought to unwrap the cause first. This commit attempts to fix this issue after searching code in the ML plugin. Backport of #47676	2019-10-08 16:02:47 +03:00
Benjamin Trent	d33dbf82d4	[7.x] [ML][Inference] adjusting definition object schema and validation (#47447 ) (#47673 ) * [ML][Inference] adjusting definition object schema and validation (#47447) * [ML][Inference] adjusting definition object schema and validation * finalizing schema and fixing inference npe * addressing PR comments * fixing for backport	2019-10-08 07:11:05 -04:00
Hendrik Muhs	5e0e54f455	[Transform] move root endpoint to _transform with BWC layer (#47127 ) (#47682 ) move the main endpoint to /_transform/ from /_data_frame/transforms/ with providing backwards compatibility and deprecation warnings	2019-10-08 08:59:01 +02:00
Tal Levy	a17f394e27	Geo-Match Enrich Processor (#47243 ) (#47701 ) this commit introduces a geo-match enrich processor that looks up a specific `geo_point` field in the enrich-index for all entries that have a geo_shape match field that meets some specific relation criteria with the input field. For example, the enrich index may contain documents with zipcodes and their respective geo_shape. Ingesting documents with a geo_point field can be enriched with which zipcode they associate according to which shape they are contained within. this commit also refactors some of the MatchProcessor by moving a lot of the shared code to AbstractEnrichProcessor. Closes #42639.	2019-10-07 15:03:46 -07:00
Dimitris Athanasiou	7667ea5f6f	[7.x][ML] Additional outlier detection parameters (#47600 ) (#47669 ) Adds the following parameters to `outlier_detection`: - `compute_feature_influence` (boolean): whether to compute or not feature influence scores - `outlier_fraction` (double): the proportion of the data set assumed to be outlying prior to running outlier detection - `standardization_enabled` (boolean): whether to apply standardization to the feature values Backport of #47600	2019-10-07 18:21:33 +03:00
Yogesh Gaikwad	b6d1d2e6ec	Add 'create_doc' index privilege (#45806 ) (#47645 ) Use case: User with `create_doc` index privilege will be allowed to only index new documents either via Index API or Bulk API. There are two cases that we need to think: - User indexing a new document without specifying an Id. For this ES auto generates an Id and now ES version 7.5.0 onwards defaults to `op_type` `create` we just need to authorize on the `op_type`. - User indexing a new document with an Id. This is problematic as we do not know whether a document with Id exists or not. If the `op_type` is `create` then we can assume the user is trying to add a document, if it exists it is going to throw an error from the index engine. Given these both cases, we can safely authorize based on the `op_type` value. If the value is `create` then the user with `create_doc` privilege is authorized to index new documents. In the `AuthorizationService` when authorizing a bulk request, we check the implied action. This code changes that to append the `:op_type/index` or `:op_type/create` to indicate the implied index action.	2019-10-07 23:58:44 +11:00
Yogesh Gaikwad	7c862fe71f	Add support to retrieve all API keys if user has privilege (#47274 ) (#47641 ) This commit adds support to retrieve all API keys if the authenticated user is authorized to do so. This removes the restriction of specifying one of the parameters (like id, name, username and/or realm name) when the `owner` is set to `false`. Closes #46887	2019-10-07 23:58:21 +11:00
Martijn van Groningen	f2f2304c75	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-07 10:07:56 +02:00
Andrei Dan	4506b37ed5	ILM: Skip rolling indexes that are already rolled (#47324 ) (#47592 ) An index with an ILM policy that has a rollover action in one of the phases was rolled over when the ILM conditions dictated regardless if it was already rolled over (eg. manually after modifying an index template in order to force the creation of a new index that uses the new mappings). This changes this behaviour and has ILM check if the index it's about to roll has not been rolled over in the meantime. (cherry picked from commit 37d6106feeb9f9369519117c88a9e7e30f3ac797) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-10-07 07:47:47 +01:00
Lee Hinman	79376b7219	Set default SLM retention invocation time (#47604 ) This adds a default for the `slm.retention_schedule` setting, setting it to `0 30 1 * * ?` which is 1:30am every day. Having retention unset meant that it would never be invoked and clean up snapshots. We determined it would be better to have a default than never to be run. When coming to a decision, we weighed the option of an absolute time (such as 1:30am) versus a periodic invocation (like every 12 hours). In the end we decided on the absolute time because it has better predictability and consistency than a periodic invocation, which would rely on when the master node were elected or restarted. Relates to #43663	2019-10-04 15:00:20 -06:00
Przemysław Witek	ee952da2e2	[7.x] Implement evaluation API for multiclass classification problem (#47126 ) (#47343 )	2019-10-04 17:54:51 +02:00
Przemysław Witek	8c180a77f0	[7.x] Fix serialization of evaluation response. (#47557 ) (#47566 )	2019-10-04 15:12:18 +02:00
Przemysław Witek	ec9b77deaa	[7.x] Implement new analysis type: classification (#46537 ) (#47559 )	2019-10-04 13:47:19 +02:00
David Roberts	31a5e1c7ee	[ML] More accurate job memory overhead (#47516 ) When an ML job runs the memory required can be broken down into: 1. Memory required to load the executable code 2. Instrumented model memory 3. Other memory used by the job's main process or ancilliary processes that is not instrumented Previously we added a simple fixed overhead to account for 1 and 3. This was 100MB for anomaly detection jobs (large because of the completely uninstrumented categorization function and normalize process), and 20MB for data frame analytics jobs. However, this was an oversimplification because the executable code only needs to be loaded once per machine. Also the 100MB overhead for anomaly detection jobs was probably too high in most cases because categorization and normalization don't use _that_ much memory. This PR therefore changes the calculation of memory requirements as follows: 1. A per-node overhead of 30MB for _only_ the first job of any type to be run on a given node - this is to account for loading the executable code 2. The established model memory (if applicable) or model memory limit of the job 3. A per-job overhead of 10MB for anomaly detection jobs and 5MB for data frame analytics jobs, to account for the uninstrumented memory usage This change will enable more jobs to be run on the same node. It will be particularly beneficial when there are a large number of small jobs. It will have less of an effect when there are a small number of large jobs.	2019-10-04 09:57:31 +01:00
Alpar Torok	0a14bb174f	Remove eclipse conditionals (#44075 ) * Remove eclipse conditionals We used to have some meta projects with a `-test` prefix because historically eclipse could not distinguish between test and main source-sets and could only use a single classpath. This is no longer the case for the past few Eclipse versions. This PR adds the necessary configuration to correctly categorize source folders and libraries. With this change eclipse can import projects, and the visibility rules are correct e.x. auto compete doesn't offer classes from test code or `testCompile` dependencies when editing classes in `main`. Unfortunately the cyclic dependency detection in Eclipse doesn't seem to take the difference between test and non test source sets into account, but since we are checking this in Gradle anyhow, it's safe to set to `warning` in the settings. Unfortunately there is no setting to ignore it. This might cause problems when building since Eclipse will probably not know the right order to build things in so more wirk might be necesarry.	2019-10-03 11:55:00 +03:00
Lee Hinman	2e3eb4b24e	Add API to execute SLM retention on-demand (#47405 ) (#47463 ) * Add API to execute SLM retention on-demand (#47405) This is a backport of #47405 This commit adds the `/_slm/_execute_retention` API endpoint. This endpoint kicks off SLM retention and then returns immediately. This in particular allows us to run retention without scheduling it (for entirely manual invocation) or perform a one-off cleanup. This commit also includes HLRC for the new API, and fixes an issue in SLMSnapshotBlockingIntegTests where retention invoked prior to the test completing could resurrect an index the internal test cluster cleanup had already deleted. Resolves #46508 Relates to #43663	2019-10-02 12:29:04 -06:00
Lee Hinman	013d87d716	Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAlloc… (#47313 ) * Fix AllocationRoutedStepTests.testConditionMetOnlyOneCopyAllocated These tests were using randomly generated includes/excludes/requires for routing, however, it was possible to generate mutually exclusive allocation settings (about 1 out of 50,000 times for my runs). This splits the test into three different tests, and removes the randomization (it doesn't add anything to the testing here) to fix the issue. Resolves #47142	2019-10-02 10:01:23 -06:00
Benjamin Trent	2228a7dd8d	[ML][Inference] adding ensemble model objects (#47241 ) (#47438 ) * [ML][Inference] adding ensemble model objects * addressing PR comments * Update TreeTests.java * addressing PR comments * fixing test	2019-10-02 09:49:46 -04:00
Dimitris Athanasiou	b9541eb3af	[7.x][ML] Make PUT data frame analytics action a master node action (… (#47433 ) While it seemed like the PUT data frame analytics action did not have to be a master node action as the config is stored in an index rather than the cluster state, there are other subtle nuances which make it worthwhile to convert it. In particular, it helps maintain order of execution for put actions which are anyhow user driven and are expected to have low volume. This commit converts `TransportPutDataFrameAnalyticsAction` from a handled transport action to a master node action. Note this means that the action might fail in a mixed cluster but as the API is still experimental and not widely used there will be few moments more suitable to make this change than now.	2019-10-02 16:24:21 +03:00
Yannick Welsch	7b2613db55	Allow optype CREATE for append-only indexing operations (#47169 ) Bulk requests currently do not allow adding "create" actions with auto-generated IDs. This commit allows using the optype CREATE for append-only indexing operations. This is mainly the user facing aspect of it.	2019-10-02 14:16:52 +02:00
David Roberts	4379a3c52b	[ML] Throttle the delete-by-query of expired results (#47177 ) Due to #47003 many clusters will have built up a large backlog of expired results. On upgrading to a version where that bug is fixed users could find that the first ML daily maintenance task deletes a very large amount of documents. This change introduces throttling to the delete-by-query that the ML daily maintenance uses to delete expired results to limit it to deleting an average 200 documents per second. (There is no throttling for state/forecast documents as these are expected to be lower volume.) Additionally a rough time limit of 8 hours is applied to the whole delete expired data action. (This is only rough as it won't stop part way through a single operation - it only checks the timeout between operations.) Relates #47103	2019-10-02 11:16:34 +01:00
Dimitris Athanasiou	36884a3c32	[7.x][ML] Restore analytics state if available (#47128 ) (#47393 ) This commit restores the model state if available in data frame analytics jobs. In addition, this changes the start API so that a stopped job can be restarted. As we now store the progress in the state index when the task is stopped, we can use it to determine what state the job was in when it got stopped. Note that in order to be able to distinguish between a job that runs for the first time and another that is restarting, we ensure reindexing progress is reported to be at least 1 for a running task.	2019-10-02 10:24:05 +03:00
Benjamin Trent	f5fe5e7cd6	[7.x] [ML][Inference] Adding preprocessors to definition object (#47320 ) (#47370 ) * [ML][Inference] Adding preprocessors to definition object (#47320) * [ML][Inference] Adding preprocessors to definition object * Update TrainedModelConfig.java * adjusting for backport	2019-10-01 13:31:25 -04:00
Albert Zaharovits	78558a7b2f	Fix AD realm additional metadata (#47179 ) Due to a regression bug the metadata Active Directory realm setting is ignored (it works correctly for the LDAP realm type). This commit redresses it. Closes #45848	2019-10-01 17:05:25 +03:00
Benjamin Trent	4335e07716	[7.x] [ML][Inference] adding .ml-inference* index and storage (#47267 ) (#47310 ) * [ML][Inference] adding .ml-inference* index and storage (#47267) * [ML][Inference] adding .ml-inference* index and storage * Addressing PR comments * Allowing null definition, adding validation tests for model config * fixing line length * adjusting for backport	2019-10-01 08:20:33 -04:00
Armin Braun	3d23cb44a3	Speed up Snapshot Finalization (#47283 ) (#47309 ) As a result of #45689 snapshot finalization started to take significantly longer than before. This may be a little unfortunate since it increases the likelihood of failing to finalize after having written out all the segment blobs. This change parallelizes all the metadata writes that can safely run in parallel in the finalization step to speed the finalization step up again. Also, this will generally speed up the snapshot process overall in case of large number of indices. This is also a nice to have for #46250 since we add yet another step (deleting of old index- blobs in the shards to the finalization.	2019-09-30 23:28:59 +02:00
Martijn van Groningen	fe937ea4b8	Add config namespace in get policy api response (#47162 ) Currently the policy config is placed directly in the json object of the toplevel `policies` array field. For example: ``` { "policies": [ { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } ] } ``` This change adds a `config` field in each policy json object: ``` { "policies": [ { "config": { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } } ] } ``` This allows us in the future to add other information about policies in the get policy api response. The UI will consume this API to build an overview of all policies. The UI may in the future include additional information about a policy and the plan is to include that in the get policy api, so that this information can be gathered in a single api call. An example of the information that is likely to be added is: * Last policy execution time * The status of a policy (executing, executed, unexecuted) * Information about the last failure if exists	2019-09-30 14:37:23 +02:00
Jason Tedor	2cba323b4e	Remove use of get raw in token/API key settings (#47260 ) These settings were using get raw to fallback to whether or not SSL is enabled. Yet, we have a formal mechanism for falling back to a setting. This commit cuts over to that formal mechanism.	2019-09-30 06:35:58 -04:00
Martijn van Groningen	66f72bcdbc	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-30 08:12:28 +02:00
Rory Hunter	53a4d2176f	Convert most awaitBusy calls to assertBusy (#45794 ) (#47112 ) Backport of #45794 to 7.x. Convert most `awaitBusy` calls to `assertBusy`, and use asserts where possible. Follows on from #28548 by @liketic. There were a small number of places where it didn't make sense to me to call `assertBusy`, so I kept the existing calls but renamed the method to `waitUntil`. This was partly to better reflect its usage, and partly so that anyone trying to add a new call to awaitBusy wouldn't be able to find it. I also didn't change the usage in `TransportStopRollupAction` as the comments state that the local awaitBusy method is a temporary copy-and-paste. Other changes: * Rework `waitForDocs` to scale its timeout. Instead of calling `assertBusy` in a loop, work out a reasonable overall timeout and await just once. * Some tests failed after switching to `assertBusy` and had to be fixed. * Correct the expect templates in AbstractUpgradeTestCase. The ES Security team confirmed that they don't use templates any more, so remove this from the expected templates. Also rewrite how the setup code checks for templates, in order to give more information. * Remove an expected ML template from XPackRestTestConstants The ML team advised that the ML tests shouldn't be waiting for any `.ml-notifications` templates, since such checks should happen in the production code instead. Also rework the template checking code in `XPackRestTestHelper` to give more helpful failure messages. * Fix issue in `DataFrameSurvivesUpgradeIT` when upgrading from < 7.4	2019-09-29 12:21:46 +01:00
Martijn van Groningen	7ffe2e7e63	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-27 14:42:11 +02:00
Andrei Dan	4c909438dd	Fix OriginationDate parsing tests. (#47170 ) (#47200 ) Drop the usage of `SimpleDateFormat` and use the `DateFormatter` instead (cherry picked from commit 7cf509a7a11ecf6c40c44c18e8f03b8e81fcd1c2) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-09-27 13:16:45 +01:00
Przemysław Witek	3fbd58d156	[7.x] Allow evaluation to consist of multiple steps. (#46653 ) (#47194 )	2019-09-27 13:01:51 +02:00
Martijn van Groningen	8a4eefdd83	Expose enrich stats api to monitoring. (#46708 ) This change also slightly modifies the stats response, so that is can easier consumer by monitoring and other users. (coordinators stats are now in a list instead of a map and has an additional field for the node id) Relates to #32789	2019-09-26 11:04:33 +02:00
Yogesh Gaikwad	9a64b7a888	[Backport] Validate `query` field when creating roles (#46275 ) (#47094 ) In the current implementation, the validation of the role query occurs at runtime when the query is being executed. This commit adds validation for the role query when creating a role but not for the template query as we do not have the runtime information required for evaluating the template query (eg. authenticated user's information). This is similar to the scripts that we store but do not evaluate or parse if they are valid queries or not. For validation, the query is evaluated (if not a template), parsed to build the QueryBuilder and verify if the query type is allowed. Closes #34252	2019-09-26 17:57:36 +10:00
Benjamin Trent	fcddaa90de	[7.x] [ML][Inference] adding tree model (#47044 ) (#47141 ) * [ML][Inference] adding tree model (#47044) * [ML][Inference] adding tree model * renaming features for updated schema * fixing 7.x compilation	2019-09-25 19:11:15 -04:00
Andrei Dan	27520cac3b	ILM: parse origination date from index name (#46755 ) (#47124 ) * ILM: parse origination date from index name (#46755) Introduce the `index.lifecycle.parse_origination_date` setting that indicates if the origination date should be parsed from the index name. If set to true an index which doesn't match the expected format (namely `indexName-{dateFormat}-optional_digits` will fail before being created. The origination date will be parsed when initialising a lifecycle for an index and it will be set as the `index.lifecycle.origination_date` for that index. A user set value for `index.lifecycle.origination_date` will always override a possible parsable date from the index name. (cherry picked from commit c363d27f0210733dad0c307d54fa224a92ddb569) Signed-off-by: Andrei Dan <andrei.dan@elastic.co> * Drop usage of Map.of to be java 8 compliant	2019-09-25 21:44:16 +01:00
Lee Hinman	a267df30fa	Wait for snapshot completion in SLM snapshot invocation (#47051 ) * Wait for snapshot completion in SLM snapshot invocation This changes the snapshots internally invoked by SLM to wait for completion. This allows us to capture more snapshotting failure scenarios. For example, previously a snapshot would be created and then registered as a "success", however, the snapshot may have been aborted, or it may have had a subset of its shards fail. These cases are now handled by inspecting the response to the `CreateSnapshotRequest` and ensuring that there are no failures. If any failures are present, the history store now stores the action as a failure instead of a success. Relates to #38461 and #43663	2019-09-25 14:25:22 -06:00
Gordon Brown	a46eef9634	Change SLM stats format (#46991 ) Using arrays of objects with embedded IDs is preferred for new APIs over using entity IDs as JSON keys. This commit changes the SLM stats API to use the preferred format.	2019-09-25 11:32:08 -06:00
Benjamin Trent	05fb7be571	[7.x] [ML][Inference] Feature pre-processing objects and functions (#46777 ) (#47040 ) * [ML][Inference] Feature pre-processing objects and functions (#46777) To support inference on pre-trained machine learning models, some basic feature encoding will be necessary. I am using a named object serialization approach so new encodings/pre-processing steps could be added in the future. This PR lays down the ground work for 3 basic encodings: * HotOne * Target Mean * Frequency More feature encodings or pre-processings could be added in the future: * Handling missing columns * Standardization * Label encoding * etc.... * fixing compilation for namedxcontent tests	2019-09-25 08:16:24 -04:00
Ioannis Kakavas	23bceaadf8	Handle RelayState in preparing a SAMLAuthN Request (#46534 ) (#47092 ) This change allows for the caller of the `saml/prepare` API to pass a `relay_state` parameter that will then be part of the redirect URL in the response as the `RelayState` query parameter. The SAML IdP is required to reflect back the value of that relay state when sending a SAML Response. The caller of the APIs can then, when receiving the SAML Response, read and consume the value as it see fit.	2019-09-25 13:23:46 +03:00
Yogesh Gaikwad	6f453aa6b2	Validate index and cluster privilege names when creating a role (#46361 ) (#47063 ) This commit adds validation so a role cannot be created with invalid index or cluster privilege name. Closes #29703	2019-09-25 18:57:11 +10:00
Hendrik Muhs	7377ac4637	[Transform] Replace transforms with transform, index constants (#47023 ) - replace "transforms" with "transform" for consistency - use constants for internal index naming wherever possible and document required changes	2019-09-25 08:31:43 +02:00
James Baiera	a349b22273	Add the cluster version to enrich policies (#45021 ) Adds the Elasticsearch version as a field on the EnrichPolicy object	2019-09-23 18:44:45 -04:00
Julie Tibshirani	9124c94a6c	Add support for aliases in queries on _index. (#46944 ) Previously, queries on the _index field were not able to specify index aliases. This was a regression in functionality compared to the 'indices' query that was deprecated and removed in 6.0. Now queries on _index can specify an alias, which is resolved to the concrete index names when we check whether an index matches. To match a remote shard target, the pattern needs to be of the form 'cluster:index' to match the fully-qualified index name. Index aliases can be specified in the following query types: term, terms, prefix, and wildcard.	2019-09-23 13:21:37 -07:00
Martijn van Groningen	0cfddca61d	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-23 09:46:05 +02:00
Michael Basnight	f1c7ed647b	Allow comma separated ids in get enrich policy API (#46351 ) This commit changes the GET REST api so it will accept an optional comma separated list of enrich policy ids. This change also modifies the behavior of the GET API in that it will not error if it is passed a bad enrich id anymore, but will instead just return an empty list.	2019-09-20 10:06:58 -05:00
Hendrik Muhs	4a2cb05162	add message about transform disabled if license is missing (#46901 ) adds a message for transform about what happens if no license has been activated	2019-09-20 13:47:40 +02:00
Hendrik Muhs	abe889af75	[7.5][Transform] rename classes in transform plugin (#46867 ) rename classes and settings in transform plugin, provide BWC for old settings	2019-09-20 10:43:00 +02:00
Dimitris Athanasiou	02a5e153dc	[7.x][ML] Parse and index data frame analytics state (#46804 ) (#46820 ) This commit reuses the same state processor that is used for autodetect to parse state output from data frame analytics jobs. We then index the state document into the state index. Backport of #46804	2019-09-18 20:37:40 +03:00
Benjamin Trent	9cf9c64ec2	[7.x] [ML][Transforms] remove `force` flag from _start (#46414 ) (#46748 ) * [ML][Transforms] remove `force` flag from _start (#46414) * [ML][Transforms] remove `force` flag from _start * fixing expected error message * adjusting bwc version	2019-09-18 10:06:05 -04:00
Lee Hinman	b85468d6ea	Add node setting for disabling SLM (#46794 ) (#46796 ) This adds the `xpack.slm.enabled` setting to allow disabling of SLM functionality as well as its HTTP API endpoints. Relates to #38461	2019-09-17 17:39:41 -06:00
Oliver Gupte	cbd58d3b78	Give kibana user privileges to create APM agent config index (#46765 ) (#46792 ) * Give kibana user reserved role privileges on .apm-* to create APM agent configuration index. * fixed test to include checking all .apm-* permissions * changed pattern from ".apm-*" to the more specific ".apm-agent-configuration"	2019-09-17 15:01:42 -07:00
Armin Braun	b0f09b279f	Make Snapshot Logic Write Metadata after Segments (#45689 ) (#46764 ) * Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581 * Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons * Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization * Fixes #41581	2019-09-17 13:09:39 +02:00
Przemysław Witek	e49be611ad	[7.x] Add audit messages for Data Frame Analytics (#46521 ) (#46738 )	2019-09-16 21:21:38 +02:00
Hendrik Muhs	c8f52ec4ff	[Transform] Rename data frame plugin to transform: classes in xpack.core (#46644 ) (#46734 ) rename classes in xpack.core of transform plugin from "data frame transform" to "transform"	2019-09-16 13:39:22 +02:00
Andrei Dan	c57cca98b2	[ILM] Add date setting to calculate index age (#46561 ) (#46697 ) * [ILM] Add date setting to calculate index age Add the `index.lifecycle.origination_date` to allow users to configure a custom date that'll be used to calculate the index age for the phase transmissions (as opposed to the default index creation date). This could be useful for users to create an index with an "older" origination date when indexing old data. Relates to #42449. * [ILM] Don't override creation date on policy init The initial approach we took was to override the lifecycle creation date if the `index.lifecycle.origination_date` setting was set. This had the disadvantage of the user not being able to update the `origination_date` anymore once set. This commit changes the way we makes use of the `index.lifecycle.origination_date` setting by checking its value when we calculate the index age (ie. at "read time") and, in case it's not set, default to the index creation date. * Make origination date setting index scope dynamic * Document orignation date setting in ilm settings (cherry picked from commit d5bd2bb77ee28c1978ab6679f941d7c02e389d32) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-09-16 08:50:28 +01:00
Luca Cavanna	e57756492a	Update http-core and http-client dependencies (#46549 ) Relates to #45808 Closes #45577	2019-09-12 09:45:29 +02:00
James Rodewig	f9bf10f2b6	[DOCS] Change "a SSL" to "an SSL" in the Java docs (#46524 ) (#46618 )	2019-09-11 15:55:57 -04:00
Lee Hinman	09a9cefaa0	Handle partial failure retrieving segments in SegmentCountStep (#46556 ) Since the `IndicesSegmentsRequest` scatters to all shards for the index, it's possible that some of the shards may fail. This adds failure handling and logging (since this is a best-effort step in the first place) for this case.	2019-09-11 10:29:31 -06:00
Jim Ferenczi	23bf310c84	Replace the SearchContext with QueryShardContext when building aggregator factories (#46527 ) This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them. The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`. Relates #46523	2019-09-11 16:43:30 +02:00
Hendrik Muhs	efea581dcc	[7.x][Transform]Rename data frame plugin to transform: plugin and package names (#46583 ) rename data frame transform plugin to transform: - rename plugin data-frame to transform - change all package names from o.e..dataframe. to o.e..transform. - necessary changes to fix loading/testing	2019-09-11 14:50:08 +02:00
Armin Braun	41633cb9b5	More Efficient Ordering of Shard Upload Execution (#42791 ) (#46588 ) * More Efficient Ordering of Shard Upload Execution (#42791) * Change the upload order of of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard * Inspired by #39657 * Cleanup BlobStoreRepository Abort and Failure Handling (#46208)	2019-09-11 13:59:20 +02:00
Martijn van Groningen	a4b0f66919	Add enrich stats api (#46462 ) The enrich api returns enrich coordinator stats and information about currently executing enrich policies. The coordinator stats include per ingest node: * The current number of search requests in the queue. * The total number of outstanding remote requests that have been executed since node startup. Each remote request is likely to include multiple search requests. This depends on how much search requests are in the queue at the time when the remote request is performed. * The number of current outstanding remote requests. * The total number of search requests that `enrich` processors have executed since node startup. The current execution policies stats include: * The name of policy that is executing * A full blow task info object that is executing the policy. Relates to #32789	2019-09-11 13:40:24 +02:00
Jim Ferenczi	425b1a77e8	Add more context to QueryShardContext (#46584 ) This change adds an IndexSearcher and the node's BigArrays in the QueryShardContext. It's a spin off of #46527 as this change is required to allow aggregation builder to solely use the query shard context. Relates #46523	2019-09-11 12:24:51 +02:00
Przemysław Witek	e38e631dac	[7.x] Implement DataFrameAnalyticsAuditMessage and DataFrameAnalyticsAuditor (#45967 ) (#46519 )	2019-09-11 12:17:26 +02:00
Lee Hinman	cdc3a260af	Add retention to Snapshot Lifecycle Management (backport of #4… (#46506 ) * Add retention to Snapshot Lifecycle Management (#46407) This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria. An example policy would look like: ``` PUT /_slm/policy/snapshot-every-day { "schedule": "0 30 2 * * ?", "name": "<production-snap-{now/d}>", "repository": "my-s3-repository", "config": { "indices": ["foo-", "important"] }, // Newly configured retention options "retention": { // Snapshots should be deleted after 14 days "expire_after": "14d", // Keep a maximum of thirty snapshots "max_count": 30, // Keep a minimum of the four most recent snapshots "min_count": 4 } } ``` SLM Retention is run on a scheduled configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour. Included in this work is a new SLM stats API endpoint available through ``` json GET /_slm/stats ``` That returns statistics about snapshot taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API. Add base framework for snapshot retention (#43605) * Add base framework for snapshot retention This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask` to start as the basis for SLM's retention implementation. Relates to #38461 * Remove extraneous 'public' * Use a local var instead of reading class var repeatedly * Add SnapshotRetentionConfiguration for retention configuration (#43777) * Add SnapshotRetentionConfiguration for retention configuration This commit adds the `SnapshotRetentionConfiguration` class and its HLRC counterpart to encapsulate the configuration for SLM retention. Currently only a single parameter is supported as an example (we still need to discuss the different options we want to support and their names) to keep the size of the PR down. It also does not yet include version serialization checks since the original SLM branch has not yet been merged. Relates to #43663 * Fix REST tests * Fix more documentation * Use Objects.equals to avoid NPE * Put `randomSnapshotLifecyclePolicy` in only one place * Occasionally return retention with no configuration * Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764) * Implement SnapshotRetentionTask's snapshot filtering and deletion This commit implements the snapshot filtering and deletion for `SnapshotRetentionTask`. Currently only the expire-after age is used for determining whether a snapshot is eligible for deletion. Relates to #43663 * Fix deletes running on the wrong thread * Handle missing or null policy in snap metadata differently * Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>> * Use the `OriginSettingClient` to work with security, enhance logging * Prevent NPE in test by mocking Client * Allow empty/missing SLM retention configuration (#45018) Semi-related to #44465, this allows the `"retention"` configuration map to be missing. Relates to #43663 * Add min_count and max_count as SLM retention predicates (#44926) This adds the configuration options for `min_count` and `max_count` as well as the logic for determining whether a snapshot meets this criteria to SLM's retention feature. These options are optional and one, two, or all three can be specified in an SLM policy. Relates to #43663 * Time-bound deletion of snapshots in retention delete function (#45065) * Time-bound deletion of snapshots in retention delete function With a cluster that has a large number of snapshots, it's possible that snapshot deletion can take a very long time (especially since deletes currently have to happen in a serial fashion). To prevent snapshot deletion from taking forever in a cluster and blocking other operations, this commit adds a setting to allow configuring a maximum time to spend deletion snapshots during retention. This dynamic setting defaults to 1 hour and is best-effort, meaning that it doesn't hard stop a deletion at an hour mark, but ensures that once the time has passed, all subsequent deletions are deferred until the next retention cycle. Relates to #43663 * Wow snapshots suuuure can take a long time. * Use a LongSupplier instead of actually sleeping * Remove TestLogging annotation * Remove rate limiting * Add SLM metrics gathering and endpoint (#45362) * Add SLM metrics gathering and endpoint This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The stats stored include the number of snapshots taken, failed, deleted, the number of retention runs, as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount of time spent deleting snapshots from SLM retention. This commit also adds an endpoint for retrieving all stats (further commits will expose this in the SLM get-policy API) that looks like: ``` GET /_slm/stats { "retention_runs" : 13, "retention_failed" : 0, "retention_timed_out" : 0, "retention_deletion_time" : "1.4s", "retention_deletion_time_millis" : 1404, "policy_metrics" : { "daily-snapshots2" : { "snapshots_taken" : 7, "snapshots_failed" : 0, "snapshots_deleted" : 6, "snapshot_deletion_failures" : 0 }, "daily-snapshots" : { "snapshots_taken" : 12, "snapshots_failed" : 0, "snapshots_deleted" : 12, "snapshot_deletion_failures" : 6 } }, "total_snapshots_taken" : 19, "total_snapshots_failed" : 0, "total_snapshots_deleted" : 18, "total_snapshot_deletion_failures" : 6 } ``` This does not yet include HLRC for this, as this commit is quite large on its own. That will be added in a subsequent commit. Relates to #43663 * Version qualify serialization * Initialize counters outside constructor * Use computeIfAbsent instead of being too verbose * Move part of XContent generation into subclass * Fix REST action for master merge * Unused import * Record history of SLM retention actions (#45513) This commit records the deletion of snapshots by the retention component of SLM into the SLM history index for the purposes of reviewing operations taken by SLM and alerting. * Retry SLM retention after currently running snapshot completes (#45802) * Retry SLM retention after currently running snapshot completes This commit adds a ClusterStateObserver to wait until the currently running snapshot is complete before proceeding with snapshot deletion. SLM retention waits for the maximum allowed deletion time for the snapshot to complete, however, the waiting time is not factored into the limit on actual deletions. Relates to #43663 * Increase timeout waiting for snapshot completion * Apply patch From `2374316f0d`.patch * Rename test variables * [TEST] Be less strict for stats checking * Skip SLM retention if ILM is STOPPING or STOPPED (#45869) This adds a check to ensure we take no action during SLM retention if ILM is currently stopped or in the process of stopping. Relates to #43663 * Check all actions preventing snapshot delete during retention (#45992) * Check all actions preventing snapshot delete during retention run Previously we only checked to see if a snapshot was currently running, but it turns out that more things can block snapshot deletion. This changes the check to be a check for: - a snapshot currently running - a deletion already in progress - a repo cleanup in progress - a restore currently running This was found by CI where a third party delete in a test caused SLM retention deletion to throw an exception. Relates to #43663 * Add unit test for okayToDeleteSnapshots * Fix bug where SLM retention task would be scheduled on every node * Enhance test logging * Ignore if snapshot is already deleted * Missing import * Fix SnapshotRetentionServiceTests * Expose SLM policy stats in get SLM policy API (#45989) This also adds support for the SLM stats endpoint to the high level rest client. Retrieving a policy now looks like: ```json { "daily-snapshots" : { "version": 1, "modified_date": "2019-04-23T01:30:00.000Z", "modified_date_millis": 1556048137314, "policy" : { "schedule": "0 30 1 * * ?", "name": "<daily-snap-{now/d}>", "repository": "my_repository", "config": { "indices": ["data-", "important"], "ignore_unavailable": false, "include_global_state": false }, "retention": {} }, "stats": { "snapshots_taken": 0, "snapshots_failed": 0, "snapshots_deleted": 0, "snapshot_deletion_failures": 0 }, "next_execution": "2019-04-24T01:30:00.000Z", "next_execution_millis": 1556048160000 } } ``` Relates to #43663 Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356) * Rewrite SnapshotLifecycleIT as as ESIntegTestCase This commit splits `SnapshotLifecycleIT` into two different tests. `SnapshotLifecycleRestIT` which includes the tests that do not require slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an integration test using `MockRepository` to simulate a snapshot being in progress. Relates to #43663 Resolves #46205 * Add error logging when exceptions are thrown * Update serialization versions * Fix type inference * Use non-Cancellable HLRC return value * Fix Client mocking in test * Fix SLMSnapshotBlockingIntegTests for 7.x branch * Update SnapshotRetentionTask for non-multi-repo snapshot retrieval * Add serialization guards for SnapshotLifecyclePolicy	2019-09-10 09:08:09 -06:00
Michael Basnight	9304f5c889	Ensure enrich executes on master node only (#46448 ) The previous transport action was a read action, which under the right set of circumstances can execute on a coordinating node. This commit ensures that cannot happen.	2019-09-10 09:59:36 -05:00
Martijn van Groningen	c057fce978	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-09 08:40:54 +02:00
Benjamin Trent	457ff3e2fb	7.x/ml fix instance serialization bwc (#46404 ) * [ML] Fixing instance serialization version for bwc * fixing CppLogMessage	2019-09-05 13:23:26 -05:00
Martijn van Groningen	ded98e50b7	Change exact match processor to match processor. (#46041 ) Besides a rename, this changes allows to processor to attach multiple enrich docs to the document being ingested. Also in order to control the maximum number of enrich docs to be included in the document being ingested, the `max_matches` setting is added to the enrich processor. Relates #32789	2019-09-04 18:05:12 +02:00
Andrey Ershov	ece9eb4acd	Remove stack trace logging in Security(Transport\|Http)ExceptionHandler (#45966 ) As per #45852 comment we no longer need to log stack-traces in SecurityTransportExceptionHandler and SecurityHttpExceptionHandler even if trace logging is enabled. (cherry picked from commit c99224a32d26db985053b7b36e2049036e438f97)	2019-09-04 11:50:35 +03:00
Benjamin Trent	53df54c703	[ML][Transforms] fixing stop on changes check bug (#46162 ) (#46273 ) * [ML][Transforms] fixing stop on changes check bug * Adding new method finishAndCheckState to cover race conditions in early terminations * changing stopping conditions in `onStart` * allow indexer to finish when exiting early	2019-09-03 11:04:18 -05:00
Lee Hinman	3d4b8e01c7	Validate SLM policy ids strictly (#45998 ) (#46145 ) This uses strict validation for SLM policy ids, similar to what we use for index names. Resolves #45997	2019-09-03 09:20:02 -06:00
David Roberts	ab045744ac	[ML-DataFrame] Fix off-by-one error in checkpoint operations_behind (#46235 ) Fixes a problem where operations_behind would be one less than expected per shard in a new index matched by the data frame transform source pattern. For example, if a data frame transform had a source of foo* and a new index foo-new was created with 2 shards and 7 documents indexed in it then operations_behind would be 5 prior to this change. The problem was that an empty index has a global checkpoint number of -1 and the sequence number of the first document that is indexed into an index is 0, not 1. This doesn't matter for indices included in both the last and next checkpoints, as the off-by-one errors cancelled, but for a new index it affected the observed result.	2019-09-03 12:45:02 +01:00
Martijn van Groningen	555b630160	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-02 09:16:55 +02:00
Benjamin Trent	d0c5573a51	[ML] Throw an error when a datafeed needs CCS but it is not enabled for the node (#46044 ) (#46096 ) Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors and difficult to troubleshoot. This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern.	2019-08-30 09:27:07 -05:00
Dimitris Athanasiou	5921ae53d8	[7.x][ML] Regression dependent variable must be numeric (#46072 ) (#46136 ) * [ML] Regression dependent variable must be numeric This adds a validation that the dependent variable of a regression analysis must be numeric. * Address review comments and fix some problems In addition to addressing the review comments, this commit fixes a few issues I found during testing. In particular: - if there were mappings for required fields but they were not included we were not reporting the error - if explicitly included fields had unsupported types we were not reporting the error Unfortunately, I couldn't get those fixed without refactoring the code in `ExtractedFieldsDetector`.	2019-08-30 09:57:43 +03:00
Zachary Tong	cf8a4171e1	Rename `data-science` plugin to `analytics` (#46133 ) Rename `data-science` plugin to `analytics`. Also removes enabled flag. Backport of #46092	2019-08-29 12:45:39 -04:00
Michael Basnight	51a703da29	Add enrich transport client support (#46002 ) This commit adds an enrich client, as well as a smoke test to validate the client works.	2019-08-29 09:10:07 -05:00
Przemysław Witek	b8a0379057	Refactor auditor-related classes (#45893 ) (#46120 )	2019-08-29 14:21:03 +02:00
Gordon Brown	47bbd9d9a9	[7.x] Fix rollover alias in SLM history index template (#46001 ) This commit adds the `rollover_alias` setting required for ILM to work correctly to the SLM history index template and adds assertions to the SLM integration tests to ensure that it works correctly.	2019-08-28 14:50:22 -07:00
Martijn van Groningen	1157224a6b	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-28 10:14:07 +02:00
Jake Landis	154d1dd962	Watcher max_iterations with foreach action execution (#45715 ) (#46039 ) Prior to this commit the foreach action execution had a hard coded limit to 100 iterations. This commit allows the max number of iterations to be a configuration ('max_iterations') on the foreach action. The default remains 100.	2019-08-27 16:57:20 -05:00
Armin Braun	fdef293c81	Fix RegressionTests#fromXContent (#46029 ) * The `trainingPercent` must be between `1` and `100`, not `0` and `100` which is causing test failures	2019-08-27 18:24:26 +03:00
Dimitris Athanasiou	873ad3f942	[7.x][ML] Add option to regression to randomize training set (#45969 ) (#46017 ) Adds a parameter `training_percent` to regression. The default value is `100`. When the parameter is set to a value less than `100`, from the rows that can be used for training (ie. those that have a value for the dependent variable) we randomly choose whether to actually use for training. This enables splitting the data into a training set and the rest, usually called testing, validation or holdout set, which allows for validating the model on data that have not been used for training. Technically, the analytics process considers as training the data that have a value for the dependent variable. Thus, when we decide a training row is not going to be used for training, we simply clear the row's dependent variable.	2019-08-27 17:53:11 +03:00
Yogesh Gaikwad	7b6246ec67	Add `manage_own_api_key` cluster privilege (#45897 ) (#46023 ) The existing privilege model for API keys with privileges like `manage_api_key`, `manage_security` etc. are too permissive and we would want finer-grained control over the cluster privileges for API keys. Previously APIs created would also need these privileges to get its own information. This commit adds support for `manage_own_api_key` cluster privilege which only allows api key cluster actions on API keys owned by the currently authenticated user. Also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes, documentation for the API changes. Closes #40031	2019-08-28 00:44:23 +10:00
Dimitris Athanasiou	dd6c13fdf9	[ML] Add description to DF analytics (#45774 ) (#46019 )	2019-08-27 15:48:59 +03:00
Albert Zaharovits	1ebee5bf9b	PKI realm authentication delegation (#45906 ) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396	2019-08-27 14:42:46 +03:00
Zachary Tong	943a016bb2	Add Cumulative Cardinality agg (and Data Science plugin) (#45990 ) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations.	2019-08-26 16:19:55 -04:00
Andrey Ershov	479ab9b8db	Fix plaintext on TLS port logging (#45852 ) Today if non-TLS record is received on TLS port generic exception will be logged with the stack-trace. SSLExceptionHelper.isNotSslRecordException method does not work because it's assuming that NonSslRecordException would be top-level. This commit addresses the issue and the log would be more concise. (cherry picked from commit 6b83527bf0c23d4d5b97fab7f290c43432945d4f)	2019-08-26 12:32:35 +02:00
Ioannis Kakavas	2bee27dd54	Allow Transport Actions to indicate authN realm (#45946 ) This commit allows the Transport Actions for the SSO realms to indicate the realm that should be used to authenticate the constructed AuthenticationToken. This is useful in the case that many authentication realms of the same type have been configured and where the caller of the API(Kibana or a custom web app) already know which realm should be used so there is no need to iterate all the realms of the same type. The realm parameter is added in the relevant REST APIs as optional so as not to introduce any breaking change.	2019-08-25 19:36:41 +03:00
Dimitris Athanasiou	be554fe5f0	[7.x][ML] Improve progress reportings for DF analytics (#45856 ) (#45910 ) Previously, the stats API reports a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. Thus, one cannot distinguish between a task that never run to one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one).	2019-08-23 23:04:39 +03:00
Martijn van Groningen	cb42e19a32	Change how type is stored in an enrich policy. (#45789 ) A policy type controls how the enrich index is created and the query executed against the match field. Currently there is a single policy type (`exact_match`). In the near future more policy types will be added and different policy may have different configuration options. For this reason type should be a json object instead of a string field: ``` { "exact_match": { ... } } ``` instead of: ``` { "type": "exact_match", ... } ``` This will make streaming parsing of enrich policies easier as in the new format, the parsing code can know ahead what configuration fields to expect. In the latter format that is not possible if the type field appears not as the first field. Relates to #32789	2019-08-23 13:43:38 +02:00
Martijn van Groningen	837cfa2640	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-23 11:22:27 +02:00
markharwood	217e41ab6c	Search - added HLRC support for PinnedQueryBuilder (#45779 ) (#45853 ) Added HLRC support for PinnedQueryBuilder Related #44074	2019-08-23 09:22:17 +01:00
Tim Vernum	f94e4a9151	Set security index refresh interval to 1s (#45888 ) The security indices were being created without specifying the refresh interval, which means it would inherit a value from any templates that exists. However, certain security functionality depends on being able to wait_for refresh, and causes errors (e.g. in Kibana) if that time exceeds 30s. This commit changes the security indices configuration to always be created with a 1s refresh interval. This prevents any templates from inadvertantly interfering with the proper functioning of security. It is possible for an administrator to explicitly change the refresh interval after the indices have been created. Backport of: #45434	2019-08-23 12:41:37 +10:00
Tim Vernum	029725fc35	Add SSL/TLS settings for watcher email (#45836 ) This change adds a new SSL context xpack.notification.email.ssl.* that supports the standard SSL configuration settings (truststore, verification_mode, etc). This SSL context is used when configuring outbound SMTP properties for watcher email notifications. Backport of: #45272	2019-08-23 10:13:51 +10:00
Benjamin Trent	8e3c54fff7	[7.x] [ML] Adding data frame analytics stats to _usage API (#45820 ) (#45872 ) * [ML] Adding data frame analytics stats to _usage API (#45820) * [ML] Adding data frame analytics stats to _usage API * making the size of analytics stats 10k * adjusting backport	2019-08-22 15:15:41 -05:00
Benjamin Trent	e50a78cf50	[ML-DataFrame] version data frame transform internal index (#45375 ) (#45837 ) Adds index versioning for the internal data frame transform index. Allows for new indices to be created and referenced, `GET` requests now query over the index pattern and takes the latest doc (based on INDEX name).	2019-08-22 11:46:30 -05:00
Przemysław Witek	7512337922	[7.x] Allow the user to specify 'query' in Evaluate Data Frame request (#45775 ) (#45825 )	2019-08-22 11:14:26 +02:00
Gordon Brown	47b1e2b3d0	[7.x] Use rollover for SLM's history indices (#45686 ) Following our own guidelines, SLM should use rollover instead of purely time-based indices to keep shard counts low. This commit implements lazy index creation for SLM's history indices, indexing via an alias, and rollover in the built-in ILM policy.	2019-08-21 13:42:11 -06:00
Martijn van Groningen	2677ac14d2	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-21 14:28:17 +02:00
Przemysław Witek	bf701b83d2	Shorten field names in EstimateMemoryUsageResponse (#45719 ) (#45772 )	2019-08-21 12:45:09 +02:00
Dimitris Athanasiou	d5c3d9b50f	[7.x][ML] Do not skip rows with missing values for regression (#45751 ) (#45754 ) Regression analysis support missing fields. Even more, it is expected that the dependent variable has missing fields to the part of the data frame that is not for training. This commit allows to declare that an analysis supports missing values. For such analysis, rows with missing values are not skipped. Instead, they are written as normal with empty strings used for the missing values. This also contains a fix to the integration test. Closes #45425	2019-08-21 08:15:38 +03:00
Benjamin Trent	ba7b677618	[ML] better handle empty results when evaluating regression (#45745 ) (#45759 ) * [ML] better handle empty results when evaluating regression * adding new failure test to ml_security black list * fixing equality check for regression results	2019-08-20 17:37:04 -05:00
Michael Basnight	e3373d349b	Consolidate enrich list all and get by name APIs (#45705 ) The get and list APIs are a single API in this commit. Whether requesting one named policy or all policies, a list of policies is returened. The list API code has all been removed and the GET api is what remains, which contains much of the list response code.	2019-08-20 10:29:59 -05:00
Przemysław Witek	b37ebd1adf	Prepare the codebase for new Auditor subclasses (#45716 ) (#45731 )	2019-08-20 16:03:50 +02:00
Przemysław Witek	80dd0a0948	Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718 ) (#45725 )	2019-08-20 15:49:17 +02:00
Benjamin Trent	88641a08af	[ML][Data frame] fixing failure state transitions and race condition (#45627 ) (#45656 ) * [ML][Data frame] fixing failure state transitions and race condition (#45627) There is a small window for a race condition while we are flagging a task as failed. Here are the steps where the race condition occurs: 1. A failure occurs 2. Before `AsyncTwoPhaseIndexer` calls the `onFailure` handler it does the following: a. `finishAndSetState()` which sets the IndexerState to STARTED b. `doSaveState(...)` which attempts to save the current state of the indexer 3. Another trigger is fired BEFORE `onFailure` can fire, but AFTER `finishAndSetState()` occurs. The trick here is that we will eventually set the indexer to failed, but possibly not before another trigger had the opportunity to fire. This could obviously cause some weird state interactions. To combat this, I have put in some predicates to verify the state before taking actions. This is so if state is indeed marked failed, the "second trigger" stops ASAP. Additionally, I move the task state checks INTO the `start` and `stop` methods, which will now require a `force` parameter. `start`, `stop`, `trigger` and `markAsFailed` are all `synchronized`. This should gives us some guarantees that one will not switch states out from underneath another. I also flag the task as `failed` BEFORE we successfully write it to cluster state, this is to allow us to make the task fail more quickly. But, this does add the behavior where the task is "failed" but the cluster state does not indicate as much. Adding the checks in `start` and `stop` will handle this "real state vs cluster state" race condition. This has always been a problem for `_stop` as it is not a master node action and doesn’t always have the latest cluster state. closes #45609 Relates to #45562 * [ML][Data Frame] moves failure state transition for MT safety (#45676) * [ML][Data Frame] moves failure state transition for MT safety * removing unused imports	2019-08-20 07:30:17 -05:00
Martijn van Groningen	5ea0985711	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-16 09:47:11 +02:00
Benjamin Trent	fde5dae387	[ML][Data Frame] adjusting change detection workflow (#45511 ) (#45580 ) * [ML][Data Frame] adjusting change detection workflow * adjusting for PR comment * disallowing null as an argument value	2019-08-14 17:26:24 -05:00
Nick Knize	647a8308c3	[SPATIAL] Backport new ShapeFieldMapper and ShapeQueryBuilder to 7x (#45363 ) * Introduce Spatial Plugin (#44389) Introduce a skeleton Spatial plugin that holds new licensed features coming to Geo/Spatial land! * [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923) Refactor DeprecatedParameters specific to legacy geo_shape out of AbstractGeometryFieldMapper.TypeParser#parse. * [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980) Add a new ShapeFieldMapper to the xpack spatial module for indexing arbitrary cartesian geometries using a new field type called shape. The indexing approach leverages lucene's new XYShape field type which is backed by BKD in the same manner as LatLonShape but without the WGS84 latitude longitude restrictions. The new field mapper builds on and extends the refactoring effort in AbstractGeometryFieldMapper and accepts shapes in either GeoJSON or WKT format (both of which support non geospatial geometries). Tests are provided in the ShapeFieldMapperTest class in the same manner as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests. Documentation for how to use the new field type and what parameters are accepted is included. The QueryBuilder for searching indexed shapes is provided in a separate commit. * [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108) Add a new ShapeQueryBuilder to the xpack spatial module for querying arbitrary Cartesian geometries indexed using the new shape field type. The query builder extends AbstractGeometryQueryBuilder and leverages the ShapeQueryProcessor added in the previous field mapper commit. Tests are provided in ShapeQueryTests in the same manner as GeoShapeQueryTests and docs are updated to explain how the query works.	2019-08-14 16:35:10 -05:00
Benjamin Trent	0c343d8443	[7.x] [ML][Transforms] adjusting stats.progress for cont. transforms (#45361 ) (#45551 ) * [ML][Transforms] adjusting stats.progress for cont. transforms (#45361) * [ML][Transforms] adjusting stats.progress for cont. transforms * addressing PR comments * rename fix * Adjusting bwc serialization versions	2019-08-14 13:08:27 -05:00
Martijn van Groningen	43b8ab607d	Improve naming of enrich policy fields. (#45494 ) Renamed `enrich_key` to `match_field` and renamed `enrich_values` to `enrich_fields`. Relates #32789	2019-08-14 11:45:22 +02:00
Przemysław Witek	df574e5168	[7.x] Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188 ) (#45510 )	2019-08-14 08:26:03 +02:00
Martijn van Groningen	0353eb9291	required changes after merging in upstream branch	2019-08-13 09:17:57 +02:00
Martijn van Groningen	1951cdf1cb	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-13 09:12:31 +02:00
Yogesh Gaikwad	471d940c44	Refactor cluster privileges and cluster permission (#45265 ) (#45442 ) The current implementations make it difficult for adding new privileges (example: a cluster privilege which is more than cluster action-based and not exposed to the security administrator). On the high level, we would like our cluster privilege either: - a named cluster privilege This corresponds to `cluster` field from the role descriptor - or a configurable cluster privilege This corresponds to the `global` field from the role-descriptor and allows a security administrator to configure them. Some of the responsibilities like the merging of action based cluster privileges are now pushed at cluster permission level. How to implement the predicate (using Automaton) is being now enforced by cluster permission. `ClusterPermission` helps in enforcing the cluster level access either by performing checks against cluster action and optionally against a request. It is a collection of one or more permission checks where if any of the checks allow access then the permission allows access to a cluster action. Implementations of cluster privilege must be able to provide information regarding the predicates to the cluster permission so that can be enforced. This is enforced by making implementations of cluster privilege aware of cluster permission builder and provide a way to specify how the permission is to be built for a given privilege. This commit renames `ConditionalClusterPrivilege` to `ConfigurableClusterPrivilege`. `ConfigurableClusterPrivilege` is a renderable cluster privilege exposed as a `global` field in role descriptor. Other than this there is a requirement where we would want to know if a cluster permission is implied by another cluster-permission (`has-privileges`). This is helpful in addressing queries related to privileges for a user. This is not just simply checking of cluster permissions since we do not have access to runtime information (like request object). This refactoring does not try to address those scenarios. Relates #44048	2019-08-13 09:06:18 +10:00
Armin Braun	a9e1402189	Remove Settings from BaseRestRequest Constructor (#45418 ) (#45429 ) * Resolving the todo, cleaning up the unused `settings` parameter * Cleaning up some other minor dead code in affected classes	2019-08-12 05:14:45 +02:00
Benjamin Trent	fac1a6f8e8	[ML][Data Frame] have DataFrameTransformConfigUpdate#apply set Version (#45391 ) (#45400 )	2019-08-09 14:32:49 -05:00
Hendrik Muhs	bf4da6c6ad	[ML-DataFrame] fix starting a batch data frame after stopping at runtime (#45340 ) (#45381 ) fix loading of next checkpoint after data frame transform has been stopped/started within one run closes #45339	2019-08-09 20:30:11 +02:00
Dimitris Athanasiou	27497ff75f	[7.x][ML] Add regression analysis to DF analytics (#45292 ) (#45388 ) This commit adds a first draft of a regression analysis to data frame analytics. There is high probability that the exact syntax might change. This commit adds the new analysis type and its parameters as well as appropriate validation. It also modifies the extractor and the fields detector to be able to handle categorical fields as regression analysis supports them.	2019-08-09 19:31:13 +03:00
David Roberts	14545f8958	[ML-DataFrame] Combine task_state and indexer_state in _stats (#45324 ) This commit replaces task_state and indexer_state in the data frame _stats output with a single top level state that combines the two. It is defined as: - failed if what's currently reported as task_state is failed - stopped if there is no persistent task - Otherwise what's currently reported as indexer_state Backport of #45276	2019-08-08 16:24:26 +01:00
Martijn van Groningen	708f856940	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-08 16:52:45 +02:00
Benjamin Trent	5db9982f71	[7.x] [ML][Data Frame] Add update transform api endpoint (#45154 ) (#45279 ) * [ML][Data Frame] Add update transform api endpoint (#45154) This adds the ability to `_update` stored data frame transforms. All mutable fields are applied when the next checkpoint starts. The exception being `description`. This PR contains all that is necessary for this addition: * HLRC * Docs * Server side	2019-08-07 10:37:35 -05:00
Lee Hinman	c7ec0b8431	Include in-progress snapshot for a policy with get SLM policy… (#45245 ) This commit adds the "in_progress" key to the SLM get policy API, returning a policy that looks like: ```json { "daily-snapshots" : { "version" : 1, "modified_date" : "2019-08-05T18:41:48.778Z", "modified_date_millis" : 1565030508778, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "0 30 1 * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-*", "important" ], "ignore_unavailable" : true, "include_global_state" : false }, "retention" : { "expire_after" : "10m" } }, "last_success" : { "snapshot_name" : "production-snap-2019.08.05-oxctmnobqye3luim4uejhg", "time_string" : "2019-08-05T18:42:23.257Z", "time" : 1565030543257 }, "next_execution" : "2019-08-06T01:30:00.000Z", "next_execution_millis" : 1565055000000, "in_progress" : { "name" : "production-snap-2019.08.05-oxctmnobqye3luim4uejhg", "uuid" : "t8Idqt6JQxiZrzp0Vt7z6g", "state" : "STARTED", "start_time" : "2019-08-05T18:42:22.998Z", "start_time_millis" : 1565030542998 } } } ``` These are only visible while the snapshot is being taken (or failed), since it reads from the cluster state rather than from the repository itself.	2019-08-07 08:29:49 -06:00
Tom Callahan	a7a419bee8	Change Ldap SDK License to LGPL-2.1 (#45116 ) We currently use the unboundid ldap SDK, which is triply licensed under GPL-2.0, LGPL-2.1, and the "UnboundID LDAP SDK Free Use License". We currently identify the license as the latter, but LGPL-2.1 is the one we should be using per our policy.	2019-08-06 16:48:09 -04:00
Zachary Tong	422aca9a5d	Fix Rollup job creation to work with templates (#43943 ) The PutJob API accidentally used an "expert" API of CreateIndexRequest. That API is semi-lenient to syntax; a type could be omitted and the request would work as expected. But if a type was omitted it would not merge with templates correctly, leading to index creation that only has the template and not the requested mappings in the request. This commit refactors the PutJob API to: - Include the type name - Use a less "expert" API in an attempt to future proof against errors - Uses an XContentBuilder instead of string replacing, removes json template	2019-08-06 10:53:44 -04:00
Christoph Büscher	3366726ad1	Enable reloading of synonym_graph filters (#45135 ) Reloading of synonym_graph filter doesn't work currently because the search time AnalysisMode doesn't get propagated to the TokenFilterFactory emitted by the graph filters getChainAwareTokenFilterFactory() method. This change fixes that. Closes #45127	2019-08-02 15:33:42 +02:00
Tim Vernum	e21d58541a	Improve errors when TLS files cannot be read (#45122 ) This change improves the exception messages that are thrown when the system cannot read TLS resources such as keystores, truststores, certificates, keys or certificate-chains (CAs). This change specifically handles: - Files that do not exist - Files that cannot be read due to file-system permissions - Files that cannot be read due to the ES security-manager Backport of: #44787	2019-08-02 12:29:43 +10:00
Tim Vernum	590777150f	Explicitly fail if a realm only exists in keystore (#45091 ) There are no realms that can be configured exclusively with secure settings. Every realm that supports secure settings also requires one or more non-secure settings. However, sometimes a node will be configured with entries in the keystore for which there is nothing in elasticsearch.yml - this may be because the realm we removed from the yml, but not deleted from the keystore, or it could be because there was a typo in the realm name which has accidentially orphaned the keystore entry. In these cases the realm building would fail, but the error would not always be clear or point to the root cause (orphaned keystore entries). RealmSettings would act as though the realm existed, but then fail because an incorrect combination of settings was provided. This change causes realm building to fail early, with an explicit message about incorrect keystore entries. Backport of: #44471	2019-08-02 12:28:59 +10:00
Martijn van Groningen	aae2f0cff2	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-01 13:38:03 +07:00
Mayya Sharipova	0c68765088	Adds usage stats for vectors (#45023 ) Example of usage: _xpack/usage "vectors": { "available": true, "enabled": true, "dense_vector_fields_count" : 1, "sparse_vector_fields_count" : 1, "dense_vector_dims_avg_count" : 100 } Backport for #44512	2019-07-31 12:32:41 -04:00
Lee Hinman	598c4e72f9	[7.x] Rename indexlifecycle to ilm and snapshotlifecycle to sl… (#44977 ) * Rename indexlifecycle to ilm and snapshotlifecycle to slm (#44917) As a followup to #44725 and #44608, which renamed the packages within the x-pack project, this renames the packages within the core x-pack project. It also renames 'snapshotlifecycle' within the HLRC to slm. * Fix one more import	2019-07-29 15:51:14 -06:00
Benjamin Trent	3b514f0dae	[ML] update Instant serialization (#44765 ) (#44954 ) * [ML] update Instant serialization * addressing PR comments * removing unused import	2019-07-29 13:06:56 -05:00
Martijn van Groningen	db49cb505e	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-07-29 14:45:10 +07:00
Gordon Brown	d4b2d21339	Add option to filter ILM explain response (#44777 ) In order to make it easier to interpret the output of the ILM Explain API, this commit adds two request parameters to that API: - `only_managed`, which causes the response to only contain indices which have `index.lifecycle.name` set - `only_errors`, which causes the response to contain only indices in an ILM error state "Error state" is defined as either being in the `ERROR` step or having `index.lifecycle.name` set to a policy that does not exist.	2019-07-26 11:57:38 -04:00
Przemysław Witek	79121ea127	[7.x] Implement exponential average search time per hour statistics. (#44683 ) (#44897 )	2019-07-26 15:56:34 +02:00
Przemysław Witek	8bb8543fdf	Treat PostDataActionResponse.DataCounts.bucketCount as incremental rather than absolute (total). (#44803 ) (#44856 )	2019-07-25 20:46:56 +02:00
James Baiera	c5528a25e6	Merge branch '7.x' into enrich-7.x	2019-07-25 13:12:56 -04:00
Hendrik Muhs	2ca6306452	do not assert on indexer state (#44854 ) remove the unreliable check for the state change fixes #44813	2019-07-25 16:39:24 +02:00
David Roberts	b2e969f4ba	[ML-DataFrame] Remove ID field from data frame indexer stats (#44848 ) This is a followup to #44350. The indexer stats used to be persisted standalone, but now are only persisted as part of a state-and-stats document. During the review of #44350 it was decided that we'll stick with this design, so there will never be a need for an indexer stats object to store its transform ID as it is stored on the enclosing document. This PR removes the indexer stats document ID. Backport of #44768	2019-07-25 15:19:32 +01:00
Przemysław Witek	53f409e5ae	Add result_type field to TimingStats and DatafeedTimingStats documents (#44812 ) (#44841 )	2019-07-25 10:11:55 +02:00
Gordon Brown	2ac54da60a	Fix swapped variables in error message (#44300 ) The alias name and index were in the incorrect order in this error message. This commit corrects the order.	2019-07-24 11:38:15 -04:00
Przemysław Witek	26da573e94	[ML] [7.x] Only emit deprecation warning if there was actual change of a datafeed's job_id. (#44755 ) * Only emit deprecation warning if there was actual change of a datafeed's job_id. * Add @Deprecated annotation to DatafeedUpdate.Builder#setJobId method	2019-07-24 10:03:25 +02:00
David Roberts	caf9411a72	[ML] Improve response format of data frame stats endpoint (#44743 ) This change adjusts the data frame transforms stats endpoint to return a structure that is easier to understand. This is a breaking change for clients of the data frame transforms stats endpoint, but the feature is in beta so stability is not guaranteed. Backport of #44350	2019-07-23 18:00:50 +01:00
Przemysław Witek	16c8e18013	Deprecate the ability to update datafeed's job_id. (#44691 ) (#44742 )	2019-07-23 14:48:56 +02:00
Jason Tedor	5878bde8dc	Rename SLM package to slm (#44608 ) This commit renames the SLM package from snapshotlifecycle to slm. We have all come to know index lifecycle management as ILM, the APIs and settings use ilm, and it would be nice of the package did too. For SLM, let's use slm for all of these including the package name from the beginning.	2019-07-23 07:35:06 +09:00
Benjamin Trent	4456850a8e	[7.x] [ML][Data Frame] Add optional defer_validation param to PUT (#44455 ) (#44697 ) * [ML][Data Frame] Add optional defer_validation param to PUT (#44455) * [ML][Data Frame] Add optional defer_validation param to PUT * addressing PR comments * reverting bad replace * addressing pr comments * Update put-transform.asciidoc * Update put-transform.asciidoc * Update put-transform.asciidoc * adjusting for backport * fixing imports * [DOCS] Fixes formatting in create data frame transform API	2019-07-22 15:12:55 -05:00
James Baiera	fc20264b99	Add Enrich index background task to cleanup old indices (#43746 ) This PR adds a background maintenance task that is scheduled on the master node only. The deletion of an index is based on if it is not linked to a policy or if the enrich alias is not currently pointing at it. Synchronization has been added to make sure that no policy executions are running at the time of cleanup, and if any executions do occur, the marking process delays cleanup until next run.	2019-07-22 14:41:22 -04:00
Benjamin Trent	06e21f7902	[7.x] [ML][Data Frame] adding force delete (#44590 ) (#44696 ) * [ML][Data Frame] adding force delete (#44590) * [ML][Data Frame] adding force delete * Update delete-transform.asciidoc * adjusting for backport	2019-07-22 13:13:25 -05:00
Hendrik Muhs	4387d81e5b	add a test to check excecution flow (#44481 ) add a test for the execution flow of a2p indexer	2019-07-22 17:07:11 +02:00
Tal Levy	7c84636029	Remove StreamOutput #writeOptionalStreamable and #writeStreamableList (#44602 ) (#44643 ) remove usages of writeOptionalStreamable and writeStreambaleList relates #34389.	2019-07-19 15:55:53 -07:00
Benjamin Trent	2e303fc5f7	[ML][Data Frame] adding dynamic cluster setting for failure retries (#44577 ) (#44639 ) This adds a new dynamic cluster setting `xpack.data_frame.num_transform_failure_retries`. This setting indicates how many times non-critical failures should be retried before a data frame transform is marked as failed and should stop executing. At the time of this commit; Min: 0, Max: 100, Default: 10	2019-07-19 16:17:39 -05:00
Ryan Ernst	f193d14764	Convert remaining Action Response/Request to writeable.reader (#44528 ) (#44607 ) This commit converts readFrom to ctor with StreamInput on the remaining ActionResponse and ActionRequest classes. relates #34389	2019-07-19 13:33:38 -07:00
Christoph Büscher	eafe54c81c	Fix AnalysisMode propagation in NamedAnalyzer (#44626 ) NamedAnalyzer should return the same AnalysisMode than any custom analyzer it wraps, otherwise AnalysisMode.ALL. This used to be only CustomAnalyzer in the past, but with the introduction of the ReloadableCustomAnalyzer this needs to be added as an option where the analysis mode gets propagated. Closes #44625	2019-07-19 18:18:43 +02:00
Ryan Ernst	60785a9fa8	Convert several direct uses of Streamable to Writeable (#44586 ) (#44604 ) This commit converts several utility classes that implement Streamable to have StreamInput constructors. It also adds a default version of readFrom to Streamable so that overriding to throw UOE is not necessary. relates #34389	2019-07-18 21:25:44 -07:00
Ryan Ernst	13f46aa801	Convert index and persistent actions/response to writeable (#44582 ) (#44601 ) This commit converts several more classes from streamable to writeable in server, mostly within the o.e.index and o.e.persistent packages. relates #34389	2019-07-18 18:32:09 -07:00
Tal Levy	03f5084ac7	remove usages of #readOptionalStreamable, #readStreamableList. (#44578 ) (#44598 ) This commit removes references to Streamable from StreamInput. This is all a part of the effort to remove Streamable usage. relates #34389.	2019-07-18 16:19:02 -07:00
Lee Hinman	3001f7941f	Allow empty configuration for SLM policies (#44465 ) * Allow empty configuration for SLM policies When putting or updating a snapshot lifecycle policy it was not possible to elide the `config` map. This commit makes the configuration optional, the same way that it is when taking a snapshot. Relates to #38461 * Add Objects.requireNonNull for required parts of the policy	2019-07-18 16:20:31 -06:00
Lee Hinman	fe2ef66e45	Expose index age in ILM explain output (#44457 ) * Expose index age in ILM explain output This adds the index's age to the ILM explain output, for example: ``` { "indices" : { "ilm-000001" : { "index" : "ilm-000001", "managed" : true, "policy" : "full-lifecycle", "lifecycle_date" : "2019-07-16T19:48:22.294Z", "lifecycle_date_millis" : 1563306502294, "age" : "1.34m", "phase" : "hot", "phase_time" : "2019-07-16T19:48:22.487Z", ... etc ... } } } ``` This age can be used to tell when ILM will transition the index to the next phase, based on that phase's `min_age`. Resolves #38988 * Expose age in getters and in HLRC	2019-07-18 15:33:45 -06:00
Tal Levy	c8a8915b27	migrate rollup/monitoring/graph/watcher actions to Writeable (#44464 ) (#44538 ) this commit migrates leftover actions from a few x-pack plugins to the new Writeable.Reader infrastructure. relates #34389.	2019-07-18 08:42:56 -07:00
Christoph Büscher	e9f257b4d9	Fix ReloadDetailsTests compilation	2019-07-18 11:55:22 +02:00
Christoph Büscher	eed2db8947	Add stream serialization to ReloadAnalyzersResponse (#44420 ) This change adds Writeable support to ReloadAnalyzersResponse that is required when using the transport client in 7.x. Closes #44383	2019-07-18 11:11:54 +02:00
David Kyle	0fc091f166	Enable XLint warnings for ML (#44346 ) Removes the warning suppression -Xlint:-deprecation,-rawtypes,-serial,-try,-unchecked. Many warnings were unchecked warnings in the test code often because of the use of mocks. These are suppressed with @SuppressWarning	2019-07-18 09:33:37 +01:00
Ryan Ernst	edd26339c5	Convert remaining request classes in xpack core to writeable.reader (#44524 ) (#44534 ) This commit converts all remaining classes extending ActionRequest in xpack core to have a StreamInput constructor. relates #34389	2019-07-18 01:11:45 -07:00
Tal Levy	a5ad59451c	migrate more ML actions off of using Request suppliers (#44462 ) (#44529 ) many classes still use the Streamable constructors of HandledTransportAction, this commit moves more of those classes to the new Writeable constructors. relates #34389.	2019-07-17 20:28:29 -07:00
Tal Levy	075a3f0e99	remove usage of ActionType#(String) (#44459 ) (#44526 ) this commit removes usage of the deprecated constructor with a single argument and no Writeable.Reader. The purpose of this is to reduce the boilerplate necessary for properly implementing a new action, as well as reducing the chances of using the incorrect super constructor while classes are being migrated to Writeable relates #34389.	2019-07-17 20:28:11 -07:00
Ryan Ernst	2a2686e6e7	Convert remaining ActionTypes to writeable in xpack core (#44467 ) (#44525 ) This commit converts all remaining ActionType response classes to writeable in xpack core. It also converts a few from server which were used by xpack core. relates #34389	2019-07-17 18:01:45 -07:00
Ryan Ernst	17c4b2b839	Convert MasterNodeRequest to implement Writeable.Reader (#44452 ) (#44513 ) This commit converts all MasterNodeRequest subclasses to fullfill Writeable.Reader constructors. relates #34389	2019-07-17 18:01:29 -07:00
Ryan Ernst	0755a13c9f	Convert AcknowledgedRequest to Writeable.Reader (#44412 ) (#44454 ) This commit adds constructors to AcknolwedgedRequest subclasses to implement Writeable.Reader, and ensures all future subclasses implement the same. relates #34389	2019-07-17 11:17:36 -07:00
Yannick Welsch	d98b3e4760	Move frozen indices to x-pack module (#44490 ) Backport of #44408 and #44286.	2019-07-17 16:53:10 +02:00
Ryan Ernst	6e50bafa8f	Convert Broadcast request and response to use writeable.reader (#44386 ) (#44453 ) This commit converts the request and response classes for broadcast actions to implement ctors for Writeable.Reader and forces all future implementations to implement the same. relates #34389	2019-07-16 23:24:02 -07:00
Tal Levy	901310a826	[7.x] Migrate ML Actions to use writeable ActionType (#44302 ) (#44391 ) * Migrate ML Actions to use writeable ActionType (#44302) This commit converts all the StreamableResponseActionType actions in the ML core module to be ActionType and leverage the Writeable infrastructure.	2019-07-16 12:41:10 -07:00
Przemysław Witek	9613700a63	[7.x] Implement MlConfigIndexMappingsFullClusterRestartIT test which verifies that .ml-config index mappings are properly updated during cluster upgrade (#44341 ) (#44366 )	2019-07-16 21:22:40 +02:00
James Baiera	a3ba11cfc1	Improve CryptoService error message on missing secure file (#43623 ) (#44364 ) This improves the error message when encrypting of sensitive watcher data is configured, but no system file was specified in the keystore. This error message is displayed on startup. This also closes the input stream of the secure file properly. Closes #43619	2019-07-16 13:58:57 -04:00
Benjamin Trent	2c7ff812da	[ML] Add r_squared eval metric to regression (#44248 ) (#44378 ) * [ML] Add r_squared eval metric to regression * fixing tests and binarysoftclassification class * Update RSquared.java * Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/dataframe/evaluation/regression/RSquared.java Co-Authored-By: David Kyle <david.kyle@elastic.co> * removing unnecessary debug test	2019-07-16 11:11:31 -05:00
Lee Hinman	fb0461ac76	[7.x] Add Snapshot Lifecycle Management (#44382 ) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. * Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It does record this information into the cluster state similar to a regularly trigged SLM job. Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retriving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "/30 * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. `read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit add HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of which SLM policy in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. * Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}	2019-07-16 07:37:13 -06:00
Hendrik Muhs	6c1f740759	[ML-DataFrame] make checkpointing more robust (#44344 ) (#44414 ) make checkpointing more robust: - do not let checkpointing fail if indexes got deleted - treat missing seqNoStats as just created indices (checkpoint 0) - loglevel: do not treat failed updated checks as error fixes #43992	2019-07-16 13:43:13 +02:00
Przemysław Witek	3f3a3d3f2b	[7.x] Add DatafeedTimingStats.average_search_time_per_bucket_ms and TimingStats.total_bucket_processing_time_ms stats (#44125 ) (#44404 )	2019-07-16 12:51:29 +02:00
Ryan Ernst	c4cf98c538	Convert core security actions to use writeable ActionType (#44359 ) (#44390 ) This commit converts all the StreamableResponseActionType security classes in xpack core to ActionType, implementing Writeable for their response classes. relates #34389	2019-07-16 01:11:13 -07:00
Jason Tedor	be98a12cd0	Do not swallow I/O exception getting authentication (#44398 ) When getting authentication info from the thread context, it might be that we encounter an I/O exception. Today we swallow this exception and return a null authentication info to the caller. Yet, this could be hiding bugs or errors. This commits adjusts this behavior so that we no longer swallow the exception.	2019-07-16 16:14:15 +09:00
Ryan Ernst	e0b82e92f3	Convert BaseNode(s) Request/Response classes to Writeable (#44301 ) (#44358 ) This commit converts all BaseNodeResponse and BaseNodesResponse subclasses to implement Writeable.Reader instead of Streamable. relates #34389	2019-07-15 18:07:52 -07:00
Yannick Welsch	a848fc9bf4	Revert "Add usage stats for frozen indices (#44286 )" This reverts commit `5e73c49ec8`.	2019-07-15 21:41:25 +02:00
Yannick Welsch	7b68bfb4e6	Revert "Add frozen indices usage for all but transport client (#44286 )" This reverts commit `d2d40afc02`.	2019-07-15 21:41:21 +02:00
Yannick Welsch	d2d40afc02	Add frozen indices usage for all but transport client (#44286 ) Backport gone wrong.	2019-07-15 20:49:23 +02:00
Ryan Ernst	59658daef9	Separate streamable based master node actions (#44313 ) This commit creates new base classes for master node actions whose response types still implement Streamable. This simplifies both finding remaining classes to convert, as well as creating new master node actions that use Writeable for their responses. relates #34389	2019-07-15 09:20:20 -07:00
Yannick Welsch	5e73c49ec8	Add usage stats for frozen indices (#44286 ) Adds usage stats for frozen indices of the form: "frozen_indices" : { "available" : true, "enabled" : true, "indices_count" : 0 }	2019-07-15 17:34:46 +02:00
Ryan Ernst	fc6a31e141	Use specific version constant for wire bwc check (#44316 ) This commit modifies bwc behavior in FindFileStructureAction to check against a concrete version instead of Version.CURRENT. Checking against Version.CURRENT does not work since it is changing, in addition to it having different meanings on each branch. relates #42501	2019-07-14 19:05:14 -07:00
Hendrik Muhs	684b562381	[7.x][ML-DataFrame] Rewrite continuous logic to prevent terms count limit (#44287 ) Rewrites how continuous data frame transforms calculates and handles buckets that require an update. Instead of storing the whole set in memory, it pages through the updates using a 2nd cursor. This lowers memory consumption and prevents problems with limits at query time (max_terms_count). The list of updates can be re-retrieved in a failure case (#43662)	2019-07-13 06:58:04 +02:00
Ryan Ernst	1dcf53465c	Reorder HandledTransportAction ctor args (#44291 ) This commit moves the Supplier variant of HandledTransportAction to have a different ordering than the Writeable.Reader variant. The Supplier version is used for the legacy Streamable, and currently having the location of the Writeable.Reader vs Supplier in the same place forces using casts of Writeable.Reader to select the correct super constructor. This change in ordering allows easier migration to Writeable.Reader. relates #34389	2019-07-12 13:45:09 -07:00
Benjamin Trent	79c62fd724	[ML][Data Frame] Fixing default delay set in timesync (#44281 ) (#44293 ) * [ML][Data Frame] Fixing default delay set in timesync * disallowing explicit null, don't do duration check on write	2019-07-12 15:21:47 -05:00
Benjamin Trent	68cd675892	[ML][Data Frame] responding with 409 status code when failing _stop (#44231 ) (#44276 ) * [ML][Data Frame] responding with appropriate status code when failing _stop * adding null checks for persistent task data * addressing PR comments	2019-07-12 10:10:24 -05:00
Przemysław Witek	dd5f4ae00e	Update .ml-config mappings before indexing job, datafeed or df analytics config (#44216 ) (#44273 )	2019-07-12 16:49:48 +02:00
Benjamin Trent	40cc081ad3	[ML][Data Frame] adds index validations to _start data frame transform (#44191 ) (#44227 ) * [ML][Data Frame] adds index validations to _start data frame transform * addressing pr comments	2019-07-11 12:50:50 -05:00
Benjamin Trent	c82d9c5b50	[ML] Adds support for regression.mean_squared_error to eval API (#44140 ) (#44218 ) * [ML] Adds support for regression.mean_squared_error to eval API * addressing PR comments * fixing tests	2019-07-11 09:22:52 -05:00
Ryan Ernst	fb77d8f461	Removed writeTo from TransportResponse and ActionResponse (#44092 ) The base classes for transport requests and responses currently implement Streamable and Writeable. The writeTo method on these base classes is implemented with an empty implementation. Not only does this complicate subclasses to think they need to call super.writeTo, but it also can lead to not implementing writeTo when it should have been implemented, or extendiong one of these classes when not necessary, since there is nothing to actually implement. This commit removes the empty writeTo from these base classes, and fixes subclasses to not call super and in some cases implement an empty writeTo themselves. relates #34389	2019-07-10 12:42:04 -07:00
Michael Basnight	d2c3f4bae9	Validate read priv of enrich source indices (#43595 ) This commit adds permissions validation on the indices provided in the enrich policy. These indices should be validated at store time so as not to have cryptic error messages in the event the user does not have permissions to access said indices.	2019-07-10 13:09:10 -05:00
Zachary Tong	92ad588275	Remove generic on AggregatorFactory (#43664 ) (#44079 ) AggregatorFactory was generic over itself, but it doesn't appear we use this functionality anywhere (e.g. to allow the super class to declare arguments/return types generically for subclasses to override). Most places use a wildcard constraint, and even when a concrete type is specified it wasn't used. But since AggFactories are widely used, this led to the generic touching many pieces of code and making type signatures fairly complex	2019-07-10 13:20:28 -04:00
Przemysław Witek	44781e415e	[7.x] [ML] Add DatafeedTimingStats to datafeed GetDatafeedStatsAction.Response (#43045 ) (#44118 )	2019-07-10 11:51:44 +02:00
David Roberts	cb62d4acdf	[ML-DataFrame] Add a frequency option to transform config, default 1m (#44120 ) Previously a data frame transform would check whether the source index was changed every 10 seconds. Sometimes it may be desirable for the check to be done less frequently. This commit increases the default to 60 seconds but also allows the frequency to be overridden by a setting in the data frame transform config.	2019-07-10 09:59:00 +01:00
Armin Braun	9eac5ceb1b	Dry up inputstream to bytesreference (#43675 ) (#44094 ) * Dry up Reading InputStream to BytesReference * Dry up spots where we use the same pattern to get from an InputStream to a BytesReferences	2019-07-09 09:18:25 +02:00
Dimitris Athanasiou	a1a62fded3	[7.x][ML] Stop df-analytics action request should filter tasks (#44016 ) (#44023 ) As a `BaseTasksRequest`, `StopDataFrameAnalyticsAction.Request` should implement a `match` method that makes sure only df-analytics tasks are applied.	2019-07-05 23:10:45 +03:00
Jim Ferenczi	cdf55cb5c5	Refactor index engines to manage readers instead of searchers (#43860 ) This commit changes the way we manage refreshes in the index engines. Instead of relying on a SearcherManager, this change uses a ReaderManager that creates ElasticsearchDirectoryReader when needed. Searchers are now created on-demand (when acquireSearcher is called) from the current ElasticsearchDirectoryReader. It also slightly changes the Engine.Searcher to extend IndexSearcher in order to simplify the usage in the consumer.	2019-07-04 22:49:43 +02:00
Martijn van Groningen	7ba6e1752a	required changes after merge	2019-07-04 13:17:22 +02:00
Martijn van Groningen	653f1436a0	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-07-04 13:05:10 +02:00
Christoph Büscher	662f517f4e	Add _reload_search_analyzers endpoint to HLRC (#43733 ) This change adds the new endpoint that allows reloading of search analyzers to the high-level java rest client. Relates to #43313	2019-07-03 12:05:59 +02:00
Dimitris Athanasiou	96b0b27f18	[7.x][ML] Set df-analytics task state to failed when appropriate (#43880 ) (#43906 ) This introduces a `failed` state to which the data frame analytics persistent task is set to when something unexpected fails. It could be the process crashing, the results processor hitting some error, etc. The failure message is then captured and set on the task state. From there, it becomes available via the _stats API as `failure_reason`. The df-analytics stop API now has a `force` boolean parameter. This allows the user to call it for a failed task in order to reset it to `stopped` after we have ensured the failure has been communicated to the user. This commit also adds the analytics version in the persistent task params as this allows us to prevent tasks to run on unsuitable nodes in the future.	2019-07-03 12:41:56 +03:00
Alexander Reelsen	9077c4402f	Watcher: Allow to execute actions for each element in array (#41997 ) This adds the ability to execute an action for each element that occurs in an array, for example you could sent a dedicated slack action for each search hit returned from a search. There is also a limit for the number of actions executed, which is hardcoded to 100 right now, to prevent having watches run forever. The watch history logs each action result and the total number of actions the were executed. Relates #34546	2019-07-03 11:28:50 +02:00
Tim Vernum	2a8f30eb9a	Support builtin privileges in get privileges API (#43901 ) Adds a new "/_security/privilege/_builtin" endpoint so that builtin index and cluster privileges can be retrieved via the Rest API Backport of: #42134	2019-07-03 19:08:28 +10:00
Tim Vernum	31b19bd022	Use separate BitSet cache in Doc Level Security (#43899 ) Document level security was depending on the shared "BitsetFilterCache" which (by design) never expires its entries. However, when using DLS queries - particularly templated ones - the number (and memory usage) of generated bitsets can be significant. This change introduces a new cache specifically for BitSets used in DLS queries, that has memory usage constraints and access time expiry. The whole cache is automatically cleared if the role cache is cleared. Individual bitsets are cleared when the corresponding lucene index reader is closed. The cache defaults to 50MB, and entries expire if unused for 7 days. Backport of: #43669	2019-07-03 18:04:06 +10:00
Benjamin Trent	fb825a6470	[7.x] [ML][Data Frame] add node attr to GET _stats (#43842 ) (#43894 ) * [ML][Data Frame] add node attr to GET _stats (#43842) * [ML][Data Frame] add node attr to GET _stats * addressing testing issues with node.attributes * adjusting for backport	2019-07-02 19:35:37 -05:00
Christoph Büscher	31cf96e7bf	Return reloaded analyzers in _reload_search_ananlyzer response (#43813 ) Currently the repsonse of the "_reload_search_analyzer" endpoint contains the index names and nodeIds of indices were analyzers reloading was triggered. This change add the names of the search-time analyzers that were reloaded. Closes #43804	2019-07-02 18:51:15 +02:00
Dimitris Athanasiou	1ea53979b5	[7.x][ML] Get df-analytics action should require monitor privilege (#43831 ) (#43866 )	2019-07-02 16:00:54 +03:00
Tim Vernum	8d099dad38	Add "manage_api_key" cluster privilege (#43865 ) This adds a new cluster privilege for manage_api_key. Users with this privilege are able to create new API keys (as a child of their own user identity) and may also get and invalidate any/all API keys (including those owned by other users). Backport of: #43728	2019-07-02 21:57:42 +10:00
Benjamin Trent	b95ee7ebb2	[7.x] [ML][Data Frame] using transform creation version for node assignment (#43764 ) (#43843 ) * [ML][Data Frame] using transform creation version for node assignment (#43764) * [ML][Data Frame] using transform creation version for node assignment * removing unused imports * Addressing PR comment * adjusing for backport	2019-07-02 06:52:34 -05:00
Benjamin Trent	82c1ddc117	[7.x] [ML][Data Frame] Add deduced mappings to _preview response payload (#43742 ) (#43849 ) * [ML][Data Frame] Add deduced mappings to _preview response payload (#43742) * [ML][Data Frame] Add deduced mappings to _preview response payload * updating preview docs * fixing code for backport	2019-07-02 06:52:14 -05:00
Tanguy Leroux	b977f019b8	Expose translog stats in ReadOnlyEngine (#43752 ) (#43823 ) Backport of #43752 for 7.x.	2019-07-02 13:39:00 +02:00
Yogesh Gaikwad	031d5e96ac	HLRC changes for kerberos grant type (#43642 ) (#43822 ) The TODO from last PR for kerbero grant type was missed. This commit adds the changes for kerberos grant type in HLRC.	2019-07-02 00:55:02 +10:00
Benjamin Trent	4c95c0c456	[ML][Data Frame] reduce audit frequency, change log msg, and level (#43771 ) (#43818 )	2019-07-01 09:14:26 -05:00
Julie Tibshirani	ffa5919d7c	Add support for 'flattened object' fields. (#43762 ) This commit merges the `object-fields` feature branch. The new 'flattened object' field type allows an entire JSON object to be indexed into a field, and provides limited search functionality over the field's contents.	2019-07-01 12:08:50 +03:00
Dimitris Athanasiou	3bdb9d5f08	[7.x][ML] Correct df-analytics version introduced to 7.3.0 (#43784 ) (#43795 )	2019-07-01 09:19:04 +03:00
Ryan Ernst	3a2c698ce0	Rename Action to ActionType (#43778 ) Action is a class that encapsulates meta information about an action that allows it to be called remotely, specifically the action name and response type. With recent refactoring, the action class can now be constructed as a static constant, instead of needing to create a subclass. This makes the old pattern of creating a singleton INSTANCE both misnamed and lacking a common placement. This commit renames Action to ActionType, thus allowing the old INSTANCE naming pattern to be TYPE on the transport action itself. ActionType also conveys that this class is also not the action itself, although this change does not rename any concrete classes as those will be removed organically as they are converted to TYPE constants. relates #34389	2019-06-30 22:00:17 -07:00
Martijn van Groningen	adcba69d96	required changes after merge	2019-06-30 21:40:40 +02:00
Martijn van Groningen	eb8e03bc8b	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-30 21:32:51 +02:00
Ryan Ernst	28ab77a023	Add StreamableResponseAction to aid in deprecation of Streamable (#43770 ) The Action base class currently works for both Streamable and Writeable response types. This commit intorduces StreamableResponseAction, for which only the legacy Action implementions which provide newResponse() will extend. This eliminates the need for overriding newResponse() with an UnsupportedOperationException. relates #34389	2019-06-28 21:40:00 -07:00
Benjamin Trent	67a3c656c3	[7.x] [ML][Data Frame] removing format support (#43659 ) (#43747 ) * [ML][Data Frame] removing format support (#43659) * Fixing conflicts	2019-06-28 10:02:37 -05:00
Jim Ferenczi	7ca69db83f	Refactor IndexSearcherWrapper to disallow the wrapping of IndexSearcher (#43645 ) This change removes the ability to wrap an IndexSearcher in plugins. The IndexSearcherWrapper is replaced by an IndexReaderWrapper and allows to wrap the DirectoryReader only. This simplifies the creation of the context IndexSearcher that is used on a per request basis. This change also moves the optimization that was implemented in the security index searcher wrapper to the ContextIndexSearcher that now checks the live docs to determine how the search should be executed. If the underlying live docs is a sparse bit set the searcher will compute the intersection betweeen the query and the live docs instead of checking the live docs on every document that match the query.	2019-06-28 16:28:02 +02:00
Dimitris Athanasiou	86c853a7c2	[7.x][ML] Rename outlier score setting to feature_influence_threshold (#43705 ) (#43734 ) Renames outlier score setting `minimum_score_to_write_feature_influence` to `feature_influence_threshold`.	2019-06-28 13:28:25 +03:00
Dimitris Athanasiou	cab879118d	[7.x][ML] Support multiple source indices for df-analytics (#43702 ) (#43731 ) This commit adds support for multiple source indices. In order to deal with multiple indices having different mappings, it attempts a best-effort approach to merge the mappings assuming there are no conflicts. In case conflicts exists an error will be returned. To allow users creating custom mappings for special use cases, the destination index is now allowed to exist before the analytics job runs. In addition, settings are no longer copied except for the `index.number_of_shards` and `index.number_of_replicas`.	2019-06-28 13:28:03 +03:00
Christoph Büscher	2cc7f5a744	Allow reloading of search time analyzers (#43313 ) Currently changing resources (like dictionaries, synonym files etc...) of search time analyzers is only possible by closing an index, changing the underlying resource (e.g. synonym files) and then re-opening the index for the change to take effect. This PR adds a new API endpoint that allows triggering reloading of certain analysis resources (currently token filters) that will then pick up changes in underlying file resources. To achieve this we introduce a new type of custom analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows swapping out analysis components. Custom analyzers that contain filters that are markes as "updateable" will automatically choose this implementation. This PR also adds this capability to `synonym` token filters for use in search time analyzers. Relates to #29051	2019-06-28 09:55:40 +02:00
Przemysław Witek	94f18da5df	Add version and create_time to data frame analytics config (#43683 ) (#43712 )	2019-06-28 07:37:21 +02:00
Ryan Ernst	5b4089e57e	Remove nodeId from BaseNodeRequest (#43658 ) TransportNodesAction provides a mechanism to easily broadcast a request to many nodes, and collect the respones into a high level response. Each node has its own request type, with a base class of BaseNodeRequest. This base request requires passing the nodeId to which the request will be sent. However, that nodeId is not used anywhere. It is private to the base class, yet serialized to each node, where the node could just as easily find the nodeId of the node it is on locally. This commit removes passing the nodeId through to the node request creation, and guards its serialization so that we can remove the base request class altogether in the future.	2019-06-27 18:45:14 -07:00
Nhat Nguyen	ce8771feb7	Do not use MockInternalEngine in GatewayIndexStateIT (#43716 ) GatewayIndexStateIT#testRecoverBrokenIndexMetadata replies on the flushing on shutdown. This behaviour, however, can be randomly disabled in MockInternalEngine. Closes #43034	2019-06-27 18:28:04 -04:00
Przemysław Witek	68dbbd8793	Deduplicate two similar TimeUtils classes. (#43697 ) * Deduplicate org.elasticsearch.xpack.core.dataframe.utils.TimeUtils and org.elasticsearch.xpack.core.ml.utils.time.TimeUtils into a common class: org.elasticsearch.xpack.core.common.time.TimeUtils. * Add unit tests for parseTimeField and parseTimeFieldToInstant methods	2019-06-27 18:51:48 +02:00
Jim Ferenczi	329d05f61e	Fix UOE on search requests that match a sparse role query (#43668 ) Search requests executed through the SecurityIndexSearcherWrapper throw an UnsupportedOperationException if they match a sparse role query. When low level cancellation is activated (which is the default since #42857), the context index searcher creates a weight that doesn't handle #scorer. This change fixes this bug and adds a test to ensure that we check this case.	2019-06-27 16:56:56 +02:00
Martijn van Groningen	683e116601	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-27 08:35:37 +02:00
Benjamin Trent	52e26bbc42	[ML][Data Frame] improve pivot nested field validations (#43548 ) (#43636 ) * [ML][Data Frame] improve pivot nested field validations * addressing pr comments	2019-06-26 13:35:51 -05:00
Benjamin Trent	c121b00c98	[7.x] [ML][Data Frame] Add support for allow_no_match for endpoints (#43490 ) (#43637 ) * [ML][Data Frame] Add support for allow_no_match for endpoints (#43490) * [ML][Data Frame] Add support for allow_no_match parameter in endpoints Adds support for: * Get Transforms * Get Transforms stats * stop transforms * Update DataFrameTransformDocumentationIT.java	2019-06-26 10:09:56 -05:00
Yannick Welsch	2049f715b3	Add voting-only master node (#43410 ) A voting-only master-eligible node is a node that can participate in master elections but will not act as a master in the cluster. In particular, a voting-only node can help elect another master-eligible node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two can still elect a master amongst them-selves. This only requires one of the two remaining nodes to have the capability to act as master, but both need to have voting powers. This means that one of the three master-eligible nodes can be made as voting-only. If this voting-only node is a dedicated master, a less powerful machine or a smaller heap-size can be chosen for this node. Alternatively, a voting-only non-dedicated master node can play the role of the third master-eligible node, which allows running an HA cluster with only two dedicated master nodes. Closes #14340 Co-authored-by: David Turner <david.turner@elastic.co>	2019-06-26 08:07:56 +02:00
Yogesh Gaikwad	480453aa24	Make role descriptors optional when creating API keys (#43481 ) (#43614 ) This commit changes the `role_descriptors` field from required to optional when creating API key. The default behavior in .NET ES client is to omit properties with `null` value requiring additional workarounds. The behavior for the API does not change. Field names (`id`, `name`) in the invalidate api keys API documentation have been corrected where they were wrong. Closes #42053	2019-06-26 14:30:51 +10:00
Przemysław Witek	76a750a0a0	Remove unused mapStringsOrdered method (#42513 ) (#43585 )	2019-06-25 20:43:38 +02:00
Dimitris Athanasiou	126c2fd2d5	[7.x][ML] Machine learning data frame analytics (#43544 ) (#43592 ) This merges the initial work that adds a framework for performing machine learning analytics on data frames. The feature is currently experimental and requires a platinum license. Note that the original commits can be found in the `feature-ml-data-frame-analytics` branch. A new set of APIs is added which allows the creation of data frame analytics jobs. Configuration allows specifying different types of analysis to be performed on a data frame. At first there is support for outlier detection. The APIs are: - PUT _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id}/_stats - POST _ml/data_frame/analysis/{id}/_start - POST _ml/data_frame/analysis/{id}/_stop - DELETE _ml/data_frame/analysis/{id} When a data frame analytics job is started a persistent task is created and started. The main steps of the task are: 1. reindex the source index into the dest index 2. analyze the data through the data_frame_analyzer c++ process 3. merge the results of the process back into the destination index In addition, an evaluation API is added which packages commonly used metrics that provide evaluation of various analysis: - POST _ml/data_frame/_evaluate	2019-06-25 20:29:11 +03:00
Przemysław Witek	c702cd7415	[7.x] Implement XContentParser.genericMap and XContentParser.genericMapOrdered methods (#42059 ) (#43575 )	2019-06-25 16:04:54 +02:00
Przemysław Witek	b15e40ffad	Extract TimingStats-related functionality into TimingStatsReporter (#43371 ) (#43557 )	2019-06-25 15:48:39 +02:00
Martijn van Groningen	36f0e8a8bb	Added multi node enrich tests and fixed serialization issues. (#43386 ) The test for now tests the enrich APIs in a multi node environment. Picked EsIntegTestCase test over a real qa module in order to avoid adding another module that starts a test cluster.	2019-06-25 14:03:10 +02:00
Martijn van Groningen	d0634e444d	Fixed compile errors after merge.	2019-06-25 10:12:16 +02:00
Martijn van Groningen	f587519f17	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-25 10:09:51 +02:00
Gordon Brown	fac7efba9a	[7.x] Account for node versions during allocation in ILM Shrink (#43300 ) This commit ensures that ILM's Shrink action will take node versions into account when choosing which node to allocate to when shrinking an index. Prior to this change, ILM could pick a node with a lower version than some shards are already allocated to, which causes the new allocation to fail as shards can't be relocated onto a node with a lower version than they are already on. As part of this, when making the decision about which node to allocate to prior to Shrink, all shards in the index are considered, rather than choosing a random shard to consider. Further, the unit tests for the logic that chooses a node to allocate shards to pre-shrink has been improved to validate the behavior in more realistic and varied initial conditions.	2019-06-24 10:02:49 -06:00
Michael Basnight	6945e5d5e6	Add role for enrich processor (#42677 ) This commit adds the manage_enrich privilege, which grants access to all of the enrich processor lifecycle actions. In addition this commit also creates a role which grants access to the generated indices. Relates #41939 Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>	2019-06-24 10:47:01 -05:00
Martijn van Groningen	101cf384ba	Replace Streamable w/ Writable in AcknowledgedResponse and subclasses (backport 7.x) (#43525 ) This commit replaces usages of Streamable with Writeable for the AcknowledgedResponse and its subclasses, plus associated actions. Note that where possible response fields were made final and default constructors were removed. This is a large PR, but the change is mostly mechanical. Relates to #34389 Backport of #43414	2019-06-24 13:47:37 +02:00
Martijn van Groningen	df9f06213d	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-21 19:58:04 +02:00
Benjamin Trent	f4b75d6d14	[7.x] [ML][Data Frame] Add version and create_time to transform config (#43384 ) (#43480 ) * [ML][Data Frame] Add version and create_time to transform config (#43384) * [ML][Data Frame] Add version and create_time to transform config * s/transform_version/version s/Date/Instant * fixing getter/setter for version * adjusting for backport	2019-06-21 09:11:44 -05:00
Simon Willnauer	424ef4f158	SecurityIndexSearcherWrapper doesn't always carry over caches and similarity (#43436 ) If DocumentLevelSecurity is enabled SecurityIndexSearcherWrapper doesn't carry over the cache, cache policy and similarity from the incoming searcher.	2019-06-21 10:19:10 +02:00
Tim Vernum	059eb55108	Use SecureString for password length validation (#43465 ) This replaces the use of char[] in the password length validation code, with the use of SecureString Although the use of char[] is not in itself problematic, using a SecureString encourages callers to think about the lifetime of the password object and to clear it after use. Backport of: #42884	2019-06-21 17:11:07 +10:00
Yannick Welsch	7f8e1454ab	Advance checkpoints only after persisting ops (#43205 ) Local and global checkpoints currently do not correctly reflect what's persisted to disk. The issue is that the local checkpoint is adapted as soon as an operation is processed (but not fsynced yet). This leaves room for the history below the global checkpoint to still change in case of a crash. As we rely on global checkpoints for CCR as well as operation-based recoveries, this has the risk of shard copies / follower clusters going out of sync. This commit required changing some core classes in the system: - The LocalCheckpointTracker keeps track now not only of the information whether an operation has been processed, but also whether that operation has been persisted to disk. - TranslogWriter now keeps track of the sequence numbers that have not been fsynced yet. Once they are fsynced, TranslogWriter notifies LocalCheckpointTracker of this. - ReplicationTracker now keeps track of the persisted local and persisted global checkpoints of all shard copies when in primary mode. The computed global checkpoint (which represents the minimum of all persisted local checkpoints of all in-sync shard copies), which was previously stored in the checkpoint entry for the local shard copy, has been moved to an extra field. - The periodic global checkpoint sync now also takes async durability into account, where the local checkpoints on shards only advance when the translog is asynchronously fsynced. This means that the previous condition to detect inactivity (max sequence number is equal to global checkpoint) is not sufficient anymore. - The new index closing API does not work when combined with async durability. The shard verification step is now requires an additional pre-flight step to fsync the translog, so that the main verify shard step has the most up-to-date global checkpoint at disposition.	2019-06-20 11:12:38 +02:00
Martijn van Groningen	9de4e878f7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-20 09:44:31 +02:00
Jason Tedor	1f1a035def	Remove stale test logging annotations (#43403 ) This commit removes some very old test logging annotations that appeared to be added to investigate test failures that are long since closed. If these are needed, they can be added back on a case-by-case basis with a comment associating them to a test failure.	2019-06-19 22:58:22 -04:00
Benjamin Trent	77ce3260dd	[ML][Data Frame] make response.count be total count of hits (#43241 ) (#43389 ) * [ML][Data Frame] make response.count be total count of hits * addressing line length check * changing response count for filters * adjusting serialization, variable name, and total count logic * making count mandatory for creation	2019-06-19 16:19:06 -05:00
Benjamin Trent	b333ced5a7	[7.x] [ML][Data Frame] adds new pipeline field to dest config (#43124 ) (#43388 ) * [ML][Data Frame] adds new pipeline field to dest config (#43124) * [ML][Data Frame] adds new pipeline field to dest config * Adding pipeline support to _preview * removing unused import * moving towards extracting _source from pipeline simulation * fixing permission requirement, adding _index entry to doc * adjusting for java 8 compatibility * adjusting bwc serialization version to 7.3.0	2019-06-19 16:18:27 -05:00
Christos Soulios	d1637ca476	Backport: Refactor aggregation base classes to remove doEquals() and doHashCode() (#43363 ) This PR is a backport a of #43214 from v8.0.0 A number of the aggregation base classes have an abstract doEquals() and doHashCode() (e.g. InternalAggregation.java, AbstractPipelineAggregationBuilder.java). Theoretically this is so the sub-classes can add to the equals/hashCode and don't need to worry about calling super.equals(). In practice, it's mostly just confusing/inconsistent. And if there are more than two levels, we end up with situations like InternalMappedSignificantTerms which has to call super.doEquals() which defeats the point of having these overridable methods. This PR removes the do versions and just use equals/hashCode ensuring the super when necessary.	2019-06-19 22:31:06 +03:00
Gordon Brown	23a3471394	Fix randomization in testPerformActionAttrsRequestFails (#43304 ) The randomization in this test would occasionally generate duplicate node attribute keys, causing spurious test failures. This commit adjusts the randomization to not generate duplicate keys and cleans up the data structure used to hold the generated keys.	2019-06-19 10:34:39 -06:00
Yogesh Gaikwad	2f173402ec	Add kerberos grant_type to get token in exchange for Kerberos ticket (#42847 ) (#43355 ) Kibana wants to create access_token/refresh_token pair using Token management APIs in exchange for kerberos tickets. `client_credentials` grant_type requires every user to have `cluster:admin/xpack/security/token/create` cluster privilege. This commit introduces `_kerberos` grant_type for generating `access_token` and `refresh_token` in exchange for a valid base64 encoded kerberos ticket. In addition, `kibana_user` role now has cluster privilege to create tokens. This allows Kibana to create access_token/refresh_token pair in exchange for kerberos tickets. Note: The lifetime from the kerberos ticket is not used in ES and so even after it expires the access_token/refresh_token pair will be valid. Care must be taken to invalidate such tokens using token management APIs if required. Closes #41943	2019-06-19 18:26:52 +10:00
Mayya Sharipova	aa6248d4d7	Move dense_vector and sparse_vector to module (#43280 ) (#43333 )	2019-06-18 11:56:04 -04:00
Lee Hinman	21da84edbc	Make ILM force merging best effort (#43246 ) It's possible for force merges kicked off by ILM to silently stop (due to a node relocating for example). In which case, the segment count may not reach what the user configured. In the subsequent `SegmentCountStep` waiting for the expected segment count may wait indefinitely. Because of this, this commit makes force merges "best effort" and then changes the `SegmentCountStep` to simply report (at INFO level) if the merge was not successful. Relates to #42824 Resolves #43245	2019-06-17 08:45:22 -06:00
Przemysław Witek	b2613a123d	[7.x] Report exponential_avg_bucket_processing_time which gives more weight to recent buckets (#43189 ) (#43263 )	2019-06-17 08:58:26 +02:00
Przemysław Witek	65a584b6fb	[7.x] Report timing stats as part of the Job stats response (#42709 ) (#43193 )	2019-06-14 09:03:14 +02:00
Jason Tedor	5bc3b7f741	Enable node roles to be pluggable (#43175 ) This commit introduces the possibility for a plugin to introduce additional node roles.	2019-06-13 15:15:48 -04:00
Martijn van Groningen	1f3db7eb3e	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-13 16:49:38 +02:00
Simon Willnauer	f70141c862	Only load FST off heap if we are actually using mmaps for the term dictionary (#43158 ) Given the significant performance impact that NIOFS has when term dicts are loaded off-heap this change enforces FstLoadMode#AUTO that loads term dicts off heap only if the underlying index input indicates a memory map. Relates to #43150	2019-06-13 07:54:02 +02:00
David Kyle	597ae5c7b8	[ML DataFrame] Reject Data Frame Ids containing upper case characters (#43145 )	2019-06-12 18:13:18 +01:00
Benjamin Trent	7ff3d86cf0	[ML][Data Frame] adding dest.index and id validations (#43053 ) (#43109 ) * [ML][Data Frame] adding dest.index and id validations * adjusting message format * Adjusting id validity pattern * Update DataFrameStrings.java	2019-06-11 15:55:18 -05:00
Ryan Ernst	172cd4dbfa	Remove description from xpack feature sets (#43065 ) The description field of xpack featuresets is optionally part of the xpack info api, when using the verbose flag. However, this information is unnecessary, as it is better left for documentation (and the existing descriptions describe anything meaningful). This commit removes the description field from feature sets.	2019-06-11 09:22:58 -07:00
Dimitris Athanasiou	76a92b49a8	[ML] Get resources action should be lenient when sort field is unmapped (#42991 ) (#43046 ) Get resources action sorts on the resource id. When there are no resources at all, then it is possible the index does not contain a mapping for the resource id field. In that case, the search api fails by default. This commit adjusts the search request to ignore unmapped fields. Closes elastic/kibana#37870	2019-06-10 19:50:19 +03:00
Alan Woodward	8e23e4518a	Move construction of custom analyzers into AnalysisRegistry (#42940 ) Both TransportAnalyzeAction and CategorizationAnalyzer have logic to build custom analyzers for index-independent analysis. A lot of this code is duplicated, and it requires the AnalysisRegistry to expose a number of internal provider classes, as well as making some assumptions about when analysis components are constructed. This commit moves the build logic directly into AnalysisRegistry, reducing the registry's API surface considerably.	2019-06-10 14:33:25 +01:00
Jason Tedor	63bad28005	Do not allow modify aliases on followers (#43017 ) Now that aliases are replicated by a follower from its leader, this commit prevents directly modifying aliases on follower indices.	2019-06-09 22:53:54 -04:00
Jason Tedor	915d2f2daa	Refactor put mapping request validation for reuse (#43005 ) This commit refactors put mapping request validation for reuse. The concrete case that we are after here is the ability to apply effectively the same framework to indices aliases requests. This commit refactors the put mapping request validation framework to allow for that.	2019-06-09 10:19:04 -04:00
Tim Vernum	090d42d3e6	Permit API Keys on Basic License (#42973 ) Kibana alerting is going to be built using API Keys, and should be permitted on a basic license. This commit moves API Keys (but not Tokens) to the Basic license Relates: elastic/kibana#36836 Backport of: #42787	2019-06-07 14:18:05 +10:00
henryptung	61b62125b8	Wire query cache into sorting nested-filter computation (#42906 ) Don't use Lucene's default query cache when filtering in sort. Closes #42813	2019-06-06 21:16:58 +02:00
Ioannis Kakavas	ba5b1d3249	Mute failing testPerformActionAttrsRequestFails (#42933 )	2019-06-06 14:28:50 +03:00
David Roberts	b202a59f88	[ML] Add earliest and latest timestamps to field stats (#42890 ) This change adds the earliest and latest timestamps into the field stats for fields of type "date" in the output of the ML find_file_structure endpoint. This will enable the cards for date fields in the file data visualizer in the UI to be made to look more similar to the cards for date fields in the index data visualizer in the UI.	2019-06-06 08:58:35 +01:00
Przemyslaw Gomulka	cfdb1b771e	Enable console audit logs for docker backport#42671 #42887 Enable audit logs in docker by creating console appenders for audit loggers. also rename field @timestamp to timestamp and add field type with value audit The docker build contains now two log4j configuration for oss or default versions. The build now allows override the default configuration. Also changed the format of a timestamp from ISO8601 to include time zone as per this discussion #36833 (comment) closes #42666 backport#42671	2019-06-05 17:15:37 +02:00
Benjamin Trent	293f306b9a	[ML][Data Frame] forcing that no ptask => STOPPED state (#42800 ) (#42860 ) * [ML][Data Frame] forcing that no ptask => STOPPED state * Addressing side-effect, early exit for stop when stopped	2019-06-05 07:09:34 -05:00
Simon Willnauer	ebec118ccf	Bring back ExecutionException after backport	2019-06-05 12:10:02 +02:00
Simon Willnauer	41a9f3ae3b	Use reader attributes to control term dict memory useage (#42838 ) This change makes use of the reader attributes added in LUCENE-8671 to ensure that `_id` fields are always on-heap for best update performance and term dicts are generally off-heap on Read-Only engines. Closes #38390	2019-06-05 11:01:06 +02:00
Jason Tedor	117df87b2b	Replicate aliases in cross-cluster replication (#42875 ) This commit adds functionality so that aliases that are manipulated on leader indices are replicated by the shard follow tasks to the follower indices. Note that we ignore write indices. This is due to the fact that follower indices do not receive direct writes so the concept is not useful. Relates #41815	2019-06-04 20:36:24 -04:00
Mark Vieira	e44b8b1e2e	[Backport] Remove dependency substitutions 7.x (#42866 ) * Remove unnecessary usage of Gradle dependency substitution rules (#42773) (cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)	2019-06-04 13:50:23 -07:00
David Roberts	b61202b0a8	[ML] Add a limit on line merging in find_file_structure (#42501 ) When analysing a semi-structured text file the find_file_structure endpoint merges lines to form multi-line messages using the assumption that the first line in each message contains the timestamp. However, if the timestamp is misdetected then this can lead to excessive numbers of lines being merged to form massive messages. This commit adds a line_merge_size_limit setting (default 10000 characters) that halts the analysis if a message bigger than this is created. This prevents significant CPU time being spent subsequently trying to determine the internal structure of the huge bogus messages.	2019-06-03 13:45:51 +01:00
Benjamin Trent	0253927ec4	[ML Data Frame] Refactor stop logic (#42644 ) (#42763 ) * Revert "invalid test" This reverts commit 9dd8b52c13c716918ff97e6527aaf43aefc4695d. * Testing * mend * Revert "[ML Data Frame] Mute Data Frame tests" This reverts commit 5d837fa312b0e41a77a65462667a2d92d1114567. * Call onStop and onAbort outside atomic update * Don’t update CS * Tidying up * Remove invalid test that asserted logic that has been removed * Add stopped event * Revert "Add stopped event" This reverts commit 02ba992f4818bebd838e1c7678bd2e1cc090bfab. * Adding check for STOPPED in saveState	2019-06-03 06:53:44 -05:00
David Roberts	87ca762573	[ML] Add Kibana application privilege to data frame admin/user roles (#42757 ) Data frame transforms are restricted by different roles to ML, but share the ML UI. To prevent the ML UI being hidden for users who only have the data frame admin or user role, it is necessary to add the ML Kibana application privilege to the backend data frame roles.	2019-05-31 15:41:59 +01:00
Przemysław Witek	f6779de2b7	Increase maximum forecast interval to 10 years. (#41082 ) (#42710 ) Increase the maximum duration to ~10 years (3650 days).	2019-05-31 06:19:47 +02:00
James Baiera	215170b6c3	Merge branch '7.x' into enrich-7.x	2019-05-30 16:13:06 -04:00
Jason Tedor	371cb9a8ce	Remove Log4j 1.2 API as a dependency (#42702 ) We had this as a dependency for legacy dependencies that still needed the Log4j 1.2 API. This appears to no longer be necessary, so this commit removes this artifact as a dependency. To remove this dependency, we had to fix a few places where we were accidentally relying on Log4j 1.2 instead of Log4j 2 (easy to do, since both APIs were on the compile-time classpath). Finally, we can remove our custom Netty logger factory. This was needed when we were on Log4j 1.2 and handled logging in our own unique way. When we migrated to Log4j 2 we could have dropped this dependency. However, even then Netty would still pick up Log4j 1.2 since it was on the classpath, thus the advantage to removing this as a dependency now.	2019-05-30 16:08:07 -04:00
Jay Modi	711de2f59a	Make hashed token ids url safe (#42651 ) This commit changes the way token ids are hashed so that the output is url safe without requiring encoding. This follows the pattern that we use for document ids that are autogenerated, see UUIDs and the associated classes for additional details.	2019-05-30 10:44:41 -06:00
Hendrik Muhs	345ff21ae5	[ML-DataFrame] rewrite start and stop to answer with acknowledged (#42589 ) rewrite start and stop to answer with acknowledged fixes #42450	2019-05-29 11:14:32 +02:00
Michael Basnight	77eed9e6a0	Add enrich policy GET API (#41384 ) This commit wires up the Rest calls and Transport calls for GET enrich policy, as well as tests and rest spec additions.	2019-05-28 23:19:23 -05:00
Michael Basnight	be60125a4e	Merge branch '7.x' into enrich-7.x	2019-05-28 18:32:18 -05:00
Hendrik Muhs	ace96a2b6e	check position before and after latch (#42623 ) check position before and after latch #fixes 42084	2019-05-28 21:23:29 +02:00
Daniel Mitterdorfer	635ce0ca6d	Mute AsyncTwoPhaseIndexerTests#testStateMachine() (#42610 ) Relates #42084 Relates #42609	2019-05-28 10:20:22 +02:00
Armin Braun	a5ca20a250	Some Cleanup in o.e.i.engine (#42278 ) (#42566 ) * Some Cleanup in o.e.i.engine * Remove dead code and parameters * Reduce visibility in some obvious spots * Add missing `assert`s (not that important here since the methods themselves will probably be dead-code eliminated) but still	2019-05-27 11:04:54 +02:00
Martijn van Groningen	a91cec4c46	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-27 10:18:02 +02:00
Tanguy Leroux	6bec876682	Improve Close Index Response (#39687 ) This changes the `CloseIndexResponse` so that it reports closing result for each index. Shard failures or exception are also reported per index, and the global acknowledgment flag is computed from the index results only. The response looks like: ``` { "acknowledged" : true, "shards_acknowledged" : true, "indices" : { "docs" : { "closed" : true } } } ``` The response reports shard failures like: ``` { "acknowledged" : false, "shards_acknowledged" : false, "indices" : { "docs-1" : { "closed" : true }, "docs-2" : { "closed" : false, "shards" : { "1" : { "failures" : [ { "shard" : 1, "index" : "docs-2", "status" : "BAD_REQUEST", "reason" : { "type" : "index_closed_exception", "reason" : "closed", "index_uuid" : "JFmQwr_aSPiZbkAH_KEF7A", "index" : "docs-2" } } ] } } }, "docs-3" : { "closed" : true } } } ``` Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>	2019-05-24 21:57:55 -04:00
Hendrik Muhs	6d47ee9268	[ML-DataFrame] add support for fixed_interval, calendar_interval, remove interval (#42427 ) * add support for fixed_interval, calendar_interval, remove interval * adapt HLRC * checkstyle * add a hlrc to server test * adapt yml test * improve naming and doc * improve interface and add test code for hlrc to server * address review comments * repair merge conflict * fix date patterns * address review comments * remove assert for warning * improve exception message * use constants	2019-05-24 20:30:17 +02:00
Michael Basnight	2325ffb757	Add enrich policy execute API (#41762 ) This commit wires up the Rest calls and Transport calls for execute enrich policy, as well as tests and rest spec additions.	2019-05-24 09:39:29 -05:00
Martijn van Groningen	79fa7d8098	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-24 11:44:35 +02:00
Tim Vernum	567c0d331f	Fix settings prefix for realm truststore password (#42413 ) As part of #30241 realm settings were changed to be true affix settings. In the process of this change, the "ssl." prefix was lost from the realm truststore password. It should be: xpack.security.authc.realms.<type>.<name>.ssl.truststore.password Due to a mismatch between the way we define SSL settings and load SSL contexts, there was no way to define this legacy password setting in a realm config. The settings validation would reject "ssl.truststore.password" but the SSL service would ignore "truststore.password" Backport of: #42336	2019-05-24 13:16:26 +10:00
Mengwei Ding	fa98cbe320	Add .code_internal-* index pattern to kibana user (#42247 ) (#42387 )	2019-05-22 20:25:45 -07:00
Luca Cavanna	c2af62455f	Cut over SearchResponse and SearchTemplateResponse to Writeable (#41855 ) Relates to #34389	2019-05-22 18:47:54 +02:00
Martijn van Groningen	9e514cb161	Remove schedule field from EnrichPolicy (#42143 )	2019-05-22 17:13:54 +02:00
Simon Willnauer	a79cd77e5c	Remove IndexShard dependency from Repository (#42213 ) * Remove IndexShard dependency from Repository In order to simplify repository testing especially for BlobStoreRepository it's important to remove the dependency on IndexShard and reduce it to Store and MapperService (in the snapshot case). This significantly reduces the dependcy footprint for Repository and allows unittesting without starting nodes or instantiate entire shard instances. This change deprecates the old method signatures and adds a unittest for FileRepository to show the advantage of this change. In addition, the unittesting surfaced a bug where the internal file names that are private to the repository were used in the recovery stats instead of the target file names which makes it impossible to relate to the actual lucene files in the recovery stats. * don't delegate deprecated methods * apply comments * test	2019-05-22 14:27:11 +02:00
Hendrik Muhs	ad24231c1a	[ML-DataFrame] validate group name to not contain invalid characters (#42292 ) disallows of creating groupBy field with '[', ']', '>' in the name to be consistent with aggregations	2019-05-22 09:39:59 +02:00
Hendrik Muhs	3493f3b637	move latch await to doNextSearch (#42275 ) move latch await to doNextSearch, fixes a race condition when the executor thread is faster than the coordinator thread fixes #42084	2019-05-22 09:39:59 +02:00
Ioannis Kakavas	34dda75cdf	Ensure SHA256 is not used in tests (#42289 ) SHA256 was recently added to the Hasher class in order to be used in the TokenService. A few tests were still using values() to get the available algorithms from the Enum and it could happen that SHA256 would be picked up by these. This change adds an extra convenience method (Hasher#getAvailableAlgoCacheHash) and enures that only this and Hasher#getAvailableAlgoStoredHash are used for getting the list of available password hashing algorithms in our tests.	2019-05-22 09:54:24 +03:00
Ioannis Kakavas	cdf9485e33	Allow Kibana user to use the OpenID Connect APIs (#42305 ) Add the manage_oidc privilege to the kibana user and to the role privileges list	2019-05-22 09:44:37 +03:00
Michael Basnight	323251c3d1	Merge branch '7.x' into enrich-7.x	2019-05-21 16:51:42 -05:00
David Kyle	7e4d3c695b	[ML Data Frame] Persist and restore checkpoint and position (#41942 ) Persist and restore Data frame's current checkpoint and position	2019-05-21 18:57:13 +01:00
David Kyle	ffefc66260	Mute failing AsyncTwoPhaseIndexerTests See https://github.com/elastic/elasticsearch/issues/42084	2019-05-21 10:24:46 +01:00
David Kyle	24144aead2	[ML] Complete the Data Frame task on stop (#41752 ) (#42063 ) Wait for indexer to stop then complete the persistent task on stop. If the wait_for_completion is true the request will not return until stopped.	2019-05-21 10:24:20 +01:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (e.g. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
Ioannis Kakavas	b4a413c4d0	Hash token values for storage (#41792 ) (#42220 ) This commit changes how access tokens and refresh tokens are stored in the tokens index. Access token values are now hashed before being stored in the id field of the `user_token` and before becoming part of the token document id. Refresh token values are hashed before being stored in the token field of the `refresh_token`. The tokens are hashed without a salt value since these are v4 UUID values that have enough entropy themselves. Both rainbow table attacks and offline brute force attacks are impractical. As a side effect of this change and in order to support multiple concurrent refreshes as introduced in #39631, upon refreshing an <access token, refresh token> pair, the superseding access token and refresh tokens values are stored in the superseded token doc, encrypted with a key that is derived from the superseded refresh token. As such, subsequent requests to refresh the same token in the predefined time window will return the same superseding access token and refresh token values, without hitting the tokens index (as this only stores hashes of the token values). AES in GCM mode is used for encrypting the token values and the key derivation from the superseded refresh token uses a small number of iterations as it needs to be quick. For backwards compatibility reasons, the new behavior is only enabled when all nodes in a cluster are in the required version so that old nodes can cope with the token values in a mixed cluster during a rolling upgrade.	2019-05-20 17:55:29 +03:00
Jay Modi	dbbdcea128	Update ciphers for TLSv1.3 and JDK11 if available (#42082 ) This commit updates the default ciphers and TLS protocols that are used when the runtime JDK supports them. New cipher support has been introduced in JDK 11 and 12 along with performance fixes for AES GCM. The ciphers are ordered with PFS ciphers being most preferred, then AEAD ciphers, and finally those with mainstream hardware support. When available stronger encryption is preferred for a given cipher. This is a backport of #41385 and #41808. There are known JDK bugs with TLSv1.3 that have been fixed in various versions. These are: 1. The JDK's bundled HttpsServer will endless loop under JDK11 and JDK 12.0 (Fixed in 12.0.1) based on the way the Apache HttpClient performs a close (half close). 2. In all versions of JDK 11 and 12, the HttpsServer will endless loop when certificates are not trusted or another handshake error occurs. An email has been sent to the openjdk security-dev list and #38646 is open to track this. 3. In JDK 11.0.2 and prior there is a race condition with session resumption that leads to handshake errors when multiple concurrent handshakes are going on between the same client and server. This bug does not appear when client authentication is in use. This is JDK-8213202, which was fixed in 11.0.3 and 12.0. 4. In JDK 11.0.2 and prior there is a bug where resumed TLS sessions do not retain peer certificate information. This is JDK-8212885. The way these issues are addressed is that the current java version is checked and used to determine the supported protocols for tests that provoke these issues.	2019-05-20 09:45:36 -04:00
Martijn van Groningen	855f5cc6a5	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-20 12:16:57 +02:00
Ed Savage	a68b04e47b	[ML] Improve hard_limit audit message (#42086 ) Improve the hard_limit memory audit message by reporting how many bytes over the configured memory limit the job was at the point of the last allocation failure. Previously the model memory usage was reported, however this was inaccurate and hence of limited use - primarily because the total memory used by the model can decrease significantly after the models status is changed to hard_limit but before the model size stats are reported from autodetect to ES. While this PR contains the changes to the format of the hard_limit audit message it is dependent on modifications to the ml-cpp backend to send additional data fields in the model size stats message. These changes will follow in a subsequent PR. It is worth noting that this PR must be merged prior to the ml-cpp one, to keep CI tests happy.	2019-05-17 17:40:08 -04:00
Tim Vernum	9191b02213	Enforce transport TLS on Basic with Security (#42150 ) If a basic license enables security, then we should also enforce TLS on the transport interface. This was already the case for Standard/Gold/Platinum licenses. For Basic, security defaults to disabled, so some of the process around checking whether security is actuallY enabled is more complex now that we need to account for basic licenses.	2019-05-15 13:59:27 -04:00
Martijn van Groningen	2daf568565	Tidy up EnrichPolicy class (#41877 )	2019-05-12 20:58:09 +02:00
Gordon Brown	a85189a558	Remove toStepKeys from LifecycleAction (#41775 ) The `toStepKeys()` method was only called in its own test case. The real list of StepKeys that's used in action execution is generated from the list of actual step objects returned by `toSteps()`. This commit removes that method.	2019-05-10 16:06:42 -06:00
Benjamin Trent	febee07dcc	[ML] adding pivot.max_search_page_size option for setting paging size (#41920 ) (#42079 ) * [ML] adding pivot.size option for setting paging size * Changing field name to address PR comments * fixing ctor usage * adjust hlrc for field name change	2019-05-10 13:22:31 -05:00
Benjamin Trent	b23b06dded	[ML] verify that there are no duplicate leaf fields in aggs (#41895 ) (#42025 ) * [ML] verify that there are no duplicate leaf fields in aggs * addressing pr comments * addressing PR comments * optmizing duplication check	2019-05-09 14:29:10 -05:00
Martijn van Groningen	57a4614a7b	Keep track of the enrich key field in the enrich index. (#42022 ) The enrich key field is being kept track in _meta field by the policy runner. The ingest processor uses the field name defined in enrich index _meta field and not in the policy. This will avoid problems if policy is changed without a new enrich index being created. This also complete decouples EnrichPolicy from ExactMatchProcessor. The following scenario results in failure without this change: 1) Create policy 2) Execute policy 3) Create pipeline with enrich processor 4) Use pipeline 5) Update enrich key in policy 6) Use pipeline, which then fails.	2019-05-09 21:28:48 +02:00
Michael Basnight	202a840da9	Merge remote-tracking branch 'upstream/7.x' into enrich-7.x	2019-05-08 13:59:01 -05:00
David Kyle	ba9d2ccc1f	[ML Data Frame] Set executing nodes in task actions (#41798 ) Direct the task request to the node executing the task and also refactor the task responses so all errors are returned and set the HTTP status code based on presence of errors.	2019-05-08 12:25:36 +01:00
Ioannis Kakavas	58041f3fdb	Remove op.name configuration setting (#41445 ) This setting was not eventually used in the realm and thus can be removed	2019-05-07 19:01:55 +03:00
Martijn van Groningen	d709b8bb97	Rename enrich policy index_pattern field to indices. (#41836 ) Relates to #32789	2019-05-07 09:08:28 +02:00
Martijn van Groningen	f366f56f00	Change policy runner to use helper method on EnrichPolicy instead of (#41839 ) its own helper method to determine alias / policy base name. This way both the enrich processor and policy runner use the same logic to determine the alias to use. Relates to #32789	2019-05-07 08:55:39 +02:00
Benjamin Trent	50fc27e9a0	[ML] addresses preview bug, and adds check to PUT (#41803 ) (#41850 )	2019-05-06 10:56:26 -05:00
Hendrik Muhs	0c03707704	[ML-DataFrame] reset/clear the position after indexer is done (#41736 ) reset/clear the position after indexer is done	2019-05-06 09:41:51 +02:00
Tim Vernum	ee84038699	Update security acknowledgement messages for basic (#41825 ) When applying a license update, we provide "acknowledgement messages" that indicate which features will be affected by the change in license. This commit updates the messages that are provided when installing a basic license, so that they reflect the changes made to the security features that are included in that license type. Backport of: #41776	2019-05-06 16:40:38 +10:00
Hendrik Muhs	00af42fefe	move checkpoints into x-pack core and introduce base classes for data frame tests (#41783 ) move checkpoints into x-pack core and introduce base classes for data frame tests	2019-05-03 14:16:25 +02:00
Hendrik Muhs	befe2a45b9	[ML-DataFrame] refactor pivot to only take the pivot config (#41763 ) refactor pivot class to only take the config at construction, other parameters are passed in as part of method that require them	2019-05-03 13:37:51 +02:00
Jason Tedor	d0f071236a	Simplify filtering addresses on interfaces (#41758 ) This commit is a refactoring of how we filter addresses on interfaces. In particular, we refactor all of these methods into a common private method. We also change the order of logic to first check if an address matches our filter and then check if the interface is up. This is to possibly avoid problems we are seeing where devices are flapping up and down while we are checking for loopback addresses. We do not expect the loopback device to flap up and down so by reversing the logic here we avoid that problem on CI machines. Finally, we expand the error message when this does occur so that we know which device is flapping.	2019-05-02 16:36:27 -04:00
Michael Basnight	5d53706310	Add enrich policy DELETE API (#41495 ) This commit wires up the Rest calls and Transport calls for DELETE enrich policy, as well as tests and rest spec additions.	2019-05-02 11:02:49 -05:00
Michael Basnight	2978ac3061	Add enrich policy list API (#41553 ) This commit wires up the Rest calls and Transport calls for listing all enrich policies, as well as tests and rest spec additions.	2019-05-02 11:01:26 -05:00
Martijn van Groningen	e429cd7f28	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-02 09:40:25 +02:00
Tim Vernum	0ee16d0115	Security on Basic License This adds support for using security on a basic license. It includes: - AllowedRealmType.NATIVE realms (reserved, native, file) - Roles / RBAC - TLS (already supported) It does not support: - Audit - IP filters - Token Service & API Keys - Advanced realms (AD, LDAP, SAML, etc) - Advanced roles (DLS, FLS) - Pluggable security As with trial licences, security is disabled by default. This commit does not include any new automated tests, but existing tests have been updated.	2019-05-01 14:00:25 -04:00
Jason Tedor	7f3ab4524f	Bump 7.x branch to version 7.2.0 This commit adds the 7.2.0 version constant to the 7.x branch, and bumps BWC logic accordingly.	2019-05-01 13:38:57 -04:00
Albert Zaharovits	990be1f806	Security Tokens moved to a new separate index (#40742 ) This commit introduces the `.security-tokens` and `.security-tokens-7` alias-index pair. Because index snapshotting is at the index level granularity (ie you cannot snapshot a subset of an index) snapshoting .`security` had the undesirable effect of storing ephemeral security tokens. The changes herein address this issue by moving tokens "seamlessly" (without user intervention) to another index, so that a "Security Backup" (ie snapshot of `.security`) would not be bloated by ephemeral data.	2019-05-01 14:53:56 +03:00
Martijn van Groningen	8838bcc776	Add enrich processor (#41532 ) The enrich processor performs a lookup in a locally allocated enrich index shard using a field value from the document being enriched. If there is a match then the _source of the enrich document is fetched. The document being enriched then gets the decorate values from the enrich document based on the configured decorate fields in the pipeline. Note that the usage of the _source field is temporary until the enrich source field that is part of #41521 is merged into the enrich branch. Using the _source field involves significant decompression which not desired for enrich use cases. The policy contains the information what field in the enrich index to query and what fields are available to decorate a document being enriched with. The enrich processor has the following configuration options: * `policy_name` - the name of the policy this processor should use * `enrich_key` - the field in the document being enriched that holds to lookup value * `ignore_missing` - Whether to allow the key field to be missing * `enrich_values` - a list of fields to decorate the document being enriched with. Each entry holds a source field and a target field. The source field indicates what decorate field to use that is available in the policy. The target field controls the field name to use in the document being enriched. The source and target fields can be the same. Example pipeline config: ``` { "processors": [ { "policy_name": "my_policy", "enrich_key": "host_name", "enrich_values": [ { "source": "globalRank", "target": "global_rank" } ] } ] } ``` In the above example documents are being enriched with a global rank value. For each document that has match in the enrich index based on its host_name field, the document gets an global rank field value, which is fetched from the `globalRank` field in the enrich index and saved as `global_rank` in the document being enriched. This is PR is part one of #41521	2019-04-30 20:51:13 +02:00
Benjamin Trent	92a820bc1a	[ML] Add bucket_script agg support to data frames (#41594 ) (#41639 )	2019-04-29 10:14:17 -05:00
Martijn van Groningen	eb9618f1b7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-04-29 09:21:04 +02:00
Yogesh Gaikwad	c0d40ae4ca	Remove deprecated stashWithOrigin calls and use the alternative (#40847 ) (#41562 ) This commit removes the deprecated `stashWithOrigin` and modifies its usage to use the alternative.	2019-04-28 21:25:42 +10:00
Benjamin Trent	a0990ca239	[ML] cleanup + adding description field to transforms (#41554 ) (#41605 ) * [ML] cleanup + adding description field to transforms * making description length have a max of 1k	2019-04-26 16:50:59 -05:00
Chris Earle	858e7f4a62	[7.x] [Monitoring] Add `usage` mapping for `monitoring-kibana` index (#40899 ) (#41601 ) Backports the usage change to 7.x. A separate backport is needed to change the version for 6.7, which will complete this backport.	2019-04-26 16:44:03 -04:00
Armin Braun	aad33121d8	Async Snapshot Repository Deletes (#40144 ) (#41571 ) Motivated by slow snapshot deletes reported in e.g. #39656 and the fact that these likely are a contributing factor to repositories accumulating stale files over time when deletes fail to finish in time and are interrupted before they can complete. * Makes snapshot deletion async and parallelizes some steps of the delete process that can be safely run concurrently via the snapshot thread poll * I did not take the biggest potential speedup step here and parallelize the shard file deletion because that's probably better handled by moving to bulk deletes where possible (and can still be parallelized via the snapshot pool where it isn't). Also, I wanted to keep the size of the PR manageable. * See https://github.com/elastic/elasticsearch/pull/39656#issuecomment-470492106 * Also, as a side effect this gives the `SnapshotResiliencyTests` a little more coverage for master failover scenarios (since parallel access to a blob store repository during deletes is now possible since a delete isn't a single task anymore). * By adding a `ThreadPool` reference to the repository this also lays the groundwork to parallelizing shard snapshot uploads to improve the situation reported in #39657	2019-04-26 15:36:09 +02:00
Benjamin Trent	4836ff7bcd	[ML] add multi node integ tests for data frames (#41508 ) (#41552 ) * [ML] adding native-multi-node-integTests for data frames' * addressing streaming issues * formatting fixes * Addressing PR comments	2019-04-26 07:18:49 -05:00
Michael Basnight	fad45ea6bd	Add enrich policy PUT API (#41383 ) This commit wires up the Rest calls and Transport calls for PUT enrich policy, as well as tests and rest spec additions.	2019-04-25 15:15:25 -05:00
Christoph Büscher	52495843cc	[Docs] Fix common word repetitions (#39703 )	2019-04-25 20:47:47 +02:00
Benjamin Trent	08843ba62b	[ML] Adds progress reporting for transforms (#41278 ) (#41529 ) * [ML] Adds progress reporting for transforms * fixing after master merge * Addressing PR comments * removing unused imports * Adjusting afterKey handling and percentage to be 100* * Making sure it is a linked hashmap for serialization * removing unused import * addressing PR comments * removing unused import * simplifying code, only storing total docs and decrementing * adjusting for rewrite * removing initial progress gathering from executor	2019-04-25 11:23:12 -05:00
Albert Zaharovits	fe5789ada1	Fix Has Privilege API check on restricted indices (#41226 ) The Has Privileges API allows to tap into the authorization process, to validate privileges without actually running the operations to be authorized. This commit fixes a bug, in which the Has Privilege API returned spurious results when checking for index privileges over restricted indices (currently .security, .security-6, .security-7). The actual authorization process is not affected by the bug.	2019-04-25 12:03:27 +03:00
Hendrik Muhs	13d720f4a7	[ML-DataFrame] Replace Streamable w/ Writeable (#41477 ) replace usages of Streamable with Writeable Relates to #36176	2019-04-24 17:17:34 +02:00
Benjamin Trent	07d36fdb23	[ML] refactoring the ML plugin to use the common auditor code (#41419 ) (#41485 )	2019-04-24 09:56:59 -05:00
Michael Basnight	38e6dcd388	Merge remote-tracking branch 'upstream/7.x' into enrich-7.x	2019-04-23 20:38:28 -05:00
Benjamin Trent	e2f8ffdde8	[ML][Data Frame] Moving destination creation to _start (#41416 ) (#41433 ) * [ML][Data Frame] Moving destination creation to _start * slight refactor of DataFrameAuditor constructor	2019-04-23 09:32:57 -05:00
Michael Basnight	860e783f14	Merge remote-tracking branch 'upstream/7.x' into enrich-7.x	2019-04-22 09:39:28 -05:00
Jason Tedor	5c40fc9ba5	Remove script engine from X-Pack plugin (#41387 ) The X-Pack plugin implements ScriptEngine yet it does not actually implement any of the methods on the interface, effectively making this a no-op. This commit removes this interface from the X-Pack plugin.	2019-04-20 08:23:20 -04:00
Martijn van Groningen	d99249c624	Move the policy class to xpack core module. (#41311 ) This allows the transport client use this class in enrich APIs. Relates to #40997	2019-04-18 09:02:43 +02:00
Gordon Brown	66366d0307	Extract template management from Watcher (#41169 ) This commit extracts the template management from Watcher into an abstract class, so that templates and lifecycle policies can be managed in the same way across multiple plugins. This will be useful for SLM, as well as potentially ILM and any other plugins which need to manage index templates.	2019-04-17 13:42:36 -06:00
Zachary Tong	7e62ff2823	[Rollup] Validate timezones based on rules not string comparision (#36237 ) The date_histogram internally converts obsolete timezones (such as "Canada/Mountain") into their modern equivalent ("America/Edmonton"). But rollup just stored the TZ as provided by the user. When checking the TZ for query validation we used a string comparison, which would fail due to the date_histo's upgrading behavior. Instead, we should convert both to a TimeZone object and check if their rules are compatible.	2019-04-17 13:46:44 -04:00
Iana Bondarska	e090176f17	[ML] Exclude analysis fields with core field names from anomaly results (#41093 ) Added "_index", "_type", "_id" to list of reserved fields. Closes #39406	2019-04-17 16:08:03 +01:00
David Kyle	1d2365f5b6	[ML-DataFrame] Refactorings and tidying (#41248 ) Remove unnecessary generic params from SingleGroupSource and unused code from the HLRC	2019-04-17 14:58:26 +01:00
Hendrik Muhs	02247cc7df	[ML-DataFrame] adapt page size on circuit breaker responses (#41149 ) handle circuit breaker response and adapt page size to reduce memory pressure, reduce preview buckets to 100, initial page size to 500	2019-04-16 19:49:43 +02:00
Shaunak Kashyap	750db02b54	Expand beats_system role privileges (#40876 ) (#41232 ) Traditionally we have [recommended](https://www.elastic.co/guide/en/beats/filebeat/current/monitoring.html) that Beats send their monitoring data to the production Elasticsearch cluster. Beats do this by calling the `POST _monitoring/bulk` API. When Security is enabled this API call requires the `cluster:admin/xpack/monitoring/bulk` privilege. The built-in `beats_system` role has this privilege. [Going forward](https://github.com/elastic/beats/pull/9260), Beats will be able to send their monitoring data directly to the monitoring Elasticsearch cluster. Beats will do this by calling the regular `POST _bulk` API. When Security is enabled this API call requires the `indices:data/write/bulk` privilege. Further, the call has to be able to create any indices that don't exist. This PR expands the built-in `beats_system` role's privileges. Specifically, it adds index-level `write` and `create_index` privileges for `.monitoring-beats-*` indices. This will allow Beats users to continue using the `beats_system` role for the new direct monitoring route when Security is enabled.	2019-04-15 20:17:05 -07:00
Martijn van Groningen	f56b2ecb37	Remove xpack dependencies from qa rest modules (#41134 ) (7.x backport) (#41202 ) This commit removes xpack dependencies of many xpack qa modules. (for some qa modules this will require some more work) The reason behind this change is that qa rest modules should not depend on the x-pack plugins, because the plugins are an implementation detail and the tests should only know about the rest interface and qa cluster that is being tested. Also some qa modules rely on xpack plugins and hlrc (which is a valid dependency for rest qa tests) creates a cyclic dependency and this is something that we should avoid. Also Eclipse can't handle gradle cyclic dependencies (see #41064). * don't copy xpack-core's plugin property into the test resource of qa modules. Otherwise installing security manager fails, because it tries to find the XPackPlugin class.	2019-04-15 19:14:43 +02:00

... 6 7 8 9 10 ...

1765 Commits