OpenSearch

Commit Graph

Author	SHA1	Message	Date
Hendrik Muhs	e974f178b5	[Transform] rename data frame transform to transform for hlrc client (#46933) rename data frame transform to transform for hlrc	2019-09-25 08:31:43 +02:00
Martijn van Groningen	0cfddca61d	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-23 09:46:05 +02:00
Lisa Cawley	875d864be6	[DOCS] Update data frame transform URLs (#46940) (#46946)	2019-09-20 15:57:43 -07:00
Hendrik Muhs	abe889af75	[7.5][Transform] rename classes in transform plugin (#46867) rename classes and settings in transform plugin, provide BWC for old settings	2019-09-20 10:43:00 +02:00
Benjamin Trent	9cf9c64ec2	[7.x] [ML][Transforms] remove `force` flag from _start (#46414) (#46748) * [ML][Transforms] remove `force` flag from _start (#46414) * [ML][Transforms] remove `force` flag from _start * fixing expected error message * adjusting bwc version	2019-09-18 10:06:05 -04:00
Tomas Della Vedova	e1cf103980	Fixes for API specification (#46522) (#46736) Follow-up of #42346	2019-09-17 11:49:24 +02:00
Benjamin Trent	92acc732de	[ML][Transform] Use field caps for mapping deduction (#46703) (#46742)	2019-09-16 10:05:55 -04:00
Martijn van Groningen	a4b0f66919	Add enrich stats api (#46462) The enrich api returns enrich coordinator stats and information about currently executing enrich policies. The coordinator stats include per ingest node: * The current number of search requests in the queue. * The total number of outstanding remote requests that have been executed since node startup. Each remote request is likely to include multiple search requests. This depends on how many search requests are in the queue at the time when the remote request is performed. * The number of current outstanding remote requests. * The total number of search requests that `enrich` processors have executed since node startup. The current execution policies stats include: * The name of policy that is executing * A full-blown task info object that is executing the policy. Relates to #32789	2019-09-11 13:40:24 +02:00
Przemysław Witek	e38e631dac	[7.x] Implement DataFrameAnalyticsAuditMessage and DataFrameAnalyticsAuditor (#45967) (#46519)	2019-09-11 12:17:26 +02:00
Lee Hinman	cdc3a260af	Add retention to Snapshot Lifecycle Management (backport of #4… (#46506 ) * Add retention to Snapshot Lifecycle Management (#46407) This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria. An example policy would look like: ``` PUT /_slm/policy/snapshot-every-day { "schedule": "0 30 2 * * ?", "name": "<production-snap-{now/d}>", "repository": "my-s3-repository", "config": { "indices": ["foo-", "important"] }, // Newly configured retention options "retention": { // Snapshots should be deleted after 14 days "expire_after": "14d", // Keep a maximum of thirty snapshots "max_count": 30, // Keep a minimum of the four most recent snapshots "min_count": 4 } } ``` SLM Retention is run on a scheduled configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour. Included in this work is a new SLM stats API endpoint available through ``` json GET /_slm/stats ``` That returns statistics about snapshot taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API. Add base framework for snapshot retention (#43605) * Add base framework for snapshot retention This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask` to start as the basis for SLM's retention implementation. 
Relates to #38461 * Remove extraneous 'public' * Use a local var instead of reading class var repeatedly * Add SnapshotRetentionConfiguration for retention configuration (#43777) * Add SnapshotRetentionConfiguration for retention configuration This commit adds the `SnapshotRetentionConfiguration` class and its HLRC counterpart to encapsulate the configuration for SLM retention. Currently only a single parameter is supported as an example (we still need to discuss the different options we want to support and their names) to keep the size of the PR down. It also does not yet include version serialization checks since the original SLM branch has not yet been merged. Relates to #43663 * Fix REST tests * Fix more documentation * Use Objects.equals to avoid NPE * Put `randomSnapshotLifecyclePolicy` in only one place * Occasionally return retention with no configuration * Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764) * Implement SnapshotRetentionTask's snapshot filtering and deletion This commit implements the snapshot filtering and deletion for `SnapshotRetentionTask`. Currently only the expire-after age is used for determining whether a snapshot is eligible for deletion. Relates to #43663 * Fix deletes running on the wrong thread * Handle missing or null policy in snap metadata differently * Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>> * Use the `OriginSettingClient` to work with security, enhance logging * Prevent NPE in test by mocking Client * Allow empty/missing SLM retention configuration (#45018) Semi-related to #44465, this allows the `"retention"` configuration map to be missing. Relates to #43663 * Add min_count and max_count as SLM retention predicates (#44926) This adds the configuration options for `min_count` and `max_count` as well as the logic for determining whether a snapshot meets this criteria to SLM's retention feature. 
These options are optional and one, two, or all three can be specified in an SLM policy. Relates to #43663 * Time-bound deletion of snapshots in retention delete function (#45065) * Time-bound deletion of snapshots in retention delete function With a cluster that has a large number of snapshots, it's possible that snapshot deletion can take a very long time (especially since deletes currently have to happen in a serial fashion). To prevent snapshot deletion from taking forever in a cluster and blocking other operations, this commit adds a setting to allow configuring a maximum time to spend deletion snapshots during retention. This dynamic setting defaults to 1 hour and is best-effort, meaning that it doesn't hard stop a deletion at an hour mark, but ensures that once the time has passed, all subsequent deletions are deferred until the next retention cycle. Relates to #43663 * Wow snapshots suuuure can take a long time. * Use a LongSupplier instead of actually sleeping * Remove TestLogging annotation * Remove rate limiting * Add SLM metrics gathering and endpoint (#45362) * Add SLM metrics gathering and endpoint This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The stats stored include the number of snapshots taken, failed, deleted, the number of retention runs, as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount of time spent deleting snapshots from SLM retention. 
This commit also adds an endpoint for retrieving all stats (further commits will expose this in the SLM get-policy API) that looks like: ``` GET /_slm/stats { "retention_runs" : 13, "retention_failed" : 0, "retention_timed_out" : 0, "retention_deletion_time" : "1.4s", "retention_deletion_time_millis" : 1404, "policy_metrics" : { "daily-snapshots2" : { "snapshots_taken" : 7, "snapshots_failed" : 0, "snapshots_deleted" : 6, "snapshot_deletion_failures" : 0 }, "daily-snapshots" : { "snapshots_taken" : 12, "snapshots_failed" : 0, "snapshots_deleted" : 12, "snapshot_deletion_failures" : 6 } }, "total_snapshots_taken" : 19, "total_snapshots_failed" : 0, "total_snapshots_deleted" : 18, "total_snapshot_deletion_failures" : 6 } ``` This does not yet include HLRC for this, as this commit is quite large on its own. That will be added in a subsequent commit. Relates to #43663 * Version qualify serialization * Initialize counters outside constructor * Use computeIfAbsent instead of being too verbose * Move part of XContent generation into subclass * Fix REST action for master merge * Unused import * Record history of SLM retention actions (#45513) This commit records the deletion of snapshots by the retention component of SLM into the SLM history index for the purposes of reviewing operations taken by SLM and alerting. * Retry SLM retention after currently running snapshot completes (#45802) * Retry SLM retention after currently running snapshot completes This commit adds a ClusterStateObserver to wait until the currently running snapshot is complete before proceeding with snapshot deletion. SLM retention waits for the maximum allowed deletion time for the snapshot to complete, however, the waiting time is not factored into the limit on actual deletions. 
Relates to #43663 * Increase timeout waiting for snapshot completion * Apply patch From `2374316f0d`.patch * Rename test variables * [TEST] Be less strict for stats checking * Skip SLM retention if ILM is STOPPING or STOPPED (#45869) This adds a check to ensure we take no action during SLM retention if ILM is currently stopped or in the process of stopping. Relates to #43663 * Check all actions preventing snapshot delete during retention (#45992) * Check all actions preventing snapshot delete during retention run Previously we only checked to see if a snapshot was currently running, but it turns out that more things can block snapshot deletion. This changes the check to be a check for: - a snapshot currently running - a deletion already in progress - a repo cleanup in progress - a restore currently running This was found by CI where a third party delete in a test caused SLM retention deletion to throw an exception. Relates to #43663 * Add unit test for okayToDeleteSnapshots * Fix bug where SLM retention task would be scheduled on every node * Enhance test logging * Ignore if snapshot is already deleted * Missing import * Fix SnapshotRetentionServiceTests * Expose SLM policy stats in get SLM policy API (#45989) This also adds support for the SLM stats endpoint to the high level rest client. 
Retrieving a policy now looks like: ```json { "daily-snapshots" : { "version": 1, "modified_date": "2019-04-23T01:30:00.000Z", "modified_date_millis": 1556048137314, "policy" : { "schedule": "0 30 1 * * ?", "name": "<daily-snap-{now/d}>", "repository": "my_repository", "config": { "indices": ["data-", "important"], "ignore_unavailable": false, "include_global_state": false }, "retention": {} }, "stats": { "snapshots_taken": 0, "snapshots_failed": 0, "snapshots_deleted": 0, "snapshot_deletion_failures": 0 }, "next_execution": "2019-04-24T01:30:00.000Z", "next_execution_millis": 1556048160000 } } ``` Relates to #43663 Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356) * Rewrite SnapshotLifecycleIT as as ESIntegTestCase This commit splits `SnapshotLifecycleIT` into two different tests. `SnapshotLifecycleRestIT` which includes the tests that do not require slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an integration test using `MockRepository` to simulate a snapshot being in progress. Relates to #43663 Resolves #46205 * Add error logging when exceptions are thrown * Update serialization versions * Fix type inference * Use non-Cancellable HLRC return value * Fix Client mocking in test * Fix SLMSnapshotBlockingIntegTests for 7.x branch * Update SnapshotRetentionTask for non-multi-repo snapshot retrieval * Add serialization guards for SnapshotLifecyclePolicy	2019-09-10 09:08:09 -06:00
Martijn van Groningen	c057fce978	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-09 08:40:54 +02:00
David Roberts	7c7fb7e32d	[ML] Tolerate total_search_time_ms not mapped in get datafeed stats (#46432) ML users who upgrade from versions prior to 7.4 to 7.4 or later will have ML results indices that do not have mappings for the total_search_time_ms field. Therefore, when searching these indices we must tolerate this field not having a mapping. Fixes #46437	2019-09-06 14:31:15 +01:00
Julie Tibshirani	40c3225d26	First round of optimizations for vector functions. (#46294 ) This PR merges the `vectors-optimize-brute-force` feature branch, which makes the following changes to how vector functions are computed: * Precompute the L2 norm of each vector at indexing time. (#45390) * Switch to ByteBuffer for vector encoding. (#45936) * Decode vectors and while computing the vector function. (#46103) * Use an array instead of a List for the query vector. (#46155) * Precompute the normalized query vector when using cosine similarity. (#46190) Co-authored-by: Mayya Sharipova <mayya.sharipova@elastic.co>	2019-09-04 14:45:57 -07:00
Martijn van Groningen	555b630160	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-09-02 09:16:55 +02:00
Zachary Tong	cf8a4171e1	Rename `data-science` plugin to `analytics` (#46133) Rename `data-science` plugin to `analytics`. Also removes enabled flag. Backport of #46092	2019-08-29 12:45:39 -04:00
Julie Tibshirani	d94c4dcffb	Use float instead of double for query vectors. (#46004) Currently, when using script_score functions like cosineSimilarity, the query vector is treated as an array of doubles. Since the stored document vectors use floats, it seems like the least surprising behavior for the query vectors to also be float arrays. In addition to improving consistency, this change may help with some optimizations we have been considering around vector dot product.	2019-08-28 11:03:14 -07:00
Martijn van Groningen	1157224a6b	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-28 10:14:07 +02:00
Dimitris Athanasiou	873ad3f942	[7.x][ML] Add option to regression to randomize training set (#45969 ) (#46017 ) Adds a parameter `training_percent` to regression. The default value is `100`. When the parameter is set to a value less than `100`, from the rows that can be used for training (ie. those that have a value for the dependent variable) we randomly choose whether to actually use for training. This enables splitting the data into a training set and the rest, usually called testing, validation or holdout set, which allows for validating the model on data that have not been used for training. Technically, the analytics process considers as training the data that have a value for the dependent variable. Thus, when we decide a training row is not going to be used for training, we simply clear the row's dependent variable.	2019-08-27 17:53:11 +03:00
Yogesh Gaikwad	7b6246ec67	Add `manage_own_api_key` cluster privilege (#45897 ) (#46023 ) The existing privilege model for API keys with privileges like `manage_api_key`, `manage_security` etc. are too permissive and we would want finer-grained control over the cluster privileges for API keys. Previously APIs created would also need these privileges to get its own information. This commit adds support for `manage_own_api_key` cluster privilege which only allows api key cluster actions on API keys owned by the currently authenticated user. Also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes, documentation for the API changes. Closes #40031	2019-08-28 00:44:23 +10:00
Dimitris Athanasiou	dd6c13fdf9	[ML] Add description to DF analytics (#45774) (#46019)	2019-08-27 15:48:59 +03:00
Albert Zaharovits	1ebee5bf9b	PKI realm authentication delegation (#45906 ) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396	2019-08-27 14:42:46 +03:00
Zachary Tong	943a016bb2	Add Cumulative Cardinality agg (and Data Science plugin) (#45990 ) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations.	2019-08-26 16:19:55 -04:00
Benjamin Trent	a3a4ae0ac2	[ML] fixing bug where analytics process starts with 0 rows (#45879) (#45988) The native process requires that there be a non-zero number of rows to analyze. If the flag --rows 0 is passed to the executable, it throws and does not start. When building the configuration for the process we should not start the native process if there are no rows. Adding some logging to indicate what is occurring.	2019-08-26 14:18:17 -05:00
Martijn van Groningen	837cfa2640	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-23 11:22:27 +02:00
Nhat Nguyen	3393f9599e	Ignore translog retention policy if soft-deletes enabled (#45473) Since #45136, we use soft-deletes instead of translog in peer recovery. There's no need to retain extra translog to increase a chance of operation-based recoveries. This commit ignores the translog retention policy if soft-deletes is enabled so we can discard translog more quickly. Backport of #45473 Relates #45136	2019-08-22 16:40:06 -04:00
Przemysław Witek	7512337922	[7.x] Allow the user to specify 'query' in Evaluate Data Frame request (#45775) (#45825)	2019-08-22 11:14:26 +02:00
Martijn van Groningen	7f2ba91360	adjusted enrich rest specs to new format	2019-08-21 14:42:10 +02:00
Martijn van Groningen	2677ac14d2	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-21 14:28:17 +02:00
Przemysław Witek	bf701b83d2	Shorten field names in EstimateMemoryUsageResponse (#45719) (#45772)	2019-08-21 12:45:09 +02:00
Przemysław Witek	c6709f0979	Mute tests affected by renaming fields in Estimate memory usage response (#45743) (#45766)	2019-08-21 09:57:23 +02:00
Benjamin Trent	ba7b677618	[ML] better handle empty results when evaluating regression (#45745) (#45759) * [ML] better handle empty results when evaluating regression * adding new failure test to ml_security black list * fixing equality check for regression results	2019-08-20 17:37:04 -05:00
Michael Basnight	e3373d349b	Consolidate enrich list all and get by name APIs (#45705) The get and list APIs are a single API in this commit. Whether requesting one named policy or all policies, a list of policies is returned. The list API code has all been removed and the GET api is what remains, which contains much of the list response code.	2019-08-20 10:29:59 -05:00
Przemysław Witek	80dd0a0948	Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718) (#45725)	2019-08-20 15:49:17 +02:00
Luca Cavanna	c31cddf27e	Update the schema for the REST API specification (#42346) * Update the REST API specification This patch updates the REST API specification in JSON files to better encode deprecated entities, to improve specification of URL paths, and to open up the schema for future extensions. Notably, it changes the `paths` from a list of strings to a list of objects, where each particular object encodes all the information for this particular path: the `parts` and the `methods`. Among the benefits of this approach is eg. encoding the difference between using the `PUT` and `POST` methods in the Index API, to either use a specific document ID, or let Elasticsearch generate one. Also `documentation` becomes an object that supports an `url` and also a `description` which is a new field. * Adapt YAML runner to new REST API specification format The logic for choosing the path to use when running tests has been simplified, as a consequence of the path parts being listed under each path in the spec. The special case for create and index has been removed. Also the parsing code has been hardened so that errors are thrown earlier when the structure of the spec differs from what is expected, and their error messages should be more helpful.	2019-08-16 14:40:00 +02:00
Martijn van Groningen	5ea0985711	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-16 09:47:11 +02:00
Benjamin Trent	0c343d8443	[7.x] [ML][Transforms] adjusting stats.progress for cont. transforms (#45361) (#45551) * [ML][Transforms] adjusting stats.progress for cont. transforms (#45361) * [ML][Transforms] adjusting stats.progress for cont. transforms * addressing PR comments * rename fix * Adjusting bwc serialization versions	2019-08-14 13:08:27 -05:00
Przemysław Witek	df574e5168	[7.x] Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188) (#45510)	2019-08-14 08:26:03 +02:00
Martijn van Groningen	1951cdf1cb	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-13 09:12:31 +02:00
Dimitris Athanasiou	27497ff75f	[7.x][ML] Add regression analysis to DF analytics (#45292) (#45388) This commit adds a first draft of a regression analysis to data frame analytics. There is high probability that the exact syntax might change. This commit adds the new analysis type and its parameters as well as appropriate validation. It also modifies the extractor and the fields detector to be able to handle categorical fields as regression analysis supports them.	2019-08-09 19:31:13 +03:00
David Roberts	14545f8958	[ML-DataFrame] Combine task_state and indexer_state in _stats (#45324) This commit replaces task_state and indexer_state in the data frame _stats output with a single top level state that combines the two. It is defined as: - failed if what's currently reported as task_state is failed - stopped if there is no persistent task - Otherwise what's currently reported as indexer_state Backport of #45276	2019-08-08 16:24:26 +01:00
Martijn van Groningen	708f856940	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-08 16:52:45 +02:00
Benjamin Trent	5db9982f71	[7.x] [ML][Data Frame] Add update transform api endpoint (#45154) (#45279) * [ML][Data Frame] Add update transform api endpoint (#45154) This adds the ability to `_update` stored data frame transforms. All mutable fields are applied when the next checkpoint starts. The exception being `description`. This PR contains all that is necessary for this addition: * HLRC * Docs * Server side	2019-08-07 10:37:35 -05:00
Zachary Tong	422aca9a5d	Fix Rollup job creation to work with templates (#43943 ) The PutJob API accidentally used an "expert" API of CreateIndexRequest. That API is semi-lenient to syntax; a type could be omitted and the request would work as expected. But if a type was omitted it would not merge with templates correctly, leading to index creation that only has the template and not the requested mappings in the request. This commit refactors the PutJob API to: - Include the type name - Use a less "expert" API in an attempt to future proof against errors - Uses an XContentBuilder instead of string replacing, removes json template	2019-08-06 10:53:44 -04:00
Tomas Della Vedova	6b71621afc	Updated slm API spec parameters and URL (#44797) (#45102)	2019-08-02 11:39:52 +02:00
Dimitris Athanasiou	8a6675b994	[7.x][ML] Check dest index is empty when starting DF analytics (#45094 ) (#45112 ) If one tries to start a DF analytics job that has already run, the result will be that the task will fail after reindexing the dest index from the source index. The results of the prior run will be gone and the task state is not properly set to failed with the failure reason. This commit improves the behavior in this scenario. First, we set the task state to `failed` in a set of failures that were missed. Second, a validation is added that if the destination index exists, it must be empty.	2019-08-02 00:19:48 +03:00
Martijn van Groningen	aae2f0cff2	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-08-01 13:38:03 +07:00
Mayya Sharipova	0c68765088	Adds usage stats for vectors (#45023) Example of usage: _xpack/usage "vectors": { "available": true, "enabled": true, "dense_vector_fields_count" : 1, "sparse_vector_fields_count" : 1, "dense_vector_dims_avg_count" : 100 } Backport for #44512	2019-07-31 12:32:41 -04:00
Benjamin Trent	3f48720d41	[ML][Data Frames] unify validation exceptions between PUT/_preview (#44983) (#45012) * [ML][Data Frames] unify validation exceptions between PUT/_preview * addressing PR comments	2019-07-30 13:05:07 -05:00
Martijn van Groningen	db49cb505e	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-07-29 14:45:10 +07:00
Gordon Brown	d4b2d21339	Add option to filter ILM explain response (#44777) In order to make it easier to interpret the output of the ILM Explain API, this commit adds two request parameters to that API: - `only_managed`, which causes the response to only contain indices which have `index.lifecycle.name` set - `only_errors`, which causes the response to contain only indices in an ILM error state "Error state" is defined as either being in the `ERROR` step or having `index.lifecycle.name` set to a policy that does not exist.	2019-07-26 11:57:38 -04:00
James Baiera	c5528a25e6	Merge branch '7.x' into enrich-7.x	2019-07-25 13:12:56 -04:00
Andrei Stefan	2633d11eb7	Switch from using docvalue_fields to extracting values from _source (#44062 ) (#44804 ) * Switch from using docvalue_fields to extracting values from _source where applicable. Doing this means parsing the _source and handling the numbers parsing just like Elasticsearch is doing it when it's indexing a document. * This also introduces a minor limitation: aliases type of fields that are NOT part of a tree of sub-fields will not be able to be retrieved anymore. field_caps API doesn't shed any light into a field being an alias or not and at _source parsing time there is no way to know if a root field is an alias or not. Fields of the type "a.b.c.alias" can be extracted from docvalue_fields, only if the field they point to can be extracted from docvalue_fields. Also, not all fields in a hierarchy of fields can be evaluated to being an alias. (cherry picked from commit 8bf8a055e38f00df5f49c8d97f632f69d6e00c2c)	2019-07-25 10:02:41 +03:00
Przemysław Witek	26da573e94	[ML] [7.x] Only emit deprecation warning if there was actual change of a datafeed's job_id. (#44755) * Only emit deprecation warning if there was actual change of a datafeed's job_id. * Add @Deprecated annotation to DatafeedUpdate.Builder#setJobId method	2019-07-24 10:03:25 +02:00
David Roberts	caf9411a72	[ML] Improve response format of data frame stats endpoint (#44743) This change adjusts the data frame transforms stats endpoint to return a structure that is easier to understand. This is a breaking change for clients of the data frame transforms stats endpoint, but the feature is in beta so stability is not guaranteed. Backport of #44350	2019-07-23 18:00:50 +01:00
Przemysław Witek	16c8e18013	Deprecate the ability to update datafeed's job_id. (#44691) (#44742)	2019-07-23 14:48:56 +02:00
Benjamin Trent	4456850a8e	[7.x] [ML][Data Frame] Add optional defer_validation param to PUT (#44455) (#44697) * [ML][Data Frame] Add optional defer_validation param to PUT (#44455) * [ML][Data Frame] Add optional defer_validation param to PUT * addressing PR comments * reverting bad replace * addressing pr comments * Update put-transform.asciidoc * Update put-transform.asciidoc * Update put-transform.asciidoc * adjusting for backport * fixing imports * [DOCS] Fixes formatting in create data frame transform API	2019-07-22 15:12:55 -05:00
Benjamin Trent	06e21f7902	[7.x] [ML][Data Frame] adding force delete (#44590) (#44696) * [ML][Data Frame] adding force delete (#44590) * [ML][Data Frame] adding force delete * Update delete-transform.asciidoc * adjusting for backport	2019-07-22 13:13:25 -05:00
Yannick Welsch	d98b3e4760	Move frozen indices to x-pack module (#44490) Backport of #44408 and #44286.	2019-07-17 16:53:10 +02:00
Benjamin Trent	2c7ff812da	[ML] Add r_squared eval metric to regression (#44248) (#44378) * [ML] Add r_squared eval metric to regression * fixing tests and binarysoftclassification class * Update RSquared.java * Update x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ml/dataframe/evaluation/regression/RSquared.java Co-Authored-By: David Kyle <david.kyle@elastic.co> * removing unnecessary debug test	2019-07-16 11:11:31 -05:00
Lee Hinman	fb0461ac76	[7.x] Add Snapshot Lifecycle Management (#44382 ) * Add Snapshot Lifecycle Management (#43934) * Add SnapshotLifecycleService and related CRUD APIs This commit adds `SnapshotLifecycleService` as a new service under the ilm plugin. This service handles snapshot lifecycle policies by scheduling based on the policies defined schedule. This also includes the get, put, and delete APIs for these policies Relates to #38461 * Make scheduledJobIds return an immutable set * Use Object.equals for SnapshotLifecyclePolicy * Remove unneeded TODO * Implement ToXContentFragment on SnapshotLifecyclePolicyItem * Copy contents of the scheduledJobIds * Handle snapshot lifecycle policy updates and deletions (#40062) (Note this is a PR against the `snapshot-lifecycle-management` feature branch) This adds logic to `SnapshotLifecycleService` to handle updates and deletes for snapshot policies. Policies with incremented versions have the old policy cancelled and the new one scheduled. Deleted policies have their schedules cancelled when they are no longer present in the cluster state metadata. Relates to #38461 * Take a snapshot for the policy when the SLM policy is triggered (#40383) (This is a PR for the `snapshot-lifecycle-management` branch) This commit fills in `SnapshotLifecycleTask` to actually perform the snapshotting when the policy is triggered. Currently there is no handling of the results (other than logging) as that will be added in subsequent work. This also adds unit tests and an integration test that schedules a policy and ensures that a snapshot is correctly taken. Relates to #38461 * Record most recent snapshot policy success/failure (#40619) Keeping a record of the results of the successes and failures will aid troubleshooting of policies and make users more confident that their snapshots are being taken as expected. This is the first step toward writing history in a more permanent fashion. 
* Validate snapshot lifecycle policies (#40654) (This is a PR against the `snapshot-lifecycle-management` branch) With the commit, we now validate the content of snapshot lifecycle policies when the policy is being created or updated. This checks for the validity of the id, name, schedule, and repository. Additionally, cluster state is checked to ensure that the repository exists prior to the lifecycle being added to the cluster state. Part of #38461 * Hook SLM into ILM's start and stop APIs (#40871) (This pull request is for the `snapshot-lifecycle-management` branch) This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are cancelled. Relates to #38461 * Add tests for SnapshotLifecyclePolicyItem (#40912) Adds serialization tests for SnapshotLifecyclePolicyItem. * Fix improper import in build.gradle after master merge * Add human readable version of modified date for snapshot lifecycle policy (#41035) * Add human readable version of modified date for snapshot lifecycle policy This small change changes it from: ``` ... "modified_date": 1554843903242, ... ``` To ``` ... "modified_date" : "2019-04-09T21:05:03.242Z", "modified_date_millis" : 1554843903242, ... ``` Including the `"modified_date"` field when the `?human` field is used. Relates to #38461 * Fix test * Add API to execute SLM policy on demand (#41038) This commit adds the ability to perform a snapshot on demand for a policy. This can be useful to take a snapshot immediately prior to performing some sort of maintenance. ```json PUT /_ilm/snapshot/<policy>/_execute ``` And it returns the response with the generated snapshot name: ```json { "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug" } ``` Note that this does not allow waiting for the snapshot, and the snapshot could still fail. It does record this information into the cluster state similar to a regularly trigged SLM job. 
Relates to #38461 * Add next_execution to SLM policy metadata (#41221) * Add next_execution to SLM policy metadata This adds the next time a snapshot lifecycle policy will be executed when retrieving a policy's metadata, for example: ```json GET /_ilm/snapshot?human { "production" : { "version" : 1, "modified_date" : "2019-04-15T21:16:21.865Z", "modified_date_millis" : 1555362981865, "policy" : { "name" : "<production-snap-{now/d}>", "schedule" : "/30 * * * ?", "repository" : "repo", "config" : { "indices" : [ "foo-", "important" ], "ignore_unavailable" : true, "include_global_state" : false } }, "next_execution" : "2019-04-15T21:16:30.000Z", "next_execution_millis" : 1555362990000 }, "other" : { "version" : 1, "modified_date" : "2019-04-15T21:12:19.959Z", "modified_date_millis" : 1555362739959, "policy" : { "name" : "<other-snap-{now/d}>", "schedule" : "0 30 2 * ?", "repository" : "repo", "config" : { "indices" : [ "other" ], "ignore_unavailable" : false, "include_global_state" : true } }, "next_execution" : "2019-04-16T02:30:00.000Z", "next_execution_millis" : 1555381800000 } } ``` Relates to #38461 * Fix and enhance tests * Figured out how to Cron * Change SLM endpoint from /_ilm/* to /_slm/* (#41320) This commit changes the endpoint for snapshot lifecycle management from: ``` GET /_ilm/snapshot/<policy> ``` to: ``` GET /_slm/policy/<policy> ``` It mimics the ILM path only using `slm` instead of `ilm`. Relates to #38461 * Add initial documentation for SLM (#41510) * Add initial documentation for SLM This adds the initial documentation for snapshot lifecycle management. It also includes the REST spec API json files since they're sort of documentation. Relates to #38461 * Add `manage_slm` and `read_slm` roles (#41607) * Add `manage_slm` and `read_slm` roles This adds two more built in roles - `manage_slm` which has permission to perform any of the SLM actions, as well as stopping, starting, and retrieving the operation status of ILM. 
`read_slm` which has permission to retrieve snapshot lifecycle policies as well as retrieving the operation status of ILM. Relates to #38461 * Add execute to the test * Fix ilm -> slm typo in test * Record SLM history into an index (#41707) It is useful to have a record of the actions that Snapshot Lifecycle Management takes, especially for the purposes of alerting when a snapshot fails or has not been taken successfully for a certain amount of time. This adds the infrastructure to record SLM actions into an index that can be queried at leisure, along with a lifecycle policy so that this history does not grow without bound. Additionally, SLM automatically setting up an index + lifecycle policy leads to `index_lifecycle` custom metadata in the cluster state, which some of the ML tests don't know how to deal with due to setting up custom `NamedXContentRegistry`s. Watcher would cause the same problem, but it is already disabled (for the same reason). * High Level Rest Client support for SLM (#41767) * High Level Rest Client support for SLM This commit adds HLRC support for SLM. Relates to #38461 * Fill out documentation tests with tags * Add more callouts and asciidoc for HLRC * Update javadoc links to real locations * Add security test testing SLM cluster privileges (#42678) * Add security test testing SLM cluster privileges This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm` cluster privileges. Relates to #38461 * Don't redefine vars * Add Getting Started Guide for SLM (#42878) This commit adds a basic Getting Started Guide for SLM. * Include SLM policy name in Snapshot metadata (#43132) Keep track of the SLM policy name in the metadata field of the Snapshots taken by SLM. This allows users to more easily understand where the snapshot came from, and will enable future SLM features such as retention policies. 
* Fix compilation after master merge * [TEST] Move exception wrapping for devious exception throwing Fixes an issue where an exception was created from one line and thrown in another. * Fix SLM for the change to AcknowledgedResponse * Add Snapshot Lifecycle Management Package Docs (#43535) * Fix compilation for transport actions now that task is required * Add a note mentioning the privileges needed for SLM (#43708) * Add a note mentioning the privileges needed for SLM This adds a note to the top of the "getting started with SLM" documentation mentioning that there are two built-in privileges to assist with creating roles for SLM users and administrators. Relates to #38461 * Mention that you can create snapshots for indices you can't read * Fix REST tests for new number of cluster privileges * Mute testThatNonExistingTemplatesAreAddedImmediately (#43951) * Fix SnapshotHistoryStoreTests after merge * Remove overridden newResponse functions that have been removed * Fix compilation for backport * Fix get snapshot output parsing in test * [DOCS] Add redirects for removed autogen anchors (#44380) * Switch <tt>...</tt> in javadocs for {@code ...}	2019-07-16 07:37:13 -06:00
Przemysław Witek	3f3a3d3f2b	[7.x] Add DatafeedTimingStats.average_search_time_per_bucket_ms and TimingStats.total_bucket_processing_time_ms stats (#44125 ) (#44404 )	2019-07-16 12:51:29 +02:00
Yannick Welsch	a848fc9bf4	Revert "Add usage stats for frozen indices (#44286 )" This reverts commit `5e73c49ec8`.	2019-07-15 21:41:25 +02:00
Yannick Welsch	5e73c49ec8	Add usage stats for frozen indices (#44286 ) Adds usage stats for frozen indices of the form: "frozen_indices" : { "available" : true, "enabled" : true, "indices_count" : 0 }	2019-07-15 17:34:46 +02:00
Benjamin Trent	79c62fd724	[ML][Data Frame] Fixing default delay set in timesync (#44281 ) (#44293 ) * [ML][Data Frame] Fixing default delay set in timesync * disallowing explicit null, don't do duration check on write	2019-07-12 15:21:47 -05:00
Mayya Sharipova	32cb47b91c	Add l1norm and l2norm distances for vectors (#44116 ) Add L1norm - Manhattan distance Add L2norm - Euclidean distance relates to #37947	2019-07-11 14:30:02 -04:00
Benjamin Trent	c82d9c5b50	[ML] Adds support for regression.mean_squared_error to eval API (#44140 ) (#44218 ) * [ML] Adds support for regression.mean_squared_error to eval API * addressing PR comments * fixing tests	2019-07-11 09:22:52 -05:00
Przemysław Witek	44781e415e	[7.x] [ML] Add DatafeedTimingStats to datafeed GetDatafeedStatsAction.Response (#43045 ) (#44118 )	2019-07-10 11:51:44 +02:00
David Roberts	cb62d4acdf	[ML-DataFrame] Add a frequency option to transform config, default 1m (#44120 ) Previously a data frame transform would check whether the source index was changed every 10 seconds. Sometimes it may be desirable for the check to be done less frequently. This commit increases the default to 60 seconds but also allows the frequency to be overridden by a setting in the data frame transform config.	2019-07-10 09:59:00 +01:00
Dimitris Athanasiou	d3ddedf9fc	[7.x][ML] Add missing doc links to df-analytics rest spec and HLRC javadocs (#44025 ) (#44033 )	2019-07-06 02:03:29 +03:00
Mayya Sharipova	37e1ad7062	Forbid empty doc values on vector functions (#43944 ) Currently when a document misses a vector value, vector function returns 0 as a score for this document. We think this is incorrect behaviour. With this change, an error will be thrown if vector functions are used with docs that are missing vector doc values. Also VectorScriptDocValues is modified to allow size() function, which can be used to check if a document has a value for the vector field.	2019-07-05 18:09:06 -04:00
Dimitris Athanasiou	30b20920b9	[7.x][ML] Report correct count for df-analytics get-stats API (#43969 ) (#43981 ) The count should match the number of all df-analytics that matched the id in the request. However, we set the count to the number of df-analytics returned which was bound to the `size` parameter. This commit fixes this by setting the count to the count of the `get` response.	2019-07-05 10:28:57 +03:00
Martijn van Groningen	1dd3d14f09	take into account `manage_enrich` builtin role	2019-07-04 16:51:48 +02:00
Martijn van Groningen	653f1436a0	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-07-04 13:05:10 +02:00
Dimitris Athanasiou	96b0b27f18	[7.x][ML] Set df-analytics task state to failed when appropriate (#43880 ) (#43906 ) This introduces a `failed` state to which the data frame analytics persistent task is set when something unexpected fails. It could be the process crashing, the results processor hitting some error, etc. The failure message is then captured and set on the task state. From there, it becomes available via the _stats API as `failure_reason`. The df-analytics stop API now has a `force` boolean parameter. This allows the user to call it for a failed task in order to reset it to `stopped` after we have ensured the failure has been communicated to the user. This commit also adds the analytics version in the persistent task params as this allows us to prevent tasks from running on unsuitable nodes in the future.	2019-07-03 12:41:56 +03:00
Alexander Reelsen	9077c4402f	Watcher: Allow to execute actions for each element in array (#41997 ) This adds the ability to execute an action for each element that occurs in an array, for example you could send a dedicated slack action for each search hit returned from a search. There is also a limit for the number of actions executed, which is hardcoded to 100 right now, to prevent having watches run forever. The watch history logs each action result and the total number of actions that were executed. Relates #34546	2019-07-03 11:28:50 +02:00
Tim Vernum	2a8f30eb9a	Support builtin privileges in get privileges API (#43901 ) Adds a new "/_security/privilege/_builtin" endpoint so that builtin index and cluster privileges can be retrieved via the Rest API Backport of: #42134	2019-07-03 19:08:28 +10:00
Mayya Sharipova	756c42f99f	Add dims parameter to dense_vector mapping (#43444 ) (#43895 ) Typically, dense vectors of both documents and queries must have the same number of dimensions. A different number of dimensions among documents or query vectors indicates an error. This PR enforces that all vectors for the same field have the same number of dimensions. It also enforces that query vectors have the same number of dimensions.	2019-07-02 21:14:16 -04:00
Benjamin Trent	fb825a6470	[7.x] [ML][Data Frame] add node attr to GET _stats (#43842 ) (#43894 ) * [ML][Data Frame] add node attr to GET _stats (#43842) * [ML][Data Frame] add node attr to GET _stats * addressing testing issues with node.attributes * adjusting for backport	2019-07-02 19:35:37 -05:00
Tim Vernum	8d099dad38	Add "manage_api_key" cluster privilege (#43865 ) This adds a new cluster privilege for manage_api_key. Users with this privilege are able to create new API keys (as a child of their own user identity) and may also get and invalidate any/all API keys (including those owned by other users). Backport of: #43728	2019-07-02 21:57:42 +10:00
Benjamin Trent	82c1ddc117	[7.x] [ML][Data Frame] Add deduced mappings to _preview response payload (#43742 ) (#43849 ) * [ML][Data Frame] Add deduced mappings to _preview response payload (#43742) * [ML][Data Frame] Add deduced mappings to _preview response payload * updating preview docs * fixing code for backport	2019-07-02 06:52:14 -05:00
Tanguy Leroux	b977f019b8	Expose translog stats in ReadOnlyEngine (#43752 ) (#43823 ) Backport of #43752 for 7.x.	2019-07-02 13:39:00 +02:00
Tomas Della Vedova	4cdb24bceb	Use explicit string keys in data_frame test (#43854 )	2019-07-02 11:06:29 +02:00
Julie Tibshirani	ffa5919d7c	Add support for 'flattened object' fields. (#43762 ) This commit merges the `object-fields` feature branch. The new 'flattened object' field type allows an entire JSON object to be indexed into a field, and provides limited search functionality over the field's contents.	2019-07-01 12:08:50 +03:00
Hendrik Muhs	a58d231f4d	relax trigger count for transform stats test (#43753 ) relax trigger count test as we can not guarantee it due to async behaviour	2019-07-01 10:30:40 +02:00
Martijn van Groningen	eb8e03bc8b	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-30 21:32:51 +02:00
Dimitris Athanasiou	86c853a7c2	[7.x][ML] Rename outlier score setting to feature_influence_threshold (#43705 ) (#43734 ) Renames outlier score setting `minimum_score_to_write_feature_influence` to `feature_influence_threshold`.	2019-06-28 13:28:25 +03:00
Dimitris Athanasiou	cab879118d	[7.x][ML] Support multiple source indices for df-analytics (#43702 ) (#43731 ) This commit adds support for multiple source indices. In order to deal with multiple indices having different mappings, it attempts a best-effort approach to merge the mappings assuming there are no conflicts. If conflicts exist, an error will be returned. To allow users to create custom mappings for special use cases, the destination index is now allowed to exist before the analytics job runs. In addition, settings are no longer copied except for the `index.number_of_shards` and `index.number_of_replicas`.	2019-06-28 13:28:03 +03:00
Christoph Büscher	2cc7f5a744	Allow reloading of search time analyzers (#43313 ) Currently changing resources (like dictionaries, synonym files etc...) of search time analyzers is only possible by closing an index, changing the underlying resource (e.g. synonym files) and then re-opening the index for the change to take effect. This PR adds a new API endpoint that allows triggering reloading of certain analysis resources (currently token filters) that will then pick up changes in underlying file resources. To achieve this we introduce a new type of custom analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows swapping out analysis components. Custom analyzers that contain filters that are marked as "updateable" will automatically choose this implementation. This PR also adds this capability to `synonym` token filters for use in search time analyzers. Relates to #29051	2019-06-28 09:55:40 +02:00
Przemysław Witek	94f18da5df	Add version and create_time to data frame analytics config (#43683 ) (#43712 )	2019-06-28 07:37:21 +02:00
David Roberts	f39619d182	[ML] Don't write timing stats on no-op (#43680 ) Similar to elastic/ml-cpp#512, if a job opens and closes and does nothing in between we shouldn't write timing stats to the results index.	2019-06-27 16:37:54 +01:00
Martijn van Groningen	683e116601	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-27 08:35:37 +02:00
Benjamin Trent	d05593c3ad	[ML][Data Frame] adds tests for continuous DF (#43601 ) (#43654 )	2019-06-26 14:59:19 -05:00
Benjamin Trent	52e26bbc42	[ML][Data Frame] improve pivot nested field validations (#43548 ) (#43636 ) * [ML][Data Frame] improve pivot nested field validations * addressing pr comments	2019-06-26 13:35:51 -05:00
Benjamin Trent	c121b00c98	[7.x] [ML][Data Frame] Add support for allow_no_match for endpoints (#43490 ) (#43637 ) * [ML][Data Frame] Add support for allow_no_match for endpoints (#43490) * [ML][Data Frame] Add support for allow_no_match parameter in endpoints Adds support for: * Get Transforms * Get Transforms stats * stop transforms * Update DataFrameTransformDocumentationIT.java	2019-06-26 10:09:56 -05:00
Yannick Welsch	2049f715b3	Add voting-only master node (#43410 ) A voting-only master-eligible node is a node that can participate in master elections but will not act as a master in the cluster. In particular, a voting-only node can help elect another master-eligible node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two can still elect a master amongst themselves. This only requires one of the two remaining nodes to have the capability to act as master, but both need to have voting powers. This means that one of the three master-eligible nodes can be made voting-only. If this voting-only node is a dedicated master, a less powerful machine or a smaller heap-size can be chosen for this node. Alternatively, a voting-only non-dedicated master node can play the role of the third master-eligible node, which allows running an HA cluster with only two dedicated master nodes. Closes #14340 Co-authored-by: David Turner <david.turner@elastic.co>	2019-06-26 08:07:56 +02:00
Tanguy Leroux	0dc1c12f13	Fix indices shown in _cat/indices (#43286 ) After two recent changes (#38824 and #33888), the _cat/indices API no longer reports information for active recovering indices and non-replicated closed indices. It also misreports replicated closed indices that are potentially not authorized for the user. This commit changes how the cat action works by first using the Get Settings API in order to resolve authorized indices. It then uses the Cluster State, Cluster Health and Indices Stats APIs to retrieve information about the indices. Closes #39933	2019-06-25 20:02:34 +02:00
Dimitris Athanasiou	126c2fd2d5	[7.x][ML] Machine learning data frame analytics (#43544 ) (#43592 ) This merges the initial work that adds a framework for performing machine learning analytics on data frames. The feature is currently experimental and requires a platinum license. Note that the original commits can be found in the `feature-ml-data-frame-analytics` branch. A new set of APIs is added which allows the creation of data frame analytics jobs. Configuration allows specifying different types of analysis to be performed on a data frame. At first there is support for outlier detection. The APIs are: - PUT _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id}/_stats - POST _ml/data_frame/analysis/{id}/_start - POST _ml/data_frame/analysis/{id}/_stop - DELETE _ml/data_frame/analysis/{id} When a data frame analytics job is started a persistent task is created and started. The main steps of the task are: 1. reindex the source index into the dest index 2. analyze the data through the data_frame_analyzer c++ process 3. merge the results of the process back into the destination index In addition, an evaluation API is added which packages commonly used metrics that provide evaluation of various analysis: - POST _ml/data_frame/_evaluate	2019-06-25 20:29:11 +03:00
Martijn van Groningen	df9f06213d	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-21 19:58:04 +02:00
Benjamin Trent	f4b75d6d14	[7.x] [ML][Data Frame] Add version and create_time to transform config (#43384 ) (#43480 ) * [ML][Data Frame] Add version and create_time to transform config (#43384) * [ML][Data Frame] Add version and create_time to transform config * s/transform_version/version s/Date/Instant * fixing getter/setter for version * adjusting for backport	2019-06-21 09:11:44 -05:00
Martijn van Groningen	9de4e878f7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-20 09:44:31 +02:00
Benjamin Trent	77ce3260dd	[ML][Data Frame] make response.count be total count of hits (#43241 ) (#43389 ) * [ML][Data Frame] make response.count be total count of hits * addressing line length check * changing response count for filters * adjusting serialization, variable name, and total count logic * making count mandatory for creation	2019-06-19 16:19:06 -05:00
Benjamin Trent	b333ced5a7	[7.x] [ML][Data Frame] adds new pipeline field to dest config (#43124 ) (#43388 ) * [ML][Data Frame] adds new pipeline field to dest config (#43124) * [ML][Data Frame] adds new pipeline field to dest config * Adding pipeline support to _preview * removing unused import * moving towards extracting _source from pipeline simulation * fixing permission requirement, adding _index entry to doc * adjusting for java 8 compatibility * adjusting bwc serialization version to 7.3.0	2019-06-19 16:18:27 -05:00
Mayya Sharipova	aa6248d4d7	Move dense_vector and sparse_vector to module (#43280 ) (#43333 )	2019-06-18 11:56:04 -04:00
Martijn Laarman	8b1b9f8ab9	Introduce stability description to the REST API specification (#38413 ) (#43278 ) * introduce state to the REST API specification * change state over to stability * CCR is now GA, updated to stable * SQL is now GA so marked as stable * Introduce `internal` as state for APIs, marks stable in terms of lifetime but unstable in terms of guarantees on its output format since it exposes internal representations * make setting a wrong stability value, or not setting it at all an error that causes the YAML test suite to fail * update spec files to be explicit about their stability state * Document the fact that stability needs to be defined Otherwise the YAML test runner will fail (with a nice exception message) * address check style violations * update rest spec unit tests to include stability * found one more test spec file not declaring stability, made sure stability appears after documentation everywhere * cluster.state is stable, mark response in some way to denote it's a key value format that can be changed during minors * mark data frame APIs as beta * remove internal and private as states for an API * removed the wrong enum values in the Stability Enum in the previous commit (cherry picked from commit 61c34bbd92f8f7e5f22fa411c6b682b0ebd8a99d)	2019-06-17 16:57:13 +02:00
Przemysław Witek	b2613a123d	[7.x] Report exponential_avg_bucket_processing_time which gives more weight to recent buckets (#43189 ) (#43263 )	2019-06-17 08:58:26 +02:00
Przemysław Witek	65a584b6fb	[7.x] Report timing stats as part of the Job stats response (#42709 ) (#43193 )	2019-06-14 09:03:14 +02:00
Martijn van Groningen	c8e6474eef	Changes required for merging in 7.x branch.	2019-06-13 16:58:27 +02:00
Martijn van Groningen	1f3db7eb3e	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-06-13 16:49:38 +02:00
Luca Cavanna	afeda1a7b9	Split search in two when made against throttled and non throttled searches (#42510 ) When a search on some indices takes a long time, it may cause problems to other indices that are being searched as part of the same search request and being written to as well, because their search context needs to stay open for a long time. This is especially a problem when searching against throttled and non-throttled indices as part of the same request. The problem can be generalized though: this may happen whenever read-only indices are searched together with indices that are being written to. Search contexts staying open for a long time is only an issue for indices that are being written to, in practice. This commit splits the search in two sub-searches: one for read-only indices, and one for ordinary indices. This way the two don't interfere with each other. The split is done only when size is greater than 0, no scroll is provided and query_then_fetch is used as search type. Otherwise, the search executes like before. Note that the returned num_reduce_phases reflects the number of reduction phases that were run. If the search is split in two, there are three reductions: one non-final for each search, and a final one that merges the results of the previous two. Closes #40900	2019-06-12 11:25:03 +02:00
Benjamin Trent	7ff3d86cf0	[ML][Data Frame] adding dest.index and id validations (#43053 ) (#43109 ) * [ML][Data Frame] adding dest.index and id validations * adjusting message format * Adjusting id validity pattern * Update DataFrameStrings.java	2019-06-11 15:55:18 -05:00
Benjamin Trent	e384bf0276	[ML-DataFrame] stop task at completion of data frame function (#42955 ) (#43114 ) * stop data frame task after it finishes * test auto stop * adapt tests * persist the state correctly and move stop into listener * Calling `onStop` even if persistence fails, changing `stop` to rely on doSaveState	2019-06-11 15:55:02 -05:00
Ryan Ernst	172cd4dbfa	Remove description from xpack feature sets (#43065 ) The description field of xpack featuresets is optionally part of the xpack info api, when using the verbose flag. However, this information is unnecessary, as it is better left for documentation (and the existing descriptions do not describe anything meaningful). This commit removes the description field from feature sets.	2019-06-11 09:22:58 -07:00
Martijn Laarman	cb7ce865b7	remove path from rest-api-spec (#41452 ) (#43084 ) (cherry picked from commit f5fde1d0843d2f0f53d3b9a15b9cfc8b94471ab7)	2019-06-11 12:52:36 +02:00
Dimitris Athanasiou	76a92b49a8	[ML] Get resources action should be lenient when sort field is unmapped (#42991 ) (#43046 ) Get resources action sorts on the resource id. When there are no resources at all, then it is possible the index does not contain a mapping for the resource id field. In that case, the search api fails by default. This commit adjusts the search request to ignore unmapped fields. Closes elastic/kibana#37870	2019-06-10 19:50:19 +03:00
David Roberts	b202a59f88	[ML] Add earliest and latest timestamps to field stats (#42890 ) This change adds the earliest and latest timestamps into the field stats for fields of type "date" in the output of the ML find_file_structure endpoint. This will enable the cards for date fields in the file data visualizer in the UI to be made to look more similar to the cards for date fields in the index data visualizer in the UI.	2019-06-06 08:58:35 +01:00
Benjamin Trent	293f306b9a	[ML][Data Frame] forcing that no ptask => STOPPED state (#42800 ) (#42860 ) * [ML][Data Frame] forcing that no ptask => STOPPED state * Addressing side-effect, early exit for stop when stopped	2019-06-05 07:09:34 -05:00
David Roberts	b61202b0a8	[ML] Add a limit on line merging in find_file_structure (#42501 ) When analysing a semi-structured text file the find_file_structure endpoint merges lines to form multi-line messages using the assumption that the first line in each message contains the timestamp. However, if the timestamp is misdetected then this can lead to excessive numbers of lines being merged to form massive messages. This commit adds a line_merge_size_limit setting (default 10000 characters) that halts the analysis if a message bigger than this is created. This prevents significant CPU time being spent subsequently trying to determine the internal structure of the huge bogus messages.	2019-06-03 13:45:51 +01:00
Benjamin Trent	0253927ec4	[ML Data Frame] Refactor stop logic (#42644 ) (#42763 ) * Revert "invalid test" This reverts commit 9dd8b52c13c716918ff97e6527aaf43aefc4695d. * Testing * mend * Revert "[ML Data Frame] Mute Data Frame tests" This reverts commit 5d837fa312b0e41a77a65462667a2d92d1114567. * Call onStop and onAbort outside atomic update * Don’t update CS * Tidying up * Remove invalid test that asserted logic that has been removed * Add stopped event * Revert "Add stopped event" This reverts commit 02ba992f4818bebd838e1c7678bd2e1cc090bfab. * Adding check for STOPPED in saveState	2019-06-03 06:53:44 -05:00
Przemysław Witek	f6779de2b7	Increase maximum forecast interval to 10 years. (#41082 ) (#42710 ) Increase the maximum duration to ~10 years (3650 days).	2019-05-31 06:19:47 +02:00
James Baiera	215170b6c3	Merge branch '7.x' into enrich-7.x	2019-05-30 16:13:06 -04:00
David Kyle	c5a410f68b	[ML Data Frame] Set DF task state when stopping (#42516 ) Set the state to stopped prior to persisting	2019-05-29 16:39:44 +01:00
Hendrik Muhs	345ff21ae5	[ML-DataFrame] rewrite start and stop to answer with acknowledged (#42589 ) rewrite start and stop to answer with acknowledged fixes #42450	2019-05-29 11:14:32 +02:00
Michael Basnight	77eed9e6a0	Add enrich policy GET API (#41384 ) This commit wires up the Rest calls and Transport calls for GET enrich policy, as well as tests and rest spec additions.	2019-05-28 23:19:23 -05:00
Michael Basnight	be60125a4e	Merge branch '7.x' into enrich-7.x	2019-05-28 18:32:18 -05:00
David Kyle	aea600fe7d	[Ml Data Frame] Return bad_request on preview when config is invalid (#42447 )	2019-05-28 15:36:50 +01:00
Martijn van Groningen	a91cec4c46	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-27 10:18:02 +02:00
Hendrik Muhs	6d47ee9268	[ML-DataFrame] add support for fixed_interval, calendar_interval, remove interval (#42427 ) * add support for fixed_interval, calendar_interval, remove interval * adapt HLRC * checkstyle * add a hlrc to server test * adapt yml test * improve naming and doc * improve interface and add test code for hlrc to server * address review comments * repair merge conflict * fix date patterns * address review comments * remove assert for warning * improve exception message * use constants	2019-05-24 20:30:17 +02:00
Michael Basnight	2325ffb757	Add enrich policy execute API (#41762 ) This commit wires up the Rest calls and Transport calls for execute enrich policy, as well as tests and rest spec additions.	2019-05-24 09:39:29 -05:00
Martijn van Groningen	79fa7d8098	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-24 11:44:35 +02:00
David Kyle	a23257ce06	[ML Data Frame] Account for completed data frames in test (#42351 ) When asserting on the checkpoint value if the DF has completed the checkpoint will be 1 else 0. Similarly state may be started or indexing. Closes #42309	2019-05-23 14:05:09 +01:00
Michael Basnight	323251c3d1	Merge branch '7.x' into enrich-7.x	2019-05-21 16:51:42 -05:00
David Kyle	7e4d3c695b	[ML Data Frame] Persist and restore checkpoint and position (#41942 ) Persist and restore Data frame's current checkpoint and position	2019-05-21 18:57:13 +01:00
David Kyle	0fd42ce1f5	[ML Data Frame] Start directly data frame rather than via the scheduler (#42224 ) Trigger indexer start directly to put the indexer in INDEXING state immediately	2019-05-21 15:48:45 +01:00
David Kyle	24144aead2	[ML] Complete the Data Frame task on stop (#41752 ) (#42063 ) Wait for indexer to stop then complete the persistent task on stop. If the wait_for_completion is true the request will not return until stopped.	2019-05-21 10:24:20 +01:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (e.g. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
Martijn van Groningen	855f5cc6a5	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-20 12:16:57 +02:00
Benjamin Trent	febee07dcc	[ML] adding pivot.max_search_page_size option for setting paging size (#41920 ) (#42079 ) * [ML] adding pivot.size option for setting paging size * Changing field name to address PR comments * fixing ctor usage * adjust hlrc for field name change	2019-05-10 13:22:31 -05:00
Benjamin Trent	0931815355	[ML] properly nesting objects in document source (#41901 ) (#42077 ) * [ML] properly nesting objects in document source * Throw exception on agg extraction failure, cause it to fail df * throwing error to stop df if unsupported agg is found	2019-05-10 13:22:12 -05:00
Benjamin Trent	b23b06dded	[ML] verify that there are no duplicate leaf fields in aggs (#41895 ) (#42025 ) * [ML] verify that there are no duplicate leaf fields in aggs * addressing pr comments * addressing PR comments * optimizing duplication check	2019-05-09 14:29:10 -05:00
Michael Basnight	202a840da9	Merge remote-tracking branch 'upstream/7.x' into enrich-7.x	2019-05-08 13:59:01 -05:00
Zachary Tong	f410f91f13	Cleanup RollupSearch exceptions, disallow partial results (#41272 ) - msearch exceptions should be thrown directly instead of wrapping in a RuntimeException - Do not allow partial results (where some indices are missing), instead throw an exception if any index is missing	2019-05-08 12:38:42 -04:00
Benjamin Trent	50fc27e9a0	[ML] addresses preview bug, and adds check to PUT (#41803 ) (#41850 )	2019-05-06 10:56:26 -05:00
Michael Basnight	5d53706310	Add enrich policy DELETE API (#41495 ) This commit wires up the Rest calls and Transport calls for DELETE enrich policy, as well as tests and rest spec additions.	2019-05-02 11:02:49 -05:00
Michael Basnight	2978ac3061	Add enrich policy list API (#41553 ) This commit wires up the Rest calls and Transport calls for listing all enrich policies, as well as tests and rest spec additions.	2019-05-02 11:01:26 -05:00
Martijn van Groningen	e429cd7f28	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-05-02 09:40:25 +02:00
Jason Tedor	7f3ab4524f	Bump 7.x branch to version 7.2.0 This commit adds the 7.2.0 version constant to the 7.x branch, and bumps BWC logic accordingly.	2019-05-01 13:38:57 -04:00
Albert Zaharovits	990be1f806	Security Tokens moved to a new separate index (#40742 ) This commit introduces the `.security-tokens` and `.security-tokens-7` alias-index pair. Because index snapshotting is at the index level granularity (ie you cannot snapshot a subset of an index) snapshotting `.security` had the undesirable effect of storing ephemeral security tokens. The changes herein address this issue by moving tokens "seamlessly" (without user intervention) to another index, so that a "Security Backup" (ie snapshot of `.security`) would not be bloated by ephemeral data.	2019-05-01 14:53:56 +03:00
Martijn van Groningen	eb9618f1b7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-04-29 09:21:04 +02:00
Benjamin Trent	a0990ca239	[ML] cleanup + adding description field to transforms (#41554 ) (#41605 ) * [ML] cleanup + adding description field to transforms * making description length have a max of 1k	2019-04-26 16:50:59 -05:00
Martijn van Groningen	6af17e4bdf	Add enrich qa module for rest tests and (#41568 ) move put policy api yaml test to this rest module. The main benefit is that all tests will then be run when running: `./gradlew -p x-pack/plugin/enrich check` The rest qa module starts a node with default distribution and basic license. This qa module will also be used for adding different rest tests (not yaml), for example rest tests needed for #41532 Also when we are going to work on security integration then we can add a security qa module under the qa folder. Also at some point we should add a multi node qa module.	2019-04-26 20:20:02 +02:00
Michael Basnight	fad45ea6bd	Add enrich policy PUT API (#41383 ) This commit wires up the Rest calls and Transport calls for PUT enrich policy, as well as tests and rest spec additions.	2019-04-25 15:15:25 -05:00
Benjamin Trent	08843ba62b	[ML] Adds progress reporting for transforms (#41278 ) (#41529 ) * [ML] Adds progress reporting for transforms * fixing after master merge * Addressing PR comments * removing unused imports * Adjusting afterKey handling and percentage to be 100* * Making sure it is a linked hashmap for serialization * removing unused import * addressing PR comments * removing unused import * simplifying code, only storing total docs and decrementing * adjusting for rewrite * removing initial progress gathering from executor	2019-04-25 11:23:12 -05:00
Michael Basnight	38e6dcd388	Merge remote-tracking branch 'upstream/7.x' into enrich-7.x	2019-04-23 20:38:28 -05:00
Benjamin Trent	e2f8ffdde8	[ML][Data Frame] Moving destination creation to _start (#41416 ) (#41433 ) * [ML][Data Frame] Moving destination creation to _start * slight refactor of DataFrameAuditor constructor	2019-04-23 09:32:57 -05:00
Martijn Laarman	85b9dc18a7	fix #35262 define deprecations of API's as a whole and urls (#39063 ) * fix #35262 define deprecations of API's as a whole and urls * document hot threads deprecated paths * deprecate scroll_id as part of the URL, documented only as part of the body which is a safer behaviour as well * use version numbers up to patch version * rest spec parser picks up deprecated paths as paths too (cherry picked from commit 7e06023e7603b7584bfd9ee4e8a1ccd82c208ce7)	2019-04-23 14:28:36 +02:00
Michael Basnight	860e783f14	Merge remote-tracking branch 'upstream/7.x' into enrich-7.x	2019-04-22 09:39:28 -05:00
Zachary Tong	7e62ff2823	[Rollup] Validate timezones based on rules not string comparison (#36237 ) The date_histogram internally converts obsolete timezones (such as "Canada/Mountain") into their modern equivalent ("America/Edmonton"). But rollup just stored the TZ as provided by the user. When checking the TZ for query validation we used a string comparison, which would fail due to the date_histo's upgrading behavior. Instead, we should convert both to a TimeZone object and check if their rules are compatible.	2019-04-17 13:46:44 -04:00
Yogesh Gaikwad	6a552c05fe	Use alias name from rollover request to query indices stats (#40774 ) (#41284 ) In `TransportRolloverAction` before doing rollover we resolve source index name (write index) from the alias in the rollover request. Before evaluating the conditions and executing rollover action, we retrieve stats, but to do so we used the source index name resolved from the alias instead of alias from the index. This fails when the user is assigned a role with index privilege on the alias instead of the concrete index. This commit fixes this by using the alias from the request. After this change, verified that when we retrieve all the stats (including write + read indexes) we are considering only source index. Closes #40771	2019-04-17 14:15:05 +10:00
David Kyle	2b539f8347	[ML DataFrame] Data Frame stop all (#41156 ) Wild card support for the data frame stop API	2019-04-15 15:04:28 +01:00
Nik Everett	c379206c1e	Fix some documentation urls in rest-api-spec (#40618 ) (#41145 ) Fixes some documentation urls in the rest-api-spec. Some of these URLs pointed to 404s and a few others pointed to deprecated documentation when we have better documentation now. I'm not consistent about `master` vs `current` because we're not consistent in other places and I think we should solve all of those at once with something a little more automatic.	2019-04-12 10:11:14 -04:00
Martijn van Groningen	b66ad34565	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-04-12 08:24:59 +02:00
Przemysław Witek	f5014ace64	[ML] Add validation that rejects duplicate detectors in PutJobAction (#40967 ) (#41072 ) * [ML] Add validation that rejects duplicate detectors in PutJobAction Closes #39704 * Add YML integration test for duplicate detectors fix. * Use "== false" comparison rather than "!" operator. * Refine error message to sound more natural. * Put job description in square brackets in the error message. * Use the new validation in ValidateJobConfigAction. * Exclude YML tests for new validation from permission tests.	2019-04-10 15:43:35 +02:00
Hendrik Muhs	f9018ab11b	[ML-DataFrame] create checkpoints on every new run (#40725 ) Use the checkpoint service to create a checkpoint on every new run. Expose checkpoints stats on _stats endpoint.	2019-04-10 09:14:11 +02:00
Martijn van Groningen	5a1d5cca4f	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-04-09 09:59:24 +02:00
Benjamin Trent	665f0d81aa	[ML] refactoring start task a bit, removing unused code (#40798 ) (#40845 )	2019-04-05 09:01:01 -05:00
Martijn van Groningen	e6bdfea474	first commit	2019-04-05 09:32:40 +02:00
Benjamin Trent	945e7ca01e	[ML] Periodically persist data-frame running statistics to internal index (#40650 ) (#40729 ) * [ML] Add mappings, serialization, and hooks to persist stats * Adding tests for transforms without tasks having stats persisted * intermittent commit * Adjusting usage stats to account for stored stats docs * Adding tests for id expander * Addressing PR comments * removing unused import * adding shard failures to the task response	2019-04-02 14:16:55 -05:00
Tim Vernum	2c770ba3cb	Support mustache templates in role mappings (#40571 ) This adds a new `role_templates` field to role mappings that is an alternative to the existing roles field. These templates are evaluated at runtime to determine which roles should be granted to a user. For example, it is possible to specify: "role_templates": [ { "template":{ "source": "_user_{{username}}" } } ] which would mean that every user is assigned to their own role based on their username. You may not specify both roles and role_templates in the same role mapping. This commit adds support for templates to the role mapping API, the role mapping engine, the Java high level rest client, and Elasticsearch documentation. Due to the lack of caching in our role mapping store, it is currently inefficient to use a large number of templated role mappings. This will be addressed in a future change. Backport of: #39984, #40504	2019-04-02 20:55:10 +11:00
Tim Vernum	7bdd41399d	Support roles with application privileges against wildcard applications (#40675 ) This commit introduces 2 changes to application privileges: - The validation rules now accept a wildcard in the "suffix" of an application name. Wildcards were always accepted in the application name, but the "valid filename" check for the suffix incorrectly prevented the use of wildcards there. - A role may now be defined against a wildcard application (e.g. kibana-*) and this will be correctly treated as granting the named privileges against all named applications. This does not allow wildcard application names in the body of a "has-privileges" check, but the "has-privileges" check can test concrete application names against roles with wildcards. Backport of: #40398	2019-04-02 14:48:39 +11:00
Benjamin Trent	12943c5d2c	[ML] Add data frame task state object and field (#40169 ) (#40490 ) * [ML] Add data frame task state object and field * A new state item is added so that the overall task state can be accounted for * A new FAILED state and reason have been added as well so that failures can be shown to the user for optional correction * Addressing PR comments * adjusting after master merge * addressing pr comment * Adjusting auditor usage with failure state * Refactor, renamed state items to task_state and indexer_state * Adding todo and removing redundant auditor call * Address HLRC changes and PR comment * adjusting hlrc IT test	2019-03-27 06:53:58 -05:00
Benjamin Trent	7b4f964708	[ML] make source and dest objects in the transform config (#40337 ) (#40396 ) * [ML] make source and dest objects in the transform config * addressing PR comments * Fixing compilation post merge * adding comment for Arrays.hashCode * addressing changes for moving dest to object * fixing data_frame yml tests * fixing API test	2019-03-25 07:16:41 -05:00
Benjamin Trent	2dd879abac	[ML] adds support for non-numeric mapped types (#40220 ) (#40380 ) * [ML] adds support for non-numeric mapped types and mapping overrides * correcting hlrc compilation issues after merge * removing mapping_override option * clearing up unnecessary changes	2019-03-23 14:04:14 -05:00
Lisa Cawley	e6799849d1	[DOCS] Adds placeholder for start and stop data frame transform APIs (#40278 )	2019-03-21 09:39:10 -07:00
Lisa Cawley	caa0129d44	[DOCS] Adds placeholder for create and delete data frame transform APIs (#40233 )	2019-03-21 09:13:50 -07:00
lcawl	0e712d476e	Adds URL for preview data frame transforms	2019-03-21 08:28:23 -07:00
Lisa Cawley	ff2bcc9d11	[DOCS] Adds placeholder for get data frame transform APIs (#40283 )	2019-03-21 07:57:01 -07:00
Yogesh Gaikwad	5d30df5a60	Fix so non super users can also create API keys (#40028 ) (#40286 ) When creating API keys we check whether an API key with the same key name already exists and fail the request if it does. The check should have been performed with XPackSecurityUser instead of the authenticated user. This caused the request to fail when a non-super user tried to create an API key. This commit fixes this by executing the search action with SECURITY_ORIGIN so it can be executed with XPackSecurityUser. Also fixed the Rest test to avoid using a user with `super_user` role. Closes #40029	2019-03-21 15:53:25 +11:00
Benjamin Trent	5ae43855fc	[ML] Refactor GET Transforms API (#40015 ) (#40269 ) * [Data Frame] Refactor GET Transforms API: * Add pagination * comma delimited list expression support GET transforms * Flag troublesome internal code for future refactor * Removing `allow_no_transforms` param, ratcheting down pageparam option * Changing DataFrameFeatureSet#usage to not get all configs * Intermediate commit * Writing test for batch data gatherer * Removing unused import * removing bad println used for debugging * Updating BatchedDataIterator comments and query * addressing pr comments * disallow null scrollId to cause stackoverflow	2019-03-20 19:14:50 -05:00
David Kyle	387648065d	[ML] Data Frame HLRC start & stop APIs (#40197 )	2019-03-19 13:30:01 +00:00
Gordon Brown	c8a4a7fc9d	Remove Migration Upgrade and Assistance APIs (#40075 ) The Migration Assistance API has been functionally replaced by the Deprecation Info API, and the Migration Upgrade API is not used for the transition from ES 6.x to 7.x, and does not need to be kept around to repair indices that were not properly upgraded before upgrading the cluster, as was the case in 6.	2019-03-18 13:46:56 -06:00
Benjamin Trent	2016e23285	[ML] Refactor common utils out of ML plugin to XPack.Core (#39976 ) (#40009 ) * [ML] Refactor common utils out of ML plugin to XPack.Core * implementing GET filters with abstract transport * removing added rest param * adjusting how defaults can be supplied	2019-03-13 17:08:43 -05:00
Benjamin Trent	8c6ff5de31	[Data Frame] Refactor PUT transform to not create a task (#39934 ) (#40010 ) * [Data Frame] Refactor PUT transform such that: * POST _start creates the task and starts it * GET transforms queries docs instead of tasks * POST _stop verifies the stored config exists before trying to stop the task * Addressing PR comments * Refactoring DataFrameFeatureSet#usage, decreasing size returned getTransformConfigurations * fixing failing usage test	2019-03-13 17:08:15 -05:00
David Kyle	48788269b0	[ML] Correct small inconsistencies in ml APIs spec and docs (#39907 )	2019-03-11 14:02:50 +00:00
Benjamin Trent	6c6549fc51	[Data-Frame] make the config be strictly parsed on _preview (#39713 ) (#39873 ) * [Data-Frame] make the config be strictly parsed on _preview * adding test to verify strictly parsing * adjusting test after master merge	2019-03-09 14:03:57 -06:00
Jason Tedor	0250d554b6	Introduce forget follower API (#39718 ) This commit introduces the forget follower API. This API is needed in cases that unfollowing a following index fails to remove the shard history retention leases on the leader index. This can happen explicitly through user action, or implicitly through an index managed by ILM. When this occurs, history will be retained longer than necessary. While the retention lease will eventually expire, it can be expensive to allow history to persist for that long, and also prevent ILM from performing actions like shrink on the leader index. As such, we introduce an API to allow for manual removal of the shard history retention leases in this case.	2019-03-07 11:08:45 -05:00
Yogesh Gaikwad	c91dcbd5ee	Types removal security index template (#39705 ) (#39728 ) As we are moving to single type indices, we need to address this change in security-related indexes. To address this, we are - updating index templates to use preferred type name `_doc` - updating the API calls to use preferred type name `_doc` Upgrade impact:- In case of an upgrade from 6.x, the security index has type `doc` and this will keep working as there is a single type and `_doc` works as an alias to an existing type. The change is handled in the `SecurityIndexManager` when we load mappings and settings from the template. Previously, we used to do a `PutIndexTemplateRequest` with the mapping source JSON with the type name. This has been modified to remove the type name from the source. So in the case of an upgrade, the `doc` type is updated whereas for fresh installs `_doc` is updated. This happens as backend handles `_doc` as an alias to the existing type name. An optional step is to `reindex` security index and update the type to `_doc`. Since we do not support the security audit log index, that template has been deleted. Relates: #38637	2019-03-06 18:53:59 +11:00
Tomas Della Vedova	fad52acf5a	Removed incorrect ML YAML tests (#39400 ) A client cannot know that a job_id is already taken, so this test should not have been specified as a client test	2019-03-05 17:13:10 +00:00
Martijn Laarman	52ecf18dc4	Index on rollup.rollup_search.json is a list (#39097 ) (#39653 ) And not a string since it accepts comma separated list of indices. (cherry picked from commit cf34d50b3a983b5fc0c9c7aa279cecd4aa10e28b)	2019-03-04 15:23:18 +01:00
Martijn Laarman	c2a94aabbc	ilm.explain_lifecycle documents human again (#39113 ) (#39648 ) This is already exposed as a `_common.json` global parameter. (cherry picked from commit e84050c0307bb5d5cea8eacc6b63b34248a41a01)	2019-03-04 15:23:01 +01:00
Martijn Laarman	9788036857	metric on watcher stats is a list not an enum (#39114 ) (#39645 ) `enum` is a single option from a known list of `options` `list` is an array of unknown values `flags` are multiple options from a list of known `options`. We don't support the `flags` type but a `list` with `options` acts as one. This is already the case for other API's taking metric such as `node.stats.json`. watcher.stats behaves the same as other API's as `metrics` and as such accepts the following `GET _xpack/watcher/stats/queued_watches,current_watches` (cherry picked from commit 4c00a025b8ac9b397b27c4ae2f799553d6499412)	2019-03-04 15:22:44 +01:00
Martijn Laarman	7c69fd9e44	parts documented as optional are actually required (#39122 ) (#39641 ) (cherry picked from commit e0f728b44ad49e28477767b3ee783a07ddf4bb0d)	2019-03-04 15:22:26 +01:00
David Kyle	a58145f9e6	[ML] Transition to typeless (mapping) APIs (#39573 ) ML has historically used doc as the single mapping type but reindex in 7.x will change the mapping to _doc. Switching to the typeless APIs handles case where the mapping type is either doc or _doc. This change removes deprecated typed usages.	2019-03-04 13:52:05 +00:00
Ioannis Kakavas	2ce9457c8f	Mute Bulk indexing of monitoring data (#39448 ) Relates: #30101	2019-02-28 07:40:36 +02:00
Tim Vernum	30687cbe7f	Switch internal security index to ".security-7" (#39422 ) This changes the name of the internal security index to ".security-7", but supports indices that were upgraded from earlier versions and use the ".security-6" name. In all cases, both ".security-6" and ".security-7" are considered to be restricted index names regardless of which name is actually in use on the cluster. Backport of: #39337	2019-02-27 12:49:44 +11:00
Yogesh Gaikwad	0c7310936b	Fixed required fields and paths list (#39358 ) (#39428 ) Some small fixes for the `x-pack` rest api spec. * In both `security.enable_user.json` and `security.disable_user.json` the `username` parameter was `false` instead of `true` (the documentation is already correct). * In `security.get_privileges.json` all the possible paths were missing since the path parameters are not required. This fix aligns the document with the rest of the spec, where all the possible combinations are listed.	2019-02-27 12:40:15 +11:00
Benjamin Trent	926291aac8	[DATA-FRAME] Sort `GET` transforms and stats by ID (#39365 ) (#39369 ) * [Data-Frame] Sort `GET` transforms and stats by ID * removing unused import	2019-02-25 14:22:41 -06:00
Benjamin Trent	3d49523726	[DATA-FRAME] adds specs and yml tests for existing endpoints (#39326 ) (#39363 ) * [DATA-FRAME] adds specs and yml tests for existing endpoints * removing bad URL, adding test for _all	2019-02-25 11:19:49 -06:00
Benjamin Trent	109b6451fd	ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs (#38822 ) (#39119 ) * ML refactor DatafeedsConfig(Update) so defaults are not populated in queries or aggs * Addressing pr feedback	2019-02-19 12:45:56 -06:00
David Roberts	bbcdea43c5	[ML] Allow stop unassigned datafeed and relax unset upgrade mode wait (#39034 ) These two changes are interlinked. Before this change unsetting ML upgrade mode would wait for all datafeeds to be assigned and not waiting for their corresponding jobs to initialise. However, this could be inappropriate, if there was a reason other than upgrade mode why one job was unable to be assigned or slow to start up. Unsetting of upgrade mode would hang in this case. This change relaxes the condition for considering upgrade mode to be unset to simply that an assignment attempt has been made for each ML persistent task that did not fail because upgrade mode was enabled. Thus after unsetting upgrade mode there is no guarantee that every ML persistent task is assigned, just that each is not unassigned due to upgrade mode. In order to make setting upgrade mode work immediately after unsetting upgrade mode it was then also necessary to make it possible to stop a datafeed that was not assigned. There was no particularly good reason why this was not allowed in the past. It is trivial to stop an unassigned datafeed because it just involves removing the persistent task.	2019-02-19 14:07:10 +00:00
Martijn Laarman	9b4d96534b	Fix #38623 remove xpack namespace REST API (#38625 ) (#39036 ) * Fix #38623 remove xpack namespace REST API Except for xpack.usage and xpack.info API's, this moves the last remaining API's out of the xpack namespace * rename xpack api's inside the files as well * updated yaml tests references to xpack namespaces api's * update callsApi calls in the IT subclasses * make sure docs testing does not use xpack namespaced api's * fix leftover xpack namespaced method names in docs/build.gradle * found another leftover reference (cherry picked from commit ccb5d934363c37506b76119ac050a254fa80b5e7)	2019-02-18 12:40:07 +01:00
Albert Zaharovits	6243a9797f	_cat/indices with Security, hide names when wildcard (#38824 ) This changes the output of the `_cat/indices` API with `Security` enabled. It is possible to only display the index name (and possibly the index health, depending on the request options) but not its stats (doc count, merges, size, etc). This is the case for closed indices which have index metadata in the cluster state but no associated shards, hence no shard stats. However, when `Security` is enabled, and the request contains wildcards, open indices without stats are a common occurrence. This is because the index names in the response table are picked up directly from the cluster state which is not filtered by `Security`'s _indexNameExpressionResolver_, unlike the stats data which is populated by the indices stats API which does go through the index name resolver. This is a bug, because it is circumventing `Security`'s function to hide unauthorized indices. This has been fixed by displaying the index names as they are resolved by the indices stats API. The outputs of these two APIs are now very similar: same index names, similar data but different format. Closes #37190	2019-02-14 15:09:17 +02:00
Benjamin Trent	24a8ea06f5	ML: update set_upgrade_mode, add logging (#38372 ) (#38538 ) * ML: update set_upgrade_mode, add logging * Attempt to fix datafeed isolation Also renamed a few methods/variables for clarity and added some comments	2019-02-08 12:56:04 -06:00
Boaz Leskes	033ba725af	Remove support for internal versioning for concurrency control (#38254 ) Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a. optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometimes do the wrong thing. Here's the relevant excerpt from the resiliency status page: > When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document. We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach. Relates to #1078	2019-02-05 20:53:35 +01:00
Julie Tibshirani	3ce7d2c9b6	Make sure to reject mappings with type _doc when include_type_name is false. (#38270 ) `CreateIndexRequest#source(Map<String, Object>, ... )`, which is used when deserializing index creation requests, accidentally accepts mappings that are nested twice under the type key (as described in the bug report #38266). This in turn causes us to be too lenient in parsing typeless mappings. In particular, we accept the following index creation request, even though it should not contain the type key `_doc`: ``` PUT index?include_type_name=false { "mappings": { "_doc": { "properties": { ... } } } } ``` There is a similar issue for both 'put templates' and 'put mappings' requests as well. This PR makes the minimal changes to detect and reject these typed mappings in requests. It does not address #38266 generally, or attempt a larger refactor around types in these server-side requests, as I think this should be done at a later time.	2019-02-05 10:52:32 -08:00
Yogesh Gaikwad	fe36861ada	Add support for API keys to access Elasticsearch (#38291 ) X-Pack security supports built-in authentication service `token-service` that allows access tokens to be used to access Elasticsearch without using Basic authentication. The tokens are generated by `token-service` based on OAuth2 spec. The access token is a short-lived token (defaults to 20m) and refresh token with a lifetime of 24 hours, making them unsuitable for long-lived or recurring tasks where the system might go offline thereby failing refresh of tokens. This commit introduces a built-in authentication service `api-key-service` that adds support for long-lived tokens aka API keys to access Elasticsearch. The `api-key-service` is consulted after `token-service` in the authentication chain. By default, if TLS is enabled then `api-key-service` is also enabled. The service can be disabled using the configuration setting. The API keys:- - by default do not have an expiration but expiration can be configured where the API keys need to be expired after a certain amount of time. - when generated will keep authentication information of the user that generated them. - can be defined with a role describing the privileges for accessing Elasticsearch and will be limited by the role of the user that generated them - can be invalidated via invalidation API - information can be retrieved via a get API - that have been expired or invalidated will be retained for 1 week before being deleted. The expired API keys remover task handles this. Following are the API key management APIs:- 1. Create API Key - `PUT/POST /_security/api_key` 2. Get API key(s) - `GET /_security/api_key` 3. Invalidate API Key(s) `DELETE /_security/api_key` The API keys can be used to access Elasticsearch using `Authorization` header, where the auth scheme is `ApiKey` and the credentials, is the base64 encoding of API key Id and API key separated by a colon. 
Example:- ``` curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health ``` Closes #34383	2019-02-05 14:21:57 +11:00
Boaz Leskes	f6e06a2b19	Adapt minimum versions for seq# powered operations in Watch related requests and UpdateRequest (#38231 ) After backporting #37977, #37857 and #37872	2019-02-01 20:37:16 -05:00
Julie Tibshirani	c2e9d13ebd	Default include_type_name to false in the yml test harness. (#38058 ) This PR removes the temporary change we made to the yml test harness in #37285 to automatically set `include_type_name` to `true` in index creation requests if it's not already specified. This is possible now that the vast majority of index creation requests were updated to be typeless in #37611. A few additional tests also needed updating here. Additionally, this PR updates the test harness to set `include_type_name` to `false` in index creation requests when communicating with 6.x nodes. This mirrors the logic added in #37611 to allow for typeless document write requests in test set-up code. With this update in place, we can remove many references to `include_type_name: false` from the yml tests.	2019-02-01 11:44:13 -08:00
Boaz Leskes	b11732104f	Move watcher to use seq# and primary term for concurrency control (#37977 ) * move watcher to seq# occ * top level set * fix parsing and missing setters * share toXContent for PutResponse and rest end point * fix redacted password * fix username reference * fix deactivate-watch.asciidoc have seq no references * add seq# + term to activate-watch.asciidoc * more doc fixes	2019-01-30 20:14:59 -05:00
David Roberts	be788160ef	[ML] Datafeed deprecation checks (#38026 ) Deprecation checks for the ML datafeed query and aggregations.	2019-01-30 20:12:20 +00:00
Benjamin Trent	8280a20664	ML: Add upgrade mode docs, hlrc, and fix bug (#37942 ) * ML: Add upgrade mode docs, hlrc, and fix bug * [DOCS] Fixes build error and edits text * adjusting docs * Update docs/reference/ml/apis/set-upgrade-mode.asciidoc Co-Authored-By: benwtrent <ben.w.trent@gmail.com> * Update set-upgrade-mode.asciidoc * Update set-upgrade-mode.asciidoc	2019-01-30 06:51:11 -06:00
Adrien Grand	c8af0f4bfa	Use mappings to format doc-value fields by default. (#30831 ) Doc-value fields now return a value that is based on the mappings rather than the script implementation by default. This deprecates the special `use_field_mapping` docvalue format which was added in #29639 only to ease the transition to 7.x and it is not necessary anymore in 7.0.	2019-01-30 10:31:51 +01:00
Tim Brooks	00ace369af	Use `CcrRepository` to init follower index (#35719 ) This commit modifies the put follow index action to use a CcrRepository when creating a follower index. It routes the logic through the snapshot/restore process. A wait_for_active_shards parameter can be used to configure how long to wait before returning the response.	2019-01-29 11:47:29 -07:00
Jake Landis	99b75a9bdf	deprecate types for watcher (#37594 ) This commit adds deprecation warnings for index actions and search actions when executed via watcher. Unit and integration tests updated accordingly. relates #35190	2019-01-28 13:46:43 -06:00
Benjamin Trent	7e4c0e6991	ML: Adds set_upgrade_mode API endpoint (#37837 ) * ML: Add MlMetadata.upgrade_mode and API * Adding tests * Adding wait conditionals for the upgrade_mode call to return * Adding tests * adjusting format and tests * Adjusting wait conditions for api return and msgs * adjusting doc tests * adding upgrade mode tests to black list	2019-01-28 09:07:30 -06:00
David Kyle	c0409fb9f0	[ML] Marginal gains in slow multi node QA tests (#37825 ) Move 2 tests that are simple REST tests out of the QA suite and cut the number of post data calls in ForecastIT	2019-01-28 10:00:59 +00:00
Benjamin Trent	9e932f4869	ML: removing unnecessary upgrade code (#37879 )	2019-01-25 13:57:41 -06:00
Benjamin Trent	5384162a42	ML: creating ML State write alias and pointing writes there (#37483 ) * ML: creating ML State write alias and pointing writes there * Moving alias check to openJob method * adjusting concrete index lookup for ml-state	2019-01-18 14:32:34 -06:00
Martijn van Groningen	6846666b6b	Add ccr follow info api (#37408 ) * Add ccr follow info api This api returns all follower indices and per follower index the provided parameters at put follow / resume follow time and whether index following is paused or active. Closes #37127 * iter * [DOCS] Edits the get follower info API * [DOCS] Fixes link to remote cluster * [DOCS] Clarifies descriptions for configured parameters	2019-01-18 16:37:21 +01:00
Julie Tibshirani	36a3b84fc9	Update the default for include_type_name to false. (#37285 ) * Default include_type_name to false for get and put mappings. * Default include_type_name to false for get field mappings. * Add a constant for the default include_type_name value. * Default include_type_name to false for get and put index templates. * Default include_type_name to false for create index. * Update create index calls in REST documentation to use include_type_name=true. * Some minor clean-ups around the get index API. * In REST tests, use include_type_name=true by default for index creation. * Make sure to use 'expression == false'. * Clarify the different IndexTemplateMetaData toXContent methods. * Fix FullClusterRestartIT#testSnapshotRestore. * Fix the ml_anomalies_default_mappings test. * Fix GetFieldMappingsResponseTests and GetIndexTemplateResponseTests. We make sure to specify include_type_name=true during xContent parsing, so we continue to test the legacy typed responses. XContent generation for the typeless responses is currently only covered by REST tests, but we will be adding unit test coverage for these as we implement each typeless API in the Java HLRC. This commit also refactors GetMappingsResponse to follow the same approach as the other mappings-related responses, where we read include_type_name out of the xContent params, instead of creating a second toXContent method. This gives better consistency in the response parsing code. * Fix more REST tests. * Improve some wording in the create index documentation. * Add a note about types removal in the create index docs. * Fix SmokeTestMonitoringWithSecurityIT#testHTTPExporterWithSSL. * Make sure to mention include_type_name in the REST docs for affected APIs. * Make sure to use 'expression == false' in FullClusterRestartIT. * Mention include_type_name in the REST templates docs.	2019-01-14 13:08:01 -08:00
Martijn van Groningen	1a41d84536	[CCR] Resume follow Api should not require a request body (#37217 ) Closes #37022	2019-01-10 09:48:26 +01:00
Benjamin Trent	df3b58cb04	ML: add migrate anomalies assistant (#36643 ) * ML: add migrate anomalies assistant * adjusting failure handling for reindex * Fixing request and tests * Adding tests to blacklist * adjusting test * test fix: posting data directly to the job instead of relying on datafeed * adjusting API usage * adding Todos and adjusting endpoint * Adding types to reindexRequest * removing unreliable "live" data test * adding index refresh to test * adding index refresh to test * adding index refresh to yaml test * fixing bad exists call * removing todo * Addressing remove comments * Adjusting rest endpoint name * making service have its own logger * adjusting validity check for newindex names * fixing typos * fixing renaming	2019-01-09 14:25:35 -06:00
Jim Ferenczi	e38cf1d0dc	Add the ability to set the number of hits to track accurately (#36357 ) In Lucene 8 searches can skip non-competitive hits if the total hit count is not requested. It is also possible to track the number of hits up to a certain threshold. This is a trade off to speed up searches while still being able to know a lower bound of the total hit count. This change adds the ability to set this threshold directly in the track_total_hits search option. A boolean value (true, false) indicates whether the total hit count should be tracked in the response. When set as an integer this option allows to compute a lower bound of the total hits while preserving the ability to skip non-competitive hits when enough matches have been collected. Relates #33028	2019-01-04 20:36:49 +01:00
Dimitris Athanasiou	586453fef1	[ML] Remove types from datafeed (#36538 ) Closes #34265	2019-01-04 09:43:44 +02:00
Dimitris Athanasiou	b04b3173db	[ML][TEST] Clean up max_model_memory_limit cluster setting (#37101 ) Removes the `xpack.ml.max_model_memory_limit` cluster setting at the teardown of the `ml_info.yml` tests to ensure the setting does not trip other tests.	2019-01-03 15:15:31 +02:00
David Kyle	42bb2bae21	[ML] Order GET job stats response by job id (#36841 )	2019-01-02 16:52:20 +00:00
Ioannis Kakavas	0cae979dfe	Remove bwc logic for token invalidation (#36893 ) - Removes bwc invalidation logic from the TokenService - Removes bwc serialization for InvalidateTokenResponse objects as old nodes in supported mixed clusters during upgrade will be 6.7 and thus will know of the new format - Removes the created field from the TokensInvalidationResult and the InvalidateTokenResponse as it is no longer useful in > 7.0	2018-12-28 13:09:42 +02:00
David Roberts	0f2f00a20a	[ML] Resolve 7.0.0 TODOs in ML code (#36842 ) This change cleans up a number of ugly BWC workarounds in the ML code. 7.0 cannot run in a mixed version cluster with versions prior to 6.7, so code that deals with these old versions is no longer required. Closes #29963	2018-12-20 12:49:57 +00:00
David Kyle	e294056bbf	[ML] Merge the Jindex master feature branch (#36702 ) * [ML] Job and datafeed mappings with index template (#32719) Index mappings for the configuration documents * [ML] Job config document CRUD operations (#32738) * [ML] Datafeed config CRUD operations (#32854) * [ML] Change JobManager to work with Job config in index (#33064) * [ML] Change Datafeed actions to read config from the config index (#33273) * [ML] Allocate jobs based on JobParams rather than cluster state config (#33994) * [ML] Return missing job error when .ml-config is does not exist (#34177) * [ML] Close job in index (#34217) * [ML] Adjust finalize job action to work with documents (#34226) * [ML] Job in index: Datafeed node selector (#34218) * [ML] Job in Index: Stop and preview datafeed (#34605) * [ML] Delete job document (#34595) * [ML] Convert job data remover to work with index configs (#34532) * [ML] Job in index: Get datafeed and job stats from index (#34645) * [ML] Job in Index: Convert get calendar events to index docs (#34710) * [ML] Job in index: delete filter action (#34642) This changes the delete filter action to search for jobs using the filter to be deleted in the index rather than the cluster state. * [ML] Job in Index: Enable integ tests (#34851) Enables the ml integration tests excluding the rolling upgrade tests and a lot of fixes to make the tests pass again. * [ML] Reimplement established model memory (#35500) This is the 7.0 implementation of a master node service to keep track of the native process memory requirement of each ML job with an associated native process. The new ML memory tracker service works when the whole cluster is upgraded to at least version 6.6. For mixed version clusters the old mechanism of established model memory stored on the job in cluster state was used. This means that the old (and complex) code to keep established model memory up to date on the job object has been removed in 7.0. 
Forward port of #35263 * [ML] Need to wait for shards to replicate in distributed test (#35541) Because the cluster was expanded from 1 node to 3 indices would initially start off with 0 replicas. If the original node was killed before auto-expansion to 1 replica was complete then the test would fail because the indices would be unavailable. * [ML] DelayedDataCheckConfig index mappings (#35646) * [ML] JIndex: Restore finalize job action (#35939) * [ML] Replace Version.CURRENT in streaming functions (#36118) * [ML] Use 'anomaly-detector' in job config doc name (#36254) * [ML] Job In Index: Migrate config from the clusterstate (#35834) Migrate ML configuration from clusterstate to index for closed jobs only once all nodes are v6.6.0 or higher * [ML] Check groups against job Ids on update (#36317) * [ML] Adapt to periodic persistent task refresh (#36633) * [ML] Adapt to periodic persistent task refresh If https://github.com/elastic/elasticsearch/pull/36069/files is merged then the approach for reallocating ML persistent tasks after refreshing job memory requirements can be simplified. This change begins the simplification process. * Remove AwaitsFix and implement TODO * [ML] Default search size for configs * Fix TooManyJobsIT.testMultipleNodes Two problems: 1. Stack overflow during async iteration when lots of jobs on same machine 2. Not effectively setting search size in all cases * Use execute() instead of submit() in MlMemoryTracker We don't need a Future to wait for completion * [ML][TEST] Fix NPE in JobManagerTests * [ML] JIindex: Limit the size of bulk migrations (#36481) * [ML] Prevent updates and upgrade tests (#36649) * [FEATURE][ML] Add cluster setting that enables/disables config migration (#36700) This commit adds a cluster settings called `xpack.ml.enable_config_migration`. The setting is `true` by default. When set to `false`, no config migration will be attempted and non-migrated resources (e.g. jobs, datafeeds) will be able to be updated normally. 
Relates #32905 * [ML] Snapshot ml configs before migrating (#36645) * [FEATURE][ML] Split in batches and migrate all jobs and datafeeds (#36716) Relates #32905 * SQL: Fix translation of LIKE/RLIKE keywords (#36672) * SQL: Fix translation of LIKE/RLIKE keywords Refactor Like/RLike functions to simplify internals and improve query translation when chained or within a script context. Fix #36039 Fix #36584 * Fixing line length for EnvironmentTests and RecoveryTests (#36657) Relates #34884 * Add back one line removed by mistake regarding java version check and COMPAT jvm parameter existence * Do not resolve addresses in remote connection info (#36671) The remote connection info API leads to resolving addresses of seed nodes when invoked. This is problematic because if a hostname fails to resolve, we would not display any remote connection info. Yet, a hostname not resolving can happen across remote clusters, especially in the modern world of cloud services with dynamically changing IPs. Instead, the remote connection info API should be providing the configured seed nodes. This commit changes the remote connection info to display the configured seed nodes, avoiding a hostname resolution. Note that care was taken to preserve backwards compatibility with previous versions that expect the remote connection info to serialize a transport address instead of a string representing the hostname. * [Painless] Add boxed type to boxed type casts for method/return (#36571) This adds implicit boxed type to boxed types casts for non-def types to create asymmetric casting relative to the def type when calling methods or returning values. This means that a user calling a method taking an Integer can call it with a Byte, Short, etc. legally which matches the way def works. This creates consistency in the casting model that did not previously exist. 
* SNAPSHOTS: Adjust BwC Versions in Restore Logic (#36718) * Re-enables bwc tests with adjusted version conditions now that #36397 enables concurrent snapshots in 6.6+ * ingest: fix on_failure with Drop processor (#36686) This commit allows a document to be dropped when a Drop processor is used in the on_failure fork of the processor chain. Fixes #36151 * Initialize startup `CcrRepositories` (#36730) Currently, the CcrRepositoryManger only listens for settings updates and installs new repositories. It does not install the repositories that are in the initial settings. This commit, modifies the manager to install the initial repositories. Additionally, it modifies the ccr integration test to configure the remote leader node at startup, instead of using a settings update. * [TEST] fix float comparison in RandomObjects#getExpectedParsedValue This commit fixes a test bug introduced with #36597. This caused some test failure as stored field values comparisons would not work when CBOR xcontent type was used. Closes #29080 * [Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320) This commit exposes lucene's LatLonShape field as the default type in GeoShapeFieldMapper. To use the new indexing approach, simply set "type" : "geo_shape" in the mappings without setting any of the strategy, precision, tree_levels, or distance_error_pct parameters. Note the following when using the new indexing approach: * geo_shape query does not support querying by MULTIPOINT. * LINESTRING and MULTILINESTRING queries do not yet support WITHIN relation. * CONTAINS relation is not yet supported. The tree, precision, tree_levels, distance_error_pct, and points_only parameters are deprecated. * TESTS:Debug Log. IndexStatsIT#testFilterCacheStats * ingest: support default pipelines + bulk upserts (#36618) This commit adds support to enable bulk upserts to use an index's default pipeline. 
Bulk upsert, doc_as_upsert, and script_as_upsert are all supported. However, bulk script_as_upsert has slightly surprising behavior since the pipeline is executed _before_ the script is evaluated. This means that the pipeline only has access to the data found in the upsert field of the script_as_upsert. The non-bulk script_as_upsert (existing behavior) runs the pipeline _after_ the script is executed. This commit does _not_ attempt to consolidate the bulk and non-bulk behavior for script_as_upsert. This commit also adds additional testing for the non-bulk behavior, which remains unchanged with this commit. fixes #36219 * Fix duplicate phrase in shrink/split error message (#36734) This commit removes a duplicate "must be a" from the shrink/split error messages. * Deprecate types in get_source and exist_source (#36426) This change adds a new untyped endpoint `{index}/_source/{id}` for both the GET and the HEAD methods to get the source of a document or check for its existence. It also adds deprecation warnings to RestGetSourceAction that emit a warning when the old deprecated "type" parameter is still used. Also updating documentation and tests where appropriate. Relates to #35190 * Revert "[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)" This reverts commit `5bc7822562`. * Enhance Invalidate Token API (#35388) This change: - Adds functionality to invalidate all (refresh+access) tokens for all users of a realm - Adds functionality to invalidate all (refresh+access)tokens for a user in all realms - Adds functionality to invalidate all (refresh+access) tokens for a user in a specific realm - Changes the response format for the invalidate token API to contain information about the number of the invalidated tokens and possible errors that were encountered. 
- Updates the API Documentation After back-porting to 6.x, the `created` field will be removed from master as a field in the response Resolves: #35115 Relates: #34556 * Add raw sort values to SearchSortValues transport serialization (#36617) In order for CCS alternate execution mode (see #32125) to be able to do the final reduction step on the CCS coordinating node, we need to serialize additional info in the transport layer as part of each `SearchHit`. Sort values are already present but they are formatted according to the provided `DocValueFormat` provided. The CCS node needs to be able to reconstruct the lucene `FieldDoc` to include in the `TopFieldDocs` and `CollapseTopFieldDocs` which will feed the `mergeTopDocs` method used to reduce multiple search responses (one per cluster) into one. This commit adds such information to the `SearchSortValues` and exposes it through a new getter method added to `SearchHit` for retrieval. This info is only serialized at transport and never printed out at REST. * Watcher: Ensure all internal search requests count hits (#36697) In previous commits only the stored toXContent version of a search request was using the old format. However an executed search request was already disabling hit counts. In 7.0 hit counts will stay enabled by default to allow for proper migration. Closes #36177 * [TEST] Ensure shard follow tasks have really stopped. Relates to #36696 * Ensure MapperService#getAllMetaFields elements order is deterministic (#36739) MapperService#getAllMetaFields returns an array, which is created out of an `ObjectHashSet`. Such set does not guarantee deterministic hash ordering. The array returned by its toArray may be sorted differently at each run. This caused some repeatability issues in our tests (see #29080) as we pick random fields from the array of possible metadata fields, but that won't be repeatable if the input array is sorted differently at every run. 
Once setting the tests seed, hppc picks that up and the sorting is deterministic, but failures don't repeat with the seed that gets printed out originally (as a seed was not originally set). See also https://issues.carrot2.org/projects/HPPC/issues/HPPC-173. With this commit, we simply create a static sorted array that is used for `getAllMetaFields`. The change is in production code but really affects only testing as the only production usage of this method was to iterate through all values when parsing fields in the high-level REST client code. Anyways, this seems like a good change as returning an array would imply that it's deterministically sorted. * Expose Sequence Number based Optimistic Concurrency Control in the rest layer (#36721) Relates #36148 Relates #10708 * [ML] Mute MlDistributedFailureIT	2018-12-18 17:45:31 +00:00
Alexander Reelsen	00521a5b36	Watcher: Ensure all internal search requests count hits (#36697 ) In previous commits only the stored toXContent version of a search request was using the old format. However an executed search request was already disabling hit counts. In 7.0 hit counts will stay enabled by default to allow for proper migration. Closes #36177	2018-12-18 10:11:39 +01:00
Ioannis Kakavas	7b9ca62174	Enhance Invalidate Token API (#35388 ) This change: - Adds functionality to invalidate all (refresh+access) tokens for all users of a realm - Adds functionality to invalidate all (refresh+access)tokens for a user in all realms - Adds functionality to invalidate all (refresh+access) tokens for a user in a specific realm - Changes the response format for the invalidate token API to contain information about the number of the invalidated tokens and possible errors that were encountered. - Updates the API Documentation After back-porting to 6.x, the `created` field will be removed from master as a field in the response Resolves: #35115 Relates: #34556	2018-12-18 10:05:50 +02:00
Nik Everett	03daad9812	Re-deprecate xpack rollup endpoints (#36451 ) Redeprecates the `/_xpack/rollup` endpoints in favor of `/_rollup`. When we cleanup the rollup in a cluster containing 6.x nodes we need to use `/_xpack/rollup` instead of `/_rollup` because the 6.x nodes don't know about `/_rollup`. In those cases we must ignore the deprecation warnings that the 7.0 node will return for the end point. Closes #36044	2018-12-11 19:43:17 -05:00
Ioannis Kakavas	d7c5d8049a	Deprecate /_xpack/security/* in favor of /_security/* (#36293 ) * This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs. - REST API docs - HLRC docs and doc tests - Handle REST actions with deprecation warnings - Changed endpoints in rest-api-spec and relevant file names	2018-12-11 11:13:10 +02:00
Jason Tedor	0909a631ba	Add non-X-Pack centric rollup endpoints (#36383 ) * Add non-X-Pack centric rollup endpoints This commit adds new endpoints for rollup that do not have _xpack in their path. The purpose for this change is to take these endpoints into 6.x as well so that they can be available in mixed cluster tests too. A follow-up change will deprecate the use of _xpack in the rollup endpoints. And finally, in the future, we would remove the _xpack endpoints. * Remove import * Fix typo	2018-12-10 14:50:30 -05:00
David Turner	ca3f5c1e2e	Cancel GetDiscoveredNodesAction when bootstrapped (#36423 ) Today the `GetDiscoveredNodesAction` waits, possibly indefinitely, to discover enough nodes to bootstrap the cluster. However it is possible that the cluster forms before a node has discovered the expected collection of nodes, in which case the action will wait indefinitely despite the fact that it is no longer required. This commit changes the behaviour so that the action fails once a node receives a cluster state with a nonempty configuration, indicating that the cluster has been successfully bootstrapped and therefore the `GetDiscoveredNodesAction` need wait no longer. Relates #36380 and #36381; reverts `558f4ec278`.	2018-12-10 17:23:03 +00:00
David Roberts	558f4ec278	Ignore zen2 discovery task in waitForPendingTasks (#36381 ) Fixes #36380	2018-12-10 11:07:02 +00:00
Michael Basnight	b5b6e37a60	Deprecate X-Pack centric watcher endpoints (#36218 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs. Relates #35958	2018-12-08 12:57:16 -06:00
David Roberts	9e8cfbb40d	[ML] Deprecate X-Pack centric ML endpoints (#36315 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs. Relates #35958	2018-12-07 20:34:11 +00:00
Nik Everett	ead2b9e08b	HLRC: Add rollup search (#36334 ) Relates to #29827	2018-12-07 14:39:58 -05:00
Benjamin Trent	adc8355c5d	Adding additional tests for agg parsing in datafeedconfig (#36261 ) * Adding additional tests for agg parsing in datafeedconfig * Fixing bug, adding yml test	2018-12-06 11:19:34 -06:00
Jim Ferenczi	18866c4c0b	Make hits.total an object in the search response (#35849 ) This commit changes the format of the `hits.total` in the search response to be an object with a `value` and a `relation`. The `value` indicates the number of hits that match the query and the `relation` indicates whether the number is accurate (in which case the relation equals `eq`) or a lower bound of the total (in which case it equals `gte`). This change also adds a parameter called `rest_total_hits_as_int` that can be used in the search APIs to opt out from this change (retrieve the total hits as a number in the rest response). Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain `hits.total` (`track_total_hits: false`). We'll add a way to get a lower bound of the total hits in a follow up (to allow numbers to be passed to `track_total_hits`). Relates #33028	2018-12-05 19:49:06 +01:00
Tal Levy	fdec331b13	[ILM] fix ilm.remove_policy rest-spec (#36165 ) The rest interface for remove-policy-from-index API does not support `_ilm/remove`, it requires that an `{index}` pattern be defined in the URL path. This fixes the rest-api-spec to reflect the implementation	2018-12-03 10:55:31 -08:00
Jake Landis	f8f521bad4	Deprecate /_xpack/monitoring/* in favor of /_monitoring/* (#36130 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs. * Add deprecation for /_xpack/monitoring/_bulk in favor of /_monitoring/bulk * Removed xpack from the rest-api-spec and tests * Removed xpack from the Action name * Removed MonitoringRestHandler as an unnecessary abstraction * Minor corrections to comments Relates #35958	2018-12-03 10:26:08 -06:00
Tal Levy	78e07d467c	[ILM] rest-spec for remove-policy had wrong link (#36083 )	2018-11-29 14:11:51 -08:00
Zachary Tong	61c2db5ebb	Revert "Deprecate X-Pack centric rollup endpoints (#35962 )" This reverts commit `b84f1f6a3a`.	2018-11-29 12:58:23 -05:00
Jim Ferenczi	8a7f3f75f3	Add support for rest_total_hits_as_int (#36051 ) The support for rest_total_hits_as_int has already been merged to 6x in #35848 so this change adds this new option to master. The plan was to add this new option as part of #35848 but we've decided to wait a few days before merging this breaking change so this commit just handles the new option as a noop exactly like 6x for now. This will allow users to migrate to this parameter before #35848 is merged. Relates #33028	2018-11-29 18:36:16 +01:00
Jason Tedor	9fa9e1419f	Remove X-Pack centric graph endpoints (#36010 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs.	2018-11-29 07:09:37 -05:00
Gordon Brown	c26af3b0a2	Deprecate X-Pack centric Migration endpoints (#35976 ) This commit is part of our plan to deprecate and remove the use of _xpack in the REST API routes.	2018-11-28 13:19:33 -07:00
Jason Tedor	a3186e4a32	Deprecate X-Pack centric license endpoints (#35959 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs.	2018-11-28 08:24:35 -05:00
Jason Tedor	c42d9d91c9	Deprecate X-Pack centric SQL endpoints (#35964 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs.	2018-11-27 22:16:21 -05:00
Jason Tedor	b84f1f6a3a	Deprecate X-Pack centric rollup endpoints (#35962 ) This commit is part of our plan to deprecate and ultimately remove the use of _xpack in the REST APIs.	2018-11-27 20:34:17 -05:00
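
As a rough illustration of the `hits.total` format described in commit 18866c4c0b (#35849) and the threshold option from e38cf1d0dc (#36357), the sketch below reads the total from an already-parsed search response body. The function name `read_total_hits` and the sample dictionaries are hypothetical. Only the `value` and `relation` fields and the `eq`/`gte` values come from the commit messages. Treating the plain integer form (as returned with `rest_total_hits_as_int`) as an exact count is an assumption.

```python
from typing import Any, Dict, Tuple


def read_total_hits(response: Dict[str, Any]) -> Tuple[int, bool]:
    """Return (count, is_exact) from a parsed search response body.

    Hypothetical helper: handles both the object form of hits.total
    ({"value": ..., "relation": "eq" | "gte"}) and the plain integer form.
    """
    total = response.get("hits", {}).get("total")
    if total is None:
        # The commit message notes hits.total is absent when totals are not tracked.
        raise KeyError("hits.total is absent from the response")
    if isinstance(total, dict):
        # "eq" means the count is accurate; "gte" means it is a lower bound.
        return int(total["value"]), total["relation"] == "eq"
    # Assumption: the integer form is an exact count.
    return int(total), True


if __name__ == "__main__":
    lower_bound = {"hits": {"total": {"value": 10000, "relation": "gte"}}}
    legacy = {"hits": {"total": 42}}
    print(read_total_hits(lower_bound))
    print(read_total_hits(legacy))
```

A caller that needs an exact count can check the second element of the returned tuple and fall back to a separate count request when it is `False`.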

733 Commits