OpenSearch

Commit Graph

Author	SHA1	Message	Date
Lee Hinman	b85468d6ea	Add node setting for disabling SLM (#46794 ) (#46796 ) This adds the `xpack.slm.enabled` setting to allow disabling of SLM functionality as well as its HTTP API endpoints. Relates to #38461	2019-09-17 17:39:41 -06:00
Oliver Gupte	cbd58d3b78	Give kibana user privileges to create APM agent config index (#46765 ) (#46792 ) * Give kibana user reserved role privileges on .apm-* to create APM agent configuration index. * fixed test to include checking all .apm-* permissions * changed pattern from ".apm-*" to the more specific ".apm-agent-configuration"	2019-09-17 15:01:42 -07:00
Armin Braun	b0f09b279f	Make Snapshot Logic Write Metadata after Segments (#45689 ) (#46764 ) * Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581 * Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons * Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization * Fixes #41581	2019-09-17 13:09:39 +02:00
Przemysław Witek	e49be611ad	[7.x] Add audit messages for Data Frame Analytics (#46521 ) (#46738 )	2019-09-16 21:21:38 +02:00
Hendrik Muhs	c8f52ec4ff	[Transform] Rename data frame plugin to transform: classes in xpack.core (#46644 ) (#46734 ) rename classes in xpack.core of transform plugin from "data frame transform" to "transform"	2019-09-16 13:39:22 +02:00
Andrei Dan	c57cca98b2	[ILM] Add date setting to calculate index age (#46561 ) (#46697 ) * [ILM] Add date setting to calculate index age Add the `index.lifecycle.origination_date` setting to allow users to configure a custom date that'll be used to calculate the index age for the phase transitions (as opposed to the default index creation date). This could be useful for users to create an index with an "older" origination date when indexing old data. Relates to #42449. * [ILM] Don't override creation date on policy init The initial approach we took was to override the lifecycle creation date if the `index.lifecycle.origination_date` setting was set. This had the disadvantage of the user not being able to update the `origination_date` anymore once set. This commit changes the way we make use of the `index.lifecycle.origination_date` setting by checking its value when we calculate the index age (i.e. at "read time") and, in case it's not set, defaulting to the index creation date. * Make origination date setting index scope dynamic * Document origination date setting in ilm settings (cherry picked from commit d5bd2bb77ee28c1978ab6679f941d7c02e389d32) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2019-09-16 08:50:28 +01:00
Luca Cavanna	e57756492a	Update http-core and http-client dependencies (#46549 ) Relates to #45808 Closes #45577	2019-09-12 09:45:29 +02:00
James Rodewig	f9bf10f2b6	[DOCS] Change "a SSL" to "an SSL" in the Java docs (#46524 ) (#46618 )	2019-09-11 15:55:57 -04:00
Lee Hinman	09a9cefaa0	Handle partial failure retrieving segments in SegmentCountStep (#46556 ) Since the `IndicesSegmentsRequest` scatters to all shards for the index, it's possible that some of the shards may fail. This adds failure handling and logging (since this is a best-effort step in the first place) for this case.	2019-09-11 10:29:31 -06:00
Jim Ferenczi	23bf310c84	Replace the SearchContext with QueryShardContext when building aggregator factories (#46527 ) This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them. The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`. Relates #46523	2019-09-11 16:43:30 +02:00
Hendrik Muhs	efea581dcc	[7.x][Transform]Rename data frame plugin to transform: plugin and package names (#46583 ) rename data frame transform plugin to transform: - rename plugin data-frame to transform - change all package names from o.e..dataframe. to o.e..transform. - necessary changes to fix loading/testing	2019-09-11 14:50:08 +02:00
Armin Braun	41633cb9b5	More Efficient Ordering of Shard Upload Execution (#42791 ) (#46588 ) * More Efficient Ordering of Shard Upload Execution (#42791) * Change the upload order of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard * Inspired by #39657 * Cleanup BlobStoreRepository Abort and Failure Handling (#46208)	2019-09-11 13:59:20 +02:00
Jim Ferenczi	425b1a77e8	Add more context to QueryShardContext (#46584 ) This change adds an IndexSearcher and the node's BigArrays in the QueryShardContext. It's a spin off of #46527 as this change is required to allow aggregation builder to solely use the query shard context. Relates #46523	2019-09-11 12:24:51 +02:00
Przemysław Witek	e38e631dac	[7.x] Implement DataFrameAnalyticsAuditMessage and DataFrameAnalyticsAuditor (#45967 ) (#46519 )	2019-09-11 12:17:26 +02:00
Lee Hinman	cdc3a260af	Add retention to Snapshot Lifecycle Management (backport of #4… (#46506 ) * Add retention to Snapshot Lifecycle Management (#46407) This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria. An example policy would look like: ``` PUT /_slm/policy/snapshot-every-day { "schedule": "0 30 2 * * ?", "name": "<production-snap-{now/d}>", "repository": "my-s3-repository", "config": { "indices": ["foo-", "important"] }, // Newly configured retention options "retention": { // Snapshots should be deleted after 14 days "expire_after": "14d", // Keep a maximum of thirty snapshots "max_count": 30, // Keep a minimum of the four most recent snapshots "min_count": 4 } } ``` SLM Retention is run on a schedule configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour. Included in this work is a new SLM stats API endpoint available through ``` json GET /_slm/stats ``` which returns statistics about snapshots taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API. Add base framework for snapshot retention (#43605) * Add base framework for snapshot retention This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask` to start as the basis for SLM's retention implementation. 
Relates to #38461 * Remove extraneous 'public' * Use a local var instead of reading class var repeatedly * Add SnapshotRetentionConfiguration for retention configuration (#43777) * Add SnapshotRetentionConfiguration for retention configuration This commit adds the `SnapshotRetentionConfiguration` class and its HLRC counterpart to encapsulate the configuration for SLM retention. Currently only a single parameter is supported as an example (we still need to discuss the different options we want to support and their names) to keep the size of the PR down. It also does not yet include version serialization checks since the original SLM branch has not yet been merged. Relates to #43663 * Fix REST tests * Fix more documentation * Use Objects.equals to avoid NPE * Put `randomSnapshotLifecyclePolicy` in only one place * Occasionally return retention with no configuration * Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764) * Implement SnapshotRetentionTask's snapshot filtering and deletion This commit implements the snapshot filtering and deletion for `SnapshotRetentionTask`. Currently only the expire-after age is used for determining whether a snapshot is eligible for deletion. Relates to #43663 * Fix deletes running on the wrong thread * Handle missing or null policy in snap metadata differently * Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>> * Use the `OriginSettingClient` to work with security, enhance logging * Prevent NPE in test by mocking Client * Allow empty/missing SLM retention configuration (#45018) Semi-related to #44465, this allows the `"retention"` configuration map to be missing. Relates to #43663 * Add min_count and max_count as SLM retention predicates (#44926) This adds the configuration options for `min_count` and `max_count` as well as the logic for determining whether a snapshot meets this criteria to SLM's retention feature. 
These options are optional and one, two, or all three can be specified in an SLM policy. Relates to #43663 * Time-bound deletion of snapshots in retention delete function (#45065) * Time-bound deletion of snapshots in retention delete function With a cluster that has a large number of snapshots, it's possible that snapshot deletion can take a very long time (especially since deletes currently have to happen in a serial fashion). To prevent snapshot deletion from taking forever in a cluster and blocking other operations, this commit adds a setting to allow configuring a maximum time to spend deleting snapshots during retention. This dynamic setting defaults to 1 hour and is best-effort, meaning that it doesn't hard stop a deletion at an hour mark, but ensures that once the time has passed, all subsequent deletions are deferred until the next retention cycle. Relates to #43663 * Wow snapshots suuuure can take a long time. * Use a LongSupplier instead of actually sleeping * Remove TestLogging annotation * Remove rate limiting * Add SLM metrics gathering and endpoint (#45362) * Add SLM metrics gathering and endpoint This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The stats stored include the number of snapshots taken, failed, deleted, the number of retention runs, as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount of time spent deleting snapshots from SLM retention. 
This commit also adds an endpoint for retrieving all stats (further commits will expose this in the SLM get-policy API) that looks like: ``` GET /_slm/stats { "retention_runs" : 13, "retention_failed" : 0, "retention_timed_out" : 0, "retention_deletion_time" : "1.4s", "retention_deletion_time_millis" : 1404, "policy_metrics" : { "daily-snapshots2" : { "snapshots_taken" : 7, "snapshots_failed" : 0, "snapshots_deleted" : 6, "snapshot_deletion_failures" : 0 }, "daily-snapshots" : { "snapshots_taken" : 12, "snapshots_failed" : 0, "snapshots_deleted" : 12, "snapshot_deletion_failures" : 6 } }, "total_snapshots_taken" : 19, "total_snapshots_failed" : 0, "total_snapshots_deleted" : 18, "total_snapshot_deletion_failures" : 6 } ``` This does not yet include HLRC for this, as this commit is quite large on its own. That will be added in a subsequent commit. Relates to #43663 * Version qualify serialization * Initialize counters outside constructor * Use computeIfAbsent instead of being too verbose * Move part of XContent generation into subclass * Fix REST action for master merge * Unused import * Record history of SLM retention actions (#45513) This commit records the deletion of snapshots by the retention component of SLM into the SLM history index for the purposes of reviewing operations taken by SLM and alerting. * Retry SLM retention after currently running snapshot completes (#45802) * Retry SLM retention after currently running snapshot completes This commit adds a ClusterStateObserver to wait until the currently running snapshot is complete before proceeding with snapshot deletion. SLM retention waits for the maximum allowed deletion time for the snapshot to complete, however, the waiting time is not factored into the limit on actual deletions. 
Relates to #43663 * Increase timeout waiting for snapshot completion * Apply patch From `2374316f0d`.patch * Rename test variables * [TEST] Be less strict for stats checking * Skip SLM retention if ILM is STOPPING or STOPPED (#45869) This adds a check to ensure we take no action during SLM retention if ILM is currently stopped or in the process of stopping. Relates to #43663 * Check all actions preventing snapshot delete during retention (#45992) * Check all actions preventing snapshot delete during retention run Previously we only checked to see if a snapshot was currently running, but it turns out that more things can block snapshot deletion. This changes the check to be a check for: - a snapshot currently running - a deletion already in progress - a repo cleanup in progress - a restore currently running This was found by CI where a third party delete in a test caused SLM retention deletion to throw an exception. Relates to #43663 * Add unit test for okayToDeleteSnapshots * Fix bug where SLM retention task would be scheduled on every node * Enhance test logging * Ignore if snapshot is already deleted * Missing import * Fix SnapshotRetentionServiceTests * Expose SLM policy stats in get SLM policy API (#45989) This also adds support for the SLM stats endpoint to the high level rest client. 
Retrieving a policy now looks like: ```json { "daily-snapshots" : { "version": 1, "modified_date": "2019-04-23T01:30:00.000Z", "modified_date_millis": 1556048137314, "policy" : { "schedule": "0 30 1 * * ?", "name": "<daily-snap-{now/d}>", "repository": "my_repository", "config": { "indices": ["data-", "important"], "ignore_unavailable": false, "include_global_state": false }, "retention": {} }, "stats": { "snapshots_taken": 0, "snapshots_failed": 0, "snapshots_deleted": 0, "snapshot_deletion_failures": 0 }, "next_execution": "2019-04-24T01:30:00.000Z", "next_execution_millis": 1556048160000 } } ``` Relates to #43663 Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356) * Rewrite SnapshotLifecycleIT as as ESIntegTestCase This commit splits `SnapshotLifecycleIT` into two different tests. `SnapshotLifecycleRestIT` which includes the tests that do not require slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an integration test using `MockRepository` to simulate a snapshot being in progress. Relates to #43663 Resolves #46205 * Add error logging when exceptions are thrown * Update serialization versions * Fix type inference * Use non-Cancellable HLRC return value * Fix Client mocking in test * Fix SLMSnapshotBlockingIntegTests for 7.x branch * Update SnapshotRetentionTask for non-multi-repo snapshot retrieval * Add serialization guards for SnapshotLifecyclePolicy	2019-09-10 09:08:09 -06:00
Benjamin Trent	457ff3e2fb	7.x/ml fix instance serialization bwc (#46404 ) * [ML] Fixing instance serialization version for bwc * fixing CppLogMessage	2019-09-05 13:23:26 -05:00
Andrey Ershov	ece9eb4acd	Remove stack trace logging in Security(Transport|Http)ExceptionHandler (#45966 ) As per #45852 comment we no longer need to log stack-traces in SecurityTransportExceptionHandler and SecurityHttpExceptionHandler even if trace logging is enabled. (cherry picked from commit c99224a32d26db985053b7b36e2049036e438f97)	2019-09-04 11:50:35 +03:00
Benjamin Trent	53df54c703	[ML][Transforms] fixing stop on changes check bug (#46162 ) (#46273 ) * [ML][Transforms] fixing stop on changes check bug * Adding new method finishAndCheckState to cover race conditions in early terminations * changing stopping conditions in `onStart` * allow indexer to finish when exiting early	2019-09-03 11:04:18 -05:00
Lee Hinman	3d4b8e01c7	Validate SLM policy ids strictly (#45998 ) (#46145 ) This uses strict validation for SLM policy ids, similar to what we use for index names. Resolves #45997	2019-09-03 09:20:02 -06:00
David Roberts	ab045744ac	[ML-DataFrame] Fix off-by-one error in checkpoint operations_behind (#46235 ) Fixes a problem where operations_behind would be one less than expected per shard in a new index matched by the data frame transform source pattern. For example, if a data frame transform had a source of foo* and a new index foo-new was created with 2 shards and 7 documents indexed in it then operations_behind would be 5 prior to this change. The problem was that an empty index has a global checkpoint number of -1 and the sequence number of the first document that is indexed into an index is 0, not 1. This doesn't matter for indices included in both the last and next checkpoints, as the off-by-one errors cancelled, but for a new index it affected the observed result.	2019-09-03 12:45:02 +01:00
Benjamin Trent	d0c5573a51	[ML] Throw an error when a datafeed needs CCS but it is not enabled for the node (#46044 ) (#46096 ) Though we allow CCS within datafeeds, users could prevent nodes from accessing remote clusters. This can cause mysterious errors that are difficult to troubleshoot. This commit adds a check to verify that `cluster.remote.connect` is enabled on the current node when a datafeed is configured with a remote index pattern.	2019-08-30 09:27:07 -05:00
Dimitris Athanasiou	5921ae53d8	[7.x][ML] Regression dependent variable must be numeric (#46072 ) (#46136 ) * [ML] Regression dependent variable must be numeric This adds a validation that the dependent variable of a regression analysis must be numeric. * Address review comments and fix some problems In addition to addressing the review comments, this commit fixes a few issues I found during testing. In particular: - if there were mappings for required fields but they were not included we were not reporting the error - if explicitly included fields had unsupported types we were not reporting the error Unfortunately, I couldn't get those fixed without refactoring the code in `ExtractedFieldsDetector`.	2019-08-30 09:57:43 +03:00
Zachary Tong	cf8a4171e1	Rename `data-science` plugin to `analytics` (#46133 ) Rename `data-science` plugin to `analytics`. Also removes enabled flag. Backport of #46092	2019-08-29 12:45:39 -04:00
Przemysław Witek	b8a0379057	Refactor auditor-related classes (#45893 ) (#46120 )	2019-08-29 14:21:03 +02:00
Gordon Brown	47bbd9d9a9	[7.x] Fix rollover alias in SLM history index template (#46001 ) This commit adds the `rollover_alias` setting required for ILM to work correctly to the SLM history index template and adds assertions to the SLM integration tests to ensure that it works correctly.	2019-08-28 14:50:22 -07:00
Jake Landis	154d1dd962	Watcher max_iterations with foreach action execution (#45715 ) (#46039 ) Prior to this commit the foreach action execution had a hard coded limit to 100 iterations. This commit allows the max number of iterations to be a configuration ('max_iterations') on the foreach action. The default remains 100.	2019-08-27 16:57:20 -05:00
Armin Braun	fdef293c81	Fix RegressionTests#fromXContent (#46029 ) * The `trainingPercent` must be between `1` and `100`, not `0` and `100` which is causing test failures	2019-08-27 18:24:26 +03:00
Dimitris Athanasiou	873ad3f942	[7.x][ML] Add option to regression to randomize training set (#45969 ) (#46017 ) Adds a parameter `training_percent` to regression. The default value is `100`. When the parameter is set to a value less than `100`, from the rows that can be used for training (i.e. those that have a value for the dependent variable) we randomly choose whether each row is actually used for training. This enables splitting the data into a training set and the rest, usually called testing, validation or holdout set, which allows for validating the model on data that have not been used for training. Technically, the analytics process considers as training the data that have a value for the dependent variable. Thus, when we decide a training row is not going to be used for training, we simply clear the row's dependent variable.	2019-08-27 17:53:11 +03:00
Yogesh Gaikwad	7b6246ec67	Add `manage_own_api_key` cluster privilege (#45897 ) (#46023 ) The existing privilege model for API keys, with privileges like `manage_api_key`, `manage_security` etc., is too permissive, and we want finer-grained control over the cluster privileges for API keys. Previously, a created API key would also need these privileges to get its own information. This commit adds support for the `manage_own_api_key` cluster privilege, which only allows API key cluster actions on API keys owned by the currently authenticated user. It also adds support for retrieval of the API key self-information when authenticating via API key without the need for the additional API key privileges. To support this privilege, we are introducing additional authentication context along with the request context such that it can be used to authorize cluster actions based on the current user authentication. The API key get and invalidate APIs introduce an `owner` flag that can be set to true if the API key request (Get or Invalidate) is for the API keys owned by the currently authenticated user only. In that case, `realm` and `username` cannot be set as they are assumed to be the currently authenticated ones. The changes cover HLRC changes and documentation for the API changes. Closes #40031	2019-08-28 00:44:23 +10:00
Dimitris Athanasiou	dd6c13fdf9	[ML] Add description to DF analytics (#45774 ) (#46019 )	2019-08-27 15:48:59 +03:00
Albert Zaharovits	1ebee5bf9b	PKI realm authentication delegation (#45906 ) This commit introduces PKI realm delegation. This feature supports the PKI authentication feature in Kibana. In essence, this creates a new API endpoint which Kibana must call to authenticate clients that use certificates in their TLS connection to Kibana. The API call passes to Elasticsearch the client's certificate chain. The response contains an access token to be further used to authenticate as the client. The client's certificates are validated by the PKI realms that have been explicitly configured to permit certificates from the proxy (Kibana). The user calling the delegation API must have the delegate_pki privilege. Closes #34396	2019-08-27 14:42:46 +03:00
Zachary Tong	943a016bb2	Add Cumulative Cardinality agg (and Data Science plugin) (#45990 ) This adds a pipeline aggregation that calculates the cumulative cardinality of a field. It does this by iteratively merging in the HLL sketch from consecutive buckets and emitting the cardinality up to that point. This is useful for things like finding the total "new" users that have visited a website (as opposed to "repeat" visitors). This is a Basic+ aggregation and adds a new Data Science plugin to house it and future advanced analytics/data science aggregations.	2019-08-26 16:19:55 -04:00
Andrey Ershov	479ab9b8db	Fix plaintext on TLS port logging (#45852 ) Today, if a non-TLS record is received on a TLS port, a generic exception will be logged with the stack trace. The SSLExceptionHelper.isNotSslRecordException method does not work because it assumes that NonSslRecordException would be the top-level exception. This commit addresses the issue and makes the log more concise. (cherry picked from commit 6b83527bf0c23d4d5b97fab7f290c43432945d4f)	2019-08-26 12:32:35 +02:00
Ioannis Kakavas	2bee27dd54	Allow Transport Actions to indicate authN realm (#45946 ) This commit allows the Transport Actions for the SSO realms to indicate the realm that should be used to authenticate the constructed AuthenticationToken. This is useful in the case that many authentication realms of the same type have been configured and where the caller of the API (Kibana or a custom web app) already knows which realm should be used, so there is no need to iterate over all the realms of the same type. The realm parameter is added in the relevant REST APIs as optional so as not to introduce any breaking change.	2019-08-25 19:36:41 +03:00
Dimitris Athanasiou	be554fe5f0	[7.x][ML] Improve progress reportings for DF analytics (#45856 ) (#45910 ) Previously, the stats API reported a progress percentage for DF analytics tasks that are running and are in the `reindexing` or `analyzing` state. This means that when the task is `stopped` there is no progress reported. Thus, one cannot distinguish a task that never ran from one that completed. In addition, there are blind spots in the progress reporting. In particular, we do not account for when data is loaded into the process. We also do not account for when results are written. This commit addresses the above issues. It changes progress to being a list of objects, each one describing the phase and its progress as a percentage. We currently have 4 phases: reindexing, loading_data, analyzing, writing_results. When the task stops, progress is persisted as a document in the state index. The stats API now reports progress from in-memory if the task is running, or returns the persisted document (if there is one).	2019-08-23 23:04:39 +03:00
markharwood	217e41ab6c	Search - added HLRC support for PinnedQueryBuilder (#45779 ) (#45853 ) Added HLRC support for PinnedQueryBuilder Related #44074	2019-08-23 09:22:17 +01:00
Tim Vernum	f94e4a9151	Set security index refresh interval to 1s (#45888 ) The security indices were being created without specifying the refresh interval, which means they would inherit a value from any templates that exist. However, certain security functionality depends on being able to wait_for refresh, and causes errors (e.g. in Kibana) if that time exceeds 30s. This commit changes the security indices configuration to always be created with a 1s refresh interval. This prevents any templates from inadvertently interfering with the proper functioning of security. It is possible for an administrator to explicitly change the refresh interval after the indices have been created. Backport of: #45434	2019-08-23 12:41:37 +10:00
Tim Vernum	029725fc35	Add SSL/TLS settings for watcher email (#45836 ) This change adds a new SSL context xpack.notification.email.ssl.* that supports the standard SSL configuration settings (truststore, verification_mode, etc). This SSL context is used when configuring outbound SMTP properties for watcher email notifications. Backport of: #45272	2019-08-23 10:13:51 +10:00
Benjamin Trent	8e3c54fff7	[7.x] [ML] Adding data frame analytics stats to _usage API (#45820 ) (#45872 ) * [ML] Adding data frame analytics stats to _usage API (#45820) * [ML] Adding data frame analytics stats to _usage API * making the size of analytics stats 10k * adjusting backport	2019-08-22 15:15:41 -05:00
Benjamin Trent	e50a78cf50	[ML-DataFrame] version data frame transform internal index (#45375 ) (#45837 ) Adds index versioning for the internal data frame transform index. Allows for new indices to be created and referenced; `GET` requests now query over the index pattern and take the latest doc (based on INDEX name).	2019-08-22 11:46:30 -05:00
Przemysław Witek	7512337922	[7.x] Allow the user to specify 'query' in Evaluate Data Frame request (#45775 ) (#45825 )	2019-08-22 11:14:26 +02:00
Gordon Brown	47b1e2b3d0	[7.x] Use rollover for SLM's history indices (#45686 ) Following our own guidelines, SLM should use rollover instead of purely time-based indices to keep shard counts low. This commit implements lazy index creation for SLM's history indices, indexing via an alias, and rollover in the built-in ILM policy.	2019-08-21 13:42:11 -06:00
Przemysław Witek	bf701b83d2	Shorten field names in EstimateMemoryUsageResponse (#45719 ) (#45772 )	2019-08-21 12:45:09 +02:00
Dimitris Athanasiou	d5c3d9b50f	[7.x][ML] Do not skip rows with missing values for regression (#45751 ) (#45754 ) Regression analysis supports missing fields. Moreover, it is expected that the dependent variable has missing values in the part of the data frame that is not used for training. This commit allows declaring that an analysis supports missing values. For such an analysis, rows with missing values are not skipped. Instead, they are written as normal with empty strings used for the missing values. This also contains a fix to the integration test. Closes #45425	2019-08-21 08:15:38 +03:00
Benjamin Trent	ba7b677618	[ML] better handle empty results when evaluating regression (#45745 ) (#45759 ) * [ML] better handle empty results when evaluating regression * adding new failure test to ml_security black list * fixing equality check for regression results	2019-08-20 17:37:04 -05:00
Przemysław Witek	b37ebd1adf	Prepare the codebase for new Auditor subclasses (#45716 ) (#45731 )	2019-08-20 16:03:50 +02:00
Przemysław Witek	80dd0a0948	Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718 ) (#45725 )	2019-08-20 15:49:17 +02:00
Benjamin Trent	88641a08af	[ML][Data frame] fixing failure state transitions and race condition (#45627 ) (#45656 ) * [ML][Data frame] fixing failure state transitions and race condition (#45627) There is a small window for a race condition while we are flagging a task as failed. Here are the steps where the race condition occurs: 1. A failure occurs 2. Before `AsyncTwoPhaseIndexer` calls the `onFailure` handler it does the following: a. `finishAndSetState()` which sets the IndexerState to STARTED b. `doSaveState(...)` which attempts to save the current state of the indexer 3. Another trigger is fired BEFORE `onFailure` can fire, but AFTER `finishAndSetState()` occurs. The trick here is that we will eventually set the indexer to failed, but possibly not before another trigger had the opportunity to fire. This could obviously cause some weird state interactions. To combat this, I have put in some predicates to verify the state before taking actions. This is so if state is indeed marked failed, the "second trigger" stops ASAP. Additionally, I move the task state checks INTO the `start` and `stop` methods, which will now require a `force` parameter. `start`, `stop`, `trigger` and `markAsFailed` are all `synchronized`. This should give us some guarantees that one will not switch states out from underneath another. I also flag the task as `failed` BEFORE we successfully write it to cluster state, this is to allow us to make the task fail more quickly. But, this does add the behavior where the task is "failed" but the cluster state does not indicate as much. Adding the checks in `start` and `stop` will handle this "real state vs cluster state" race condition. This has always been a problem for `_stop` as it is not a master node action and doesn’t always have the latest cluster state. 
closes #45609 Relates to #45562 * [ML][Data Frame] moves failure state transition for MT safety (#45676) * [ML][Data Frame] moves failure state transition for MT safety * removing unused imports	2019-08-20 07:30:17 -05:00
Benjamin Trent	fde5dae387	[ML][Data Frame] adjusting change detection workflow (#45511 ) (#45580 ) * [ML][Data Frame] adjusting change detection workflow * adjusting for PR comment * disallowing null as an argument value	2019-08-14 17:26:24 -05:00
Nick Knize	647a8308c3	[SPATIAL] Backport new ShapeFieldMapper and ShapeQueryBuilder to 7x (#45363 ) * Introduce Spatial Plugin (#44389) Introduce a skeleton Spatial plugin that holds new licensed features coming to Geo/Spatial land! * [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923) Refactor DeprecatedParameters specific to legacy geo_shape out of AbstractGeometryFieldMapper.TypeParser#parse. * [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980) Add a new ShapeFieldMapper to the xpack spatial module for indexing arbitrary cartesian geometries using a new field type called shape. The indexing approach leverages lucene's new XYShape field type which is backed by BKD in the same manner as LatLonShape but without the WGS84 latitude longitude restrictions. The new field mapper builds on and extends the refactoring effort in AbstractGeometryFieldMapper and accepts shapes in either GeoJSON or WKT format (both of which support non geospatial geometries). Tests are provided in the ShapeFieldMapperTest class in the same manner as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests. Documentation for how to use the new field type and what parameters are accepted is included. The QueryBuilder for searching indexed shapes is provided in a separate commit. * [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108) Add a new ShapeQueryBuilder to the xpack spatial module for querying arbitrary Cartesian geometries indexed using the new shape field type. The query builder extends AbstractGeometryQueryBuilder and leverages the ShapeQueryProcessor added in the previous field mapper commit. Tests are provided in ShapeQueryTests in the same manner as GeoShapeQueryTests and docs are updated to explain how the query works.	2019-08-14 16:35:10 -05:00
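
Worked example for commit ab045744ac (operations_behind off-by-one): the arithmetic in that commit message can be reproduced with a small sketch. This is not the Elasticsearch implementation. The function name, the dictionary layout, and the 4 + 3 split of the 7 documents across the 2 shards are assumptions made for illustration. The sketch also assumes the bug amounted to using 0, rather than -1, as the starting checkpoint of a shard that is absent from the last checkpoint; the commit message only states that an empty index has a global checkpoint of -1 and that the first indexed document has sequence number 0.

```python
def operations_behind(last_checkpoint, next_checkpoint, missing):
    """Sum, over every shard in the next checkpoint, how far the last checkpoint lags.

    `missing` is the checkpoint assumed for shards absent from `last_checkpoint`
    (hypothetical parameter, used here only to contrast the two behaviours).
    """
    return sum(
        next_seq - last_checkpoint.get(shard, missing)
        for shard, next_seq in next_checkpoint.items()
    )


# Hypothetical new index foo-new: 2 shards, 7 documents split 4 + 3.
# Sequence numbers start at 0, so the per-shard global checkpoints are 3 and 2.
next_checkpoint = {("foo-new", 0): 3, ("foo-new", 1): 2}
last_checkpoint = {}  # the index did not exist at the last checkpoint

buggy = operations_behind(last_checkpoint, next_checkpoint, missing=0)
fixed = operations_behind(last_checkpoint, next_checkpoint, missing=-1)
print(buggy, fixed)  # 5 7
```

For a shard present in both checkpoints the `missing` value is never consulted, which matches the commit's remark that the error did not affect indices included in both the last and next checkpoints.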


1281 Commits