OpenSearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	5db9982f71	[7.x] [ML][Data Frame] Add update transform api endpoint (#45154 ) (#45279 ) * [ML][Data Frame] Add update transform api endpoint (#45154) This adds the ability to `_update` stored data frame transforms. All mutable fields are applied when the next checkpoint starts. The exception being `description`. This PR contains all that is necessary for this addition: * HLRC * Docs * Server side	2019-08-07 10:37:35 -05:00
Benjamin Trent	3a71b91dca	[ML][Data Frame] add support for geo_bounds aggregation (#44441 ) (#45281 ) This adds support for `geo_bounds` aggregation inside the `pivot.aggregations` configuration. The two points returned from the `geo_bounds` aggregation are transformed into `geo_shape` whose types are dynamic given the point's similarity. * `point` if the two points are identical * `linestring` if the two points share either a latitude or longitude * `polygon` if the two points are completely different The automatically deduced mapping for the resulting field is a `geo_shape`.	2019-08-07 10:37:09 -05:00
Benjamin Trent	be911e6a53	[ML][Data Frames] Fix null aggregation handling in indexer (#45061 ) (#45257 ) * [ML][Data Frames] Fix null aggregation handling in indexer * addressing PR comments * adjusting error messages	2019-08-07 07:01:13 -05:00
Hendrik Muhs	6b5a2513a9	[ML-DataFrame] introduce an abstraction for checkpointing (#44900 ) introduces an abstraction for how checkpointing and synchronization works, covering - retrieval of checkpoints - check for updates - retrieving stats information	2019-08-06 07:38:59 +02:00
Benjamin Trent	7bfaba98c2	[ML][Data Frame] cleaning up and adjusting failure tests (#45101 ) (#45144 )	2019-08-05 09:12:11 -05:00
Benjamin Trent	3f48720d41	[ML][Data Frames] unify validation exceptions between PUT/_preview (#44983 ) (#45012 ) * [ML][Data Frames] unify validation exceptions between PUT/_preview * addressing PR comments	2019-07-30 13:05:07 -05:00
Benjamin Trent	22feedf289	[ML][Data Frame] add support for bucket_selector (#44718 ) (#45008 )	2019-07-30 11:32:58 -05:00
David Roberts	b2e969f4ba	[ML-DataFrame] Remove ID field from data frame indexer stats (#44848 ) This is a followup to #44350. The indexer stats used to be persisted standalone, but now are only persisted as part of a state-and-stats document. During the review of #44350 it was decided that we'll stick with this design, so there will never be a need for an indexer stats object to store its transform ID as it is stored on the enclosing document. This PR removes the indexer stats document ID. Backport of #44768	2019-07-25 15:19:32 +01:00
David Roberts	caf9411a72	[ML] Improve response format of data frame stats endpoint (#44743 ) This change adjusts the data frame transforms stats endpoint to return a structure that is easier to understand. This is a breaking change for clients of the data frame transforms stats endpoint, but the feature is in beta so stability is not guaranteed. Backport of #44350	2019-07-23 18:00:50 +01:00
Benjamin Trent	6f53865fde	[ML][Data Frame] Fixes failure state tests and failure setting handling (#44645 ) (#44698 ) * [ML][Data Frame] fixing flaky test * adjusting frequency * fixing tests * addressing PR comments	2019-07-23 08:33:12 -05:00
Benjamin Trent	4456850a8e	[7.x] [ML][Data Frame] Add optional defer_validation param to PUT (#44455 ) (#44697 ) * [ML][Data Frame] Add optional defer_validation param to PUT (#44455) * [ML][Data Frame] Add optional defer_validation param to PUT * addressing PR comments * reverting bad replace * addressing pr comments * Update put-transform.asciidoc * Update put-transform.asciidoc * Update put-transform.asciidoc * adjusting for backport * fixing imports * [DOCS] Fixes formatting in create data frame transform API	2019-07-22 15:12:55 -05:00
Benjamin Trent	06e21f7902	[7.x] [ML][Data Frame] adding force delete (#44590 ) (#44696 ) * [ML][Data Frame] adding force delete (#44590) * [ML][Data Frame] adding force delete * Update delete-transform.asciidoc * adjusting for backport	2019-07-22 13:13:25 -05:00
Benjamin Trent	a948362d0a	[7.x] [ML][Data Frame] deregister scheduler on transform failure (#44569 ) (#44576 ) * [ML][Data Frame] deregister scheduler on transform failure (#44569) * fixing test * Update DataFrameRestTestCase.java * Update DataFrameTaskFailedStateIT.java * Update DataFramePivotRestIT.java	2019-07-22 09:06:48 -05:00
David Roberts	6d27eec30f	[ML-DataFrame] Use lenient expand open in data frame searches (#44633 ) Since #44344 we use IndicesOptions.LENIENT_EXPAND_OPEN when deciding which indices to include in checkpoint calculation. This change uses the same option when deciding which indices to search for data and which indices to get mappings from, otherwise there is a potential mismatch between the checkpoint details and what is searched elsewhere.	2019-07-22 11:30:33 +01:00
Benjamin Trent	2e303fc5f7	[ML][Data Frame] adding dynamic cluster setting for failure retries (#44577 ) (#44639 ) This adds a new dynamic cluster setting `xpack.data_frame.num_transform_failure_retries`. This setting indicates how many times non-critical failures should be retried before a data frame transform is marked as failed and should stop executing. At the time of this commit; Min: 0, Max: 100, Default: 10	2019-07-19 16:17:39 -05:00
Ryan Ernst	f193d14764	Convert remaining Action Response/Request to writeable.reader (#44528 ) (#44607 ) This commit converts readFrom to ctor with StreamInput on the remaining ActionResponse and ActionRequest classes. relates #34389	2019-07-19 13:33:38 -07:00
Benjamin Trent	3477f5ae04	muting test testBulkIndexFailuresCauseTaskToFail (#44594 )	2019-07-18 15:03:50 -05:00
Benjamin Trent	d5ca72740e	[ML][Data Frame] adjust onFinish audit frequency (#44450 ) (#44508 )	2019-07-18 14:28:34 -05:00
Benjamin Trent	858dbfc074	[ML][Data Frame] treat bulk index failures as an indexing failure (#44351 ) (#44427 ) * [ML][Data Frame] treat bulk index failures as an indexing failure * removing redundant public modifier * changing to an ElasticsearchException * fixing redundant public modifier	2019-07-16 10:04:28 -05:00
Hendrik Muhs	6c1f740759	[ML-DataFrame] make checkpointing more robust (#44344 ) (#44414 ) make checkpointing more robust: - do not let checkpointing fail if indexes got deleted - treat missing seqNoStats as just created indices (checkpoint 0) - loglevel: do not treat failed updated checks as error fixes #43992	2019-07-16 13:43:13 +02:00
Ryan Ernst	7e06888bae	Convert testclusters to use distro download plugin (#44253 ) (#44362 ) Test clusters currently has its own set of logic for dealing with finding different versions of Elasticsearch, downloading them, and extracting them. This commit converts testclusters to use the DistributionDownloadPlugin.	2019-07-15 17:53:05 -07:00
Ryan Ernst	59658daef9	Separate streamable based master node actions (#44313 ) This commit creates new base classes for master node actions whose response types still implement Streamable. This simplifies both finding remaining classes to convert, as well as creating new master node actions that use Writeable for their responses. relates #34389	2019-07-15 09:20:20 -07:00
Hendrik Muhs	684b562381	[7.x][ML-DataFrame] Rewrite continuous logic to prevent terms count limit (#44287 ) Rewrites how continuous data frame transforms calculates and handles buckets that require an update. Instead of storing the whole set in memory, it pages through the updates using a 2nd cursor. This lowers memory consumption and prevents problems with limits at query time (max_terms_count). The list of updates can be re-retrieved in a failure case (#43662)	2019-07-13 06:58:04 +02:00
Benjamin Trent	51ff6b420a	[ML][Data Frame] prevent task from attempting to run when failed (#44239 ) (#44292 )	2019-07-12 15:24:49 -05:00
Benjamin Trent	79c62fd724	[ML][Data Frame] Fixing default delay set in timesync (#44281 ) (#44293 ) * [ML][Data Frame] Fixing default delay set in timesync * disallowing explicit null, don't do duration check on write	2019-07-12 15:21:47 -05:00
Benjamin Trent	68cd675892	[ML][Data Frame] responding with 409 status code when failing _stop (#44231 ) (#44276 ) * [ML][Data Frame] responding with appropriate status code when failing _stop * adding null checks for persistent task data * addressing PR comments	2019-07-12 10:10:24 -05:00
Benjamin Trent	40cc081ad3	[ML][Data Frame] adds index validations to _start data frame transform (#44191 ) (#44227 ) * [ML][Data Frame] adds index validations to _start data frame transform * addressing pr comments	2019-07-11 12:50:50 -05:00
David Roberts	cb62d4acdf	[ML-DataFrame] Add a frequency option to transform config, default 1m (#44120 ) Previously a data frame transform would check whether the source index was changed every 10 seconds. Sometimes it may be desirable for the check to be done less frequently. This commit increases the default to 60 seconds but also allows the frequency to be overridden by a setting in the data frame transform config.	2019-07-10 09:59:00 +01:00
David Kyle	5fc12917c3	Data frame task failure does not make a 500 response (#44058 ) Data frame task responses had logic to return a HTTP 500 status code if there was any node or task failures even if other tasks in the same request reported correctly. This is different to how other task responses are handled where a 200 is always returned leaving the client should check for failures. Returning a 500 also breaks the high level rest client so always return a 200 Closes #44011	2019-07-08 11:53:11 +01:00
Hendrik Muhs	4128b9b4f7	audit message missing for autostop call onStop when auto stopping (#43984) fixes #43977	2019-07-04 21:40:42 +02:00
Benjamin Trent	7063a40411	[7.x] [ML][Data Frame] Adding bwc tests for pivot transform (#43506 ) (#43929 ) * [ML][Data Frame] Adding bwc tests for pivot transform (#43506) * [ML][Data Frame] Adding bwc tests for pivot transform * adding continuous transforms * adding continuous dataframes to bwc * adding continuous data frame tests * Adding rolling upgrade tests for continuous df * Fixing test * Adjusting indices used in BWC, and handling NPE for seq_no_stats * updating and muting specific bwc test * Adjusting bwc tests for backport	2019-07-03 16:39:38 -05:00
Benjamin Trent	fb825a6470	[7.x] [ML][Data Frame] add node attr to GET _stats (#43842 ) (#43894 ) * [ML][Data Frame] add node attr to GET _stats (#43842) * [ML][Data Frame] add node attr to GET _stats * addressing testing issues with node.attributes * adjusting for backport	2019-07-02 19:35:37 -05:00
Benjamin Trent	2c97e26ce8	[ML][Data Frame] fix progress measurement for continuous transforms (#43838 ) (#43887 ) * [ML][Data Frame] fix progress measurement for continuous transforms * Update DataFrameIndexer.java	2019-07-02 19:35:09 -05:00
Benjamin Trent	b95ee7ebb2	[7.x] [ML][Data Frame] using transform creation version for node assignment (#43764 ) (#43843 ) * [ML][Data Frame] using transform creation version for node assignment (#43764) * [ML][Data Frame] using transform creation version for node assignment * removing unused imports * Addressing PR comment * adjusing for backport	2019-07-02 06:52:34 -05:00
Benjamin Trent	82c1ddc117	[7.x] [ML][Data Frame] Add deduced mappings to _preview response payload (#43742 ) (#43849 ) * [ML][Data Frame] Add deduced mappings to _preview response payload (#43742) * [ML][Data Frame] Add deduced mappings to _preview response payload * updating preview docs * fixing code for backport	2019-07-02 06:52:14 -05:00
Benjamin Trent	8108834534	[ML][Data Frame] account for delay in writing stats docs (#43703 ) (#43819 )	2019-07-01 09:14:44 -05:00
Benjamin Trent	4c95c0c456	[ML][Data Frame] reduce audit frequency, change log msg, and level (#43771 ) (#43818 )	2019-07-01 09:14:26 -05:00
Ryan Ernst	3a2c698ce0	Rename Action to ActionType (#43778 ) Action is a class that encapsulates meta information about an action that allows it to be called remotely, specifically the action name and response type. With recent refactoring, the action class can now be constructed as a static constant, instead of needing to create a subclass. This makes the old pattern of creating a singleton INSTANCE both misnamed and lacking a common placement. This commit renames Action to ActionType, thus allowing the old INSTANCE naming pattern to be TYPE on the transport action itself. ActionType also conveys that this class is also not the action itself, although this change does not rename any concrete classes as those will be removed organically as they are converted to TYPE constants. relates #34389	2019-06-30 22:00:17 -07:00
Benjamin Trent	67a3c656c3	[7.x] [ML][Data Frame] removing format support (#43659 ) (#43747 ) * [ML][Data Frame] removing format support (#43659) * Fixing conflicts	2019-06-28 10:02:37 -05:00
Przemysław Witek	ba518722a2	[7.x] [ML] Tag destination index with data frame metadata (#43567 ) (#43660 )	2019-06-27 08:08:39 +02:00
Benjamin Trent	d05593c3ad	[ML][Data Frame] adds tests for continuous DF (#43601 ) (#43654 )	2019-06-26 14:59:19 -05:00
David Kyle	e1f761dfc7	[Ml Data Frame] Size the GET stats search by number of Ids requested (#43206 ) Set the size of the search request to the number of ids limited by 10,000	2019-06-26 17:01:12 +01:00
Benjamin Trent	c121b00c98	[7.x] [ML][Data Frame] Add support for allow_no_match for endpoints (#43490 ) (#43637 ) * [ML][Data Frame] Add support for allow_no_match for endpoints (#43490) * [ML][Data Frame] Add support for allow_no_match parameter in endpoints Adds support for: * Get Transforms * Get Transforms stats * stop transforms * Update DataFrameTransformDocumentationIT.java	2019-06-26 10:09:56 -05:00
Dimitris Athanasiou	126c2fd2d5	[7.x][ML] Machine learning data frame analytics (#43544 ) (#43592 ) This merges the initial work that adds a framework for performing machine learning analytics on data frames. The feature is currently experimental and requires a platinum license. Note that the original commits can be found in the `feature-ml-data-frame-analytics` branch. A new set of APIs is added which allows the creation of data frame analytics jobs. Configuration allows specifying different types of analysis to be performed on a data frame. At first there is support for outlier detection. The APIs are: - PUT _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id} - GET _ml/data_frame/analysis/{id}/_stats - POST _ml/data_frame/analysis/{id}/_start - POST _ml/data_frame/analysis/{id}/_stop - DELETE _ml/data_frame/analysis/{id} When a data frame analytics job is started a persistent task is created and started. The main steps of the task are: 1. reindex the source index into the dest index 2. analyze the data through the data_frame_analyzer c++ process 3. merge the results of the process back into the destination index In addition, an evaluation API is added which packages commonly used metrics that provide evaluation of various analysis: - POST _ml/data_frame/_evaluate	2019-06-25 20:29:11 +03:00
Benjamin Trent	970e157eac	[ML][Data Frame] Adjusting error message (#43455 ) (#43580 ) * Adjusting error message * Update TransportPutDataFrameTransformAction.java * Update TransportPutDataFrameTransformAction.java	2019-06-25 10:09:39 -05:00
Martijn van Groningen	101cf384ba	Replace Streamable w/ Writable in AcknowledgedResponse and subclasses (backport 7.x) (#43525 ) This commit replaces usages of Streamable with Writeable for the AcknowledgedResponse and its subclasses, plus associated actions. Note that where possible response fields were made final and default constructors were removed. This is a large PR, but the change is mostly mechanical. Relates to #34389 Backport of #43414	2019-06-24 13:47:37 +02:00
Benjamin Trent	f4b75d6d14	[7.x] [ML][Data Frame] Add version and create_time to transform config (#43384 ) (#43480 ) * [ML][Data Frame] Add version and create_time to transform config (#43384) * [ML][Data Frame] Add version and create_time to transform config * s/transform_version/version s/Date/Instant * fixing getter/setter for version * adjusting for backport	2019-06-21 09:11:44 -05:00
Benjamin Trent	77ce3260dd	[ML][Data Frame] make response.count be total count of hits (#43241 ) (#43389 ) * [ML][Data Frame] make response.count be total count of hits * addressing line length check * changing response count for filters * adjusting serialization, variable name, and total count logic * making count mandatory for creation	2019-06-19 16:19:06 -05:00
Benjamin Trent	b333ced5a7	[7.x] [ML][Data Frame] adds new pipeline field to dest config (#43124 ) (#43388 ) * [ML][Data Frame] adds new pipeline field to dest config (#43124) * [ML][Data Frame] adds new pipeline field to dest config * Adding pipeline support to _preview * removing unused import * moving towards extracting _source from pipeline simulation * fixing permission requirement, adding _index entry to doc * adjusting for java 8 compatibility * adjusting bwc serialization version to 7.3.0	2019-06-19 16:18:27 -05:00
Benjamin Trent	365f87c622	[ML][Data Frame] only complete task after state persistence (#43230 ) (#43294 ) * [ML][Data Frame] only complete task after state persistence There is a race condition where the task could be completed, but there is still a pending document write. This change moves the task cancellation into the actionlistener of the state persistence. intermediate commit intermediate commit * removing unused import * removing unused const * refreshing internal index after waiting for task to complete * adjusting test data generation	2019-06-17 16:49:00 -05:00
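
Note on commit 3a71b91dca (`geo_bounds` support): the message says the two points returned by the aggregation become a `geo_shape` whose type depends on how similar the points are. The rule can be sketched as below. This is only an illustration of the rule as stated in the commit message, not the code from the commit; the function name `deduce_shape_type` and the `(lat, lon)` tuple representation are assumptions made for the example.

```python
from typing import Tuple

# Hypothetical representation: a point is a (latitude, longitude) pair.
Point = Tuple[float, float]


def deduce_shape_type(first: Point, second: Point) -> str:
    """Return the geo_shape type for the two geo_bounds corner points,
    following the rule stated in the commit message."""
    if first == second:
        # Both corners are the same location.
        return "point"
    if first[0] == second[0] or first[1] == second[1]:
        # The corners share a latitude or a longitude.
        return "linestring"
    # The corners differ in both latitude and longitude.
    return "polygon"
```

For example, `deduce_shape_type((10.0, 20.0), (10.0, 25.0))` returns `"linestring"` because the two points share a latitude.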


158 Commits