OpenSearch

Commit Graph

Author	SHA1	Message	Date
Benjamin Trent	b333ced5a7	[7.x] [ML][Data Frame] adds new pipeline field to dest config (#43124 ) (#43388 ) * [ML][Data Frame] adds new pipeline field to dest config (#43124) * [ML][Data Frame] adds new pipeline field to dest config * Adding pipeline support to _preview * removing unused import * moving towards extracting _source from pipeline simulation * fixing permission requirement, adding _index entry to doc * adjusting for java 8 compatibility * adjusting bwc serialization version to 7.3.0	2019-06-19 16:18:27 -05:00
Alpar Torok	cce5b0f018	Convert dataframes to use testclusters (#43032 )	2019-06-14 11:02:39 +03:00
Benjamin Trent	aff4795441	[ML][Data Frame] cleaning up tests since tasks are cancelled onfinish (#43136 ) (#43166 ) * [ML][Data Frame] cleaning up usage test since tasks are cancelled onfinish * Update DataFrameUsageIT.java * Fixing additional test, waiting for task to complete * removing unused import * unmuting test	2019-06-12 14:39:38 -05:00
David Kyle	597ae5c7b8	[ML DataFrame] Reject Data Frame Ids containing upper case characters (#43145 )	2019-06-12 18:13:18 +01:00
Benjamin Trent	e384bf0276	[ML-DataFrame] stop task at completion of data frame function (#42955 ) (#43114 ) * stop data frame task after it finishes * test auto stop * adapt tests * persist the state correctly and move stop into listener * Calling `onStop` even if persistence fails, changing `stop` to rely on doSaveState	2019-06-11 15:55:02 -05:00
Benjamin Trent	755ba72896	[ML][Data frame] make sure that fields exist when creating progress (#42943 ) (#42984 )	2019-06-07 10:13:18 -05:00
Mark Vieira	e44b8b1e2e	[Backport] Remove dependency substitutions 7.x (#42866 ) * Remove unnecessary usage of Gradle dependency substitution rules (#42773) (cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)	2019-06-04 13:50:23 -07:00
Benjamin Trent	0253927ec4	[ML Data Frame] Refactor stop logic (#42644 ) (#42763 ) * Revert "invalid test" This reverts commit 9dd8b52c13c716918ff97e6527aaf43aefc4695d. * Testing * mend * Revert "[ML Data Frame] Mute Data Frame tests" This reverts commit 5d837fa312b0e41a77a65462667a2d92d1114567. * Call onStop and onAbort outside atomic update * Don’t update CS * Tidying up * Remove invalid test that asserted logic that has been removed * Add stopped event * Revert "Add stopped event" This reverts commit 02ba992f4818bebd838e1c7678bd2e1cc090bfab. * Adding check for STOPPED in saveState	2019-06-03 06:53:44 -05:00
Benjamin Trent	f22dcfb9da	[ML] [Data Frame] nesting group_by fields like other aggs (#42718 ) (#42760 )	2019-05-31 10:55:35 -05:00
Benjamin Trent	b5527b3278	[ML] [Data Frame] add support for weighted_avg agg (#42646 ) (#42714 )	2019-05-30 12:05:35 -05:00
Hendrik Muhs	345ff21ae5	[ML-DataFrame] rewrite start and stop to answer with acknowledged (#42589 ) rewrite start and stop to answer with acknowledged fixes #42450	2019-05-29 11:14:32 +02:00
Hendrik Muhs	6d47ee9268	[ML-DataFrame] add support for fixed_interval, calendar_interval, remove interval (#42427 ) * add support for fixed_interval, calendar_interval, remove interval * adapt HLRC * checkstyle * add a hlrc to server test * adapt yml test * improve naming and doc * improve interface and add test code for hlrc to server * address review comments * repair merge conflict * fix date patterns * address review comments * remove assert for warning * improve exception message * use constants	2019-05-24 20:30:17 +02:00
Hendrik Muhs	7cee294acf	[ML-DataFrame]backport dataframe changes from 42202, using client instead of transport (#42468 ) backport dataframe changes from #42202, using client instead of transport	2019-05-24 11:05:30 +02:00
David Kyle	f696769a39	Mute Data Frame integration tests Relates to https://github.com/elastic/elasticsearch/issues/42344	2019-05-22 15:03:13 +01:00
David Kyle	7e4d3c695b	[ML Data Frame] Persist and restore checkpoint and position (#41942 ) Persist and restore Data frame's current checkpoint and position	2019-05-21 18:57:13 +01:00
David Kyle	24144aead2	[ML] Complete the Data Frame task on stop (#41752 ) (#42063 ) Wait for indexer to stop then complete the persistent task on stop. If the wait_for_completion is true the request will not return until stopped.	2019-05-21 10:24:20 +01:00
Zachary Tong	6ae6f57d39	[7.x Backport] Force selection of calendar or fixed intervals (#41906 ) The date_histogram accepts an interval which can be either a calendar interval (DST-aware, leap seconds, arbitrary length of months, etc) or fixed interval (strict multiples of SI units). Unfortunately this is inferred by first trying to parse as a calendar interval, then falling back to fixed if that fails. This leads to a confusing arrangement where `1d` == calendar, but `2d` == fixed. And if you want a day of fixed time, you have to specify `24h` (i.e. the next smallest unit). This arrangement is very error-prone for users. This PR adds `calendar_interval` and `fixed_interval` parameters to any code that uses intervals (date_histogram, rollup, composite, datafeed, etc). Calendar only accepts calendar intervals, fixed accepts any combination of units (meaning `1d` can be used to specify `24h` in fixed time), and both are mutually exclusive. The old interval behavior is deprecated and will throw a deprecation warning. It is also mutually exclusive with the two new parameters. In the future the old dual-purpose interval will be removed. The change applies to both REST and java clients.	2019-05-20 12:07:29 -04:00
Benjamin Trent	f2447364fd	[ML] adds geo_centroid aggregation support to data frames (#42088 ) (#42094 )	2019-05-17 16:51:05 -04:00
Benjamin Trent	febee07dcc	[ML] adding pivot.max_search_page_size option for setting paging size (#41920 ) (#42079 ) * [ML] adding pivot.size option for setting paging size * Changing field name to address PR comments * fixing ctor usage * adjust hlrc for field name change	2019-05-10 13:22:31 -05:00
Hendrik Muhs	0c03707704	[ML-DataFrame] reset/clear the position after indexer is done (#41736 ) reset/clear the position after indexer is done	2019-05-06 09:41:51 +02:00
Benjamin Trent	a70f796edd	[ML] fix array oob in IDGenerator and adjust format for mapping (#41703 ) (#41717 ) * [ML] fix array oob in IDGenerator and adjust format for mapping * Update DataFramePivotRestIT.java	2019-05-02 11:09:42 -05:00
Benjamin Trent	92a820bc1a	[ML] Add bucket_script agg support to data frames (#41594 ) (#41639 )	2019-04-29 10:14:17 -05:00
Benjamin Trent	a0990ca239	[ML] cleanup + adding description field to transforms (#41554 ) (#41605 ) * [ML] cleanup + adding description field to transforms * making description length have a max of 1k	2019-04-26 16:50:59 -05:00
Benjamin Trent	08843ba62b	[ML] Adds progress reporting for transforms (#41278 ) (#41529 ) * [ML] Adds progress reporting for transforms * fixing after master merge * Addressing PR comments * removing unused imports * Adjusting afterKey handling and percentage to be 100* * Making sure it is a linked hashmap for serialization * removing unused import * addressing PR comments * removing unused import * simplifying code, only storing total docs and decrementing * adjusting for rewrite * removing initial progress gathering from executor	2019-04-25 11:23:12 -05:00
Benjamin Trent	e2f8ffdde8	[ML][Data Frame] Moving destination creation to _start (#41416 ) (#41433 ) * [ML][Data Frame] Moving destination creation to _start * slight refactor of DataFrameAuditor constructor	2019-04-23 09:32:57 -05:00
Hendrik Muhs	02247cc7df	[ML-DataFrame] adapt page size on circuit breaker responses (#41149 ) handle circuit breaker response and adapt page size to reduce memory pressure, reduce preview buckets to 100, initial page size to 500	2019-04-16 19:49:43 +02:00
Benjamin Trent	9e32e36799	[ML] fixing test related to #40963 (#41074 ) (#41116 )	2019-04-11 11:19:56 -05:00
Hendrik Muhs	f9018ab11b	[ML-DataFrame] create checkpoints on every new run (#40725 ) Use the checkpoint service to create a checkpoint on every new run. Expose checkpoints stats on _stats endpoint.	2019-04-10 09:14:11 +02:00
Julie Tibshirani	0702c72151	Mute DataFrameGetAndGetStatsIT#testGetPersistedStatsWithoutTask. Tracked in #40963.	2019-04-09 16:39:16 -07:00
Benjamin Trent	a8dbb07546	[ML] Changes default destination index field mapping and adds scripted_metric agg (#40750 ) (#40846 ) * [ML] Allowing destination index mappings to have dynamic types, adds script_metric agg * Making dynamic|source mapping explicit	2019-04-05 11:34:20 -05:00
Benjamin Trent	945e7ca01e	[ML] Periodically persist data-frame running statistics to internal index (#40650 ) (#40729 ) * [ML] Add mappings, serialization, and hooks to persist stats * Adding tests for transforms without tasks having stats persisted * intermittent commit * Adjusting usage stats to account for stored stats docs * Adding tests for id expander * Addressing PR comments * removing unused import * adding shard failures to the task response	2019-04-02 14:16:55 -05:00
Benjamin Trent	29180cefac	[ML] fix test check as randomness allows for different hours (#40536 ) (#40727 ) * [ML] fix test check as randomness allows for different hours * Re-enabling test	2019-04-02 12:33:35 -05:00
Benjamin Trent	655e3d8f75	[ML] fix test, should account for async nature of audit (#40637 ) (#40683 )	2019-04-01 10:00:32 -05:00
Luca Cavanna	873c5638e6	Mute DataFrameAuditorIT#testAuditorWritesAudits Relates to #40594	2019-03-28 16:53:54 +01:00
David Kyle	13d4d73ce3	Mute DataFrameTaskFailedStateIT.testFailureStateInteraction (#40544 )	2019-03-27 17:39:44 +00:00
Benjamin Trent	be67752c34	Muting test related to #40537 (#40539 )	2019-03-27 11:47:09 -05:00
Benjamin Trent	12943c5d2c	[ML] Add data frame task state object and field (#40169 ) (#40490 ) * [ML] Add data frame task state object and field * A new state item is added so that the overall task state can be accounted for * A new FAILED state and reason have been added as well so that failures can be shown to the user for optional correction * Addressing PR comments * adjusting after master merge * addressing pr comment * Adjusting auditor usage with failure state * Refactor, renamed state items to task_state and indexer_state * Adding todo and removing redundant auditor call * Address HLRC changes and PR comment * adjusting hlrc IT test	2019-03-27 06:53:58 -05:00
Benjamin Trent	7b4f964708	[ML] make source and dest objects in the transform config (#40337 ) (#40396 ) * [ML] make source and dest objects in the transform config * addressing PR comments * Fixing compilation post merge * adding comment for Arrays.hashCode * addressing changes for moving dest to object * fixing data_frame yml tests * fixing API test	2019-03-25 07:16:41 -05:00
Hendrik Muhs	38afc9f27d	refresh audit index before searching (#40401 ) refresh the audit index before searching	2019-03-25 11:57:57 +01:00
Benjamin Trent	a30bf27b2f	[ML] add auditor to data frame plugin (#40012 ) (#40394 ) * [Data Frame] add auditor * Adjusting Level, Auditor, and message to address pr comments * Addressing PR comments	2019-03-23 18:56:44 -05:00
Benjamin Trent	2dd879abac	[ML] adds support for non-numeric mapped types (#40220 ) (#40380 ) * [ML] adds support for non-numeric mapped types and mapping overrides * correcting hlrc compilation issues after merge * removing mapping_override option * clearing up unnecessary changes	2019-03-23 14:04:14 -05:00
Hendrik Muhs	5a0c32833e	Add a checkpoint service for data frame transforms (#39836 ) Add a checkpoint service for data frame transforms, which allows to ask for a checkpoint of the source. In future these checkpoints will be stored in the internal index to - detect upstream changes - updating the data frame without a full re-run - allow data frame clients to checkpoint themselves	2019-03-22 10:25:30 +01:00
Benjamin Trent	8c6ff5de31	[Data Frame] Refactor PUT transform to not create a task (#39934 ) (#40010 ) * [Data Frame] Refactor PUT transform such that: * POST _start creates the task and starts it * GET transforms queries docs instead of tasks * POST _stop verifies the stored config exists before trying to stop the task * Addressing PR comments * Refactoring DataFrameFeatureSet#usage, decreasing size returned getTransformConfigurations * fixing failing usage test	2019-03-13 17:08:15 -05:00
David Roberts	e94d32d069	Add roles and cluster privileges for data frame transforms (#39661 ) This change adds two new cluster privileges: * manage_data_frame_transforms * monitor_data_frame_transforms And two new built-in roles: * data_frame_transforms_admin * data_frame_transforms_user These permit access to the data frame transform endpoints. (Index privileges are also required on the source and destination indices for each data frame transform, but since these indices are configurable it is not appropriate to grant them via built-in roles.)	2019-03-05 14:07:25 +00:00
David Kyle	894ecb244d	[ML-Dataframe] Move dataframe actions into core (#39548 )	2019-03-01 10:45:36 +00:00
Hendrik Muhs	30e5c11cc2	[ML-DataFrame] Dataframe REST cleanups (#39451 ) (#39503 ) fix a couple of odd behaviors of data frame transforms REST API's: - check if id from body and id from URL match if both are specified - do not allow a body for delete - allow get and stats without specifying an id	2019-02-28 13:00:37 +01:00
Benjamin Trent	3262d6c917	[ML-DataFrame] Add _preview endpoint (#38924 ) (#39319 ) * [DATA-FRAME] add preview endpoint * adjusting preview tests and fixing parser * adjusting preview transport * remove unused import * adjusting test * Addressing PR comments * Fixing failing test and adjusting for pr comments * fixing integration test	2019-02-22 10:55:38 -06:00
Hendrik Muhs	4f662bd289	Add data frame feature (#38934 ) (#39029 ) The data frame plugin allows users to create feature indexes by pivoting a source index. In a nutshell this can be understood as reindex supporting aggregations or similar to the so called entity centric indexing. Full history is provided in: feature/data-frame-transforms	2019-02-18 11:07:29 +01:00

48 Commits
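
The interval rule described in commit 6ae6f57d39 above (exactly one of `calendar_interval` or `fixed_interval` on a date_histogram) can be illustrated with a minimal sketch. The helper name `date_histogram` and the field name `timestamp` are assumptions made for illustration; this is not client or server code from the repository, only a plain-Python model of the mutual-exclusion rule the commit message states.

```python
from typing import Optional


def date_histogram(field: str,
                   calendar_interval: Optional[str] = None,
                   fixed_interval: Optional[str] = None) -> dict:
    """Build a date_histogram aggregation body (hypothetical helper).

    Mirrors the rule from the commit message: the two interval
    parameters are mutually exclusive, so exactly one must be given.
    """
    if (calendar_interval is None) == (fixed_interval is None):
        raise ValueError(
            "specify exactly one of calendar_interval or fixed_interval")
    body = {"field": field}
    if calendar_interval is not None:
        body["calendar_interval"] = calendar_interval
    else:
        body["fixed_interval"] = fixed_interval
    return {"date_histogram": body}


# A calendar day (DST-aware) versus a fixed span of two days.
# "timestamp" is an assumed field name.
calendar_day = date_histogram("timestamp", calendar_interval="1d")
fixed_two_days = date_histogram("timestamp", fixed_interval="2d")
```

Passing both parameters, or neither, raises `ValueError` in this sketch, which is one way to model the "mutually exclusive" wording; how the real request parser reports the conflict is not shown in the commit message.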