The date_histogram accepts an interval which can be either a calendar
interval (DST-aware, leap seconds, arbitrary length of months, etc) or a
fixed interval (strict multiples of SI units). Unfortunately, the type is
inferred by first trying to parse the value as a calendar interval, then
falling back to a fixed interval if that fails.
This leads to a confusing arrangement where `1d` == calendar but
`2d` == fixed. And if you want a day of fixed time, you have to
specify `24h` (i.e. the next smallest unit). This arrangement is very
error-prone for users.
This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
`calendar_interval` only accepts calendar intervals, `fixed_interval` accepts any combination of
units (meaning `1d` can be used to specify `24h` of fixed time), and the two
are mutually exclusive.
The old interval parameter is deprecated and now emits a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the
old dual-purpose interval will be removed.
The change applies to both the REST and Java clients.
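A minimal Java-client sketch of the two new setters (assuming the `DateHistogramAggregationBuilder` methods named after the new parameters):

```java
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.histogram.DateHistogramInterval;

public class IntervalExamples {
    public static void main(String[] args) {
        // Calendar interval: DST-aware, variable month lengths; only single calendar units are accepted.
        DateHistogramAggregationBuilder calendar = AggregationBuilders.dateHistogram("by_day")
            .field("timestamp")
            .calendarInterval(DateHistogramInterval.DAY);    // "1d" as one calendar day

        // Fixed interval: strict multiples of SI units; "2d" here simply means 48 hours of fixed time.
        DateHistogramAggregationBuilder fixed = AggregationBuilders.dateHistogram("by_two_days")
            .field("timestamp")
            .fixedInterval(new DateHistogramInterval("2d"));

        // The old dual-purpose setter is deprecated and mutually exclusive with the two above.
    }
}
```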
* [ML] adding pivot.size option for setting paging size
* Changing field name to address PR comments
* fixing ctor usage
* adjust hlrc for field name change
* [ML] properly nesting objects in document source
* Throw exception on agg extraction failure, causing the data frame to fail
* throwing error to stop the data frame if an unsupported agg is found
Direct the task request to the node executing the task, and refactor the task responses
so that all errors are returned and the HTTP status code is set based on the presence of errors.
The run task is supposed to run Elasticsearch with the given plugin or
module. However, for modules, this is most realistic if using the full
distribution. This commit changes the run setup to use the default or
oss distribution as appropriate.
* [ML] Adds progress reporting for transforms
* fixing after master merge
* Addressing PR comments
* removing unused imports
* Adjusting afterKey handling and reporting the percentage on a 0-100 scale
* Making sure it is a linked hashmap for serialization
* removing unused import
* addressing PR comments
* removing unused import
* simplifying code, only storing total docs and decrementing
* adjusting for rewrite
* removing initial progress gathering from executor
* [ML] Add mappings, serialization, and hooks to persist stats
* Adding tests for transforms without tasks having stats persisted
* intermediate commit
* Adjusting usage stats to account for stored stats docs
* Adding tests for id expander
* Addressing PR comments
* removing unused import
* adding shard failures to the task response
* [ML] Add data frame task state object and field
* A new state item is added so that the overall task state can be
accounted for (a rough example of its shape is sketched after this list)
* A new FAILED state and reason have been added as well so that failures
can be shown to the user for optional correction
* Addressing PR comments
* adjusting after master merge
* addressing pr comment
* Adjusting auditor usage with failure state
* Refactor, renamed state items to task_state and indexer_state
* Adding todo and removing redundant auditor call
* Address HLRC changes and PR comment
* adjusting hlrc IT test
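A rough, hypothetical rendering of the new state (only `task_state`, `indexer_state`, and the failure reason come from the description above; the values are illustrative):

```java
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.xcontent.XContentBuilder;
import org.elasticsearch.common.xcontent.XContentFactory;

public class TaskStateSketch {
    public static void main(String[] args) throws Exception {
        XContentBuilder state = XContentFactory.jsonBuilder()
            .startObject()
                .field("task_state", "failed")            // overall task state, including the new FAILED value
                .field("indexer_state", "stopped")        // state of the underlying indexer
                .field("reason", "bulk indexing failed")  // surfaced so the user can correct and retry
            .endObject();
        System.out.println(Strings.toString(state));
    }
}
```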
Create and use unique, deterministic document ids based on the grouping values.
This is a prerequisite for updating documents as well as for preventing duplicates after a hard failure during indexing.
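A minimal sketch of the idea (not the actual implementation): hash the grouping values in a stable order so the same group always maps to the same document id.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;

public final class GroupKeyDocIds {
    // Derive a stable document id from the group-by values so that re-running the transform
    // overwrites the same documents instead of creating duplicates.
    public static String docId(Map<String, Object> groupByValues) throws Exception {
        StringBuilder key = new StringBuilder();
        // Sort by group name so the id does not depend on map iteration order.
        for (Map.Entry<String, Object> entry : new TreeMap<>(groupByValues).entrySet()) {
            key.append(entry.getKey()).append('=').append(entry.getValue()).append(';');
        }
        byte[] hash = MessageDigest.getInstance("SHA-256")
            .digest(key.toString().getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
    }
}
```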
* [ML] make source and dest objects in the transform config (example shape sketched after this list)
* addressing PR comments
* Fixing compilation post merge
* adding comment for Arrays.hashCode
* addressing changes for moving dest to object
* fixing data_frame yml tests
* fixing API test
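A hypothetical request showing the new object form of `source` and `dest` (endpoint and field names are assumptions based on the description above):

```java
import org.elasticsearch.client.Request;

public class PutTransformSketch {
    public static Request putTransform() {
        Request request = new Request("PUT", "/_data_frame/transforms/my_transform");
        request.setJsonEntity(
            "{\n" +
            "  \"source\": { \"index\": \"source_index\" },\n" +  // formerly a plain index name string
            "  \"dest\": { \"index\": \"dest_index\" }\n" +       // now an object as well
            "}");                                                  // other required parts of the config omitted
        return request;
    }
}
```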
Add a checkpoint service for data frame transforms, which makes it possible to ask for a checkpoint of the
source. In the future these checkpoints will be stored in the internal index to
- detect upstream changes
- update the data frame without a full re-run
- allow data frame clients to checkpoint themselves
* [Data Frame] Refactor GET Transforms API:
* Add pagination
* comma-delimited list expression support for GET transforms (usage sketched after this list)
* Flag troublesome internal code for future refactor
* Removing `allow_no_transforms` param, ratcheting down the page param options
* Changing DataFrameFeatureSet#usage to not get all configs
* Intermediate commit
* Writing test for batch data gatherer
* Removing unused import
* removing bad println used for debugging
* Updating BatchedDataIterator comments and query
* addressing pr comments
* disallow null scrollId, which could cause a stack overflow
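A sketch of the expanded GET, assuming `from`/`size` paging parameters and comma-delimited id expressions as described above:

```java
import org.elasticsearch.client.Request;

public class GetTransformsSketch {
    public static Request getTransforms() {
        // Comma-delimited ids (wildcards included) select which stored configs to return.
        Request request = new Request("GET", "/_data_frame/transforms/transform-1,transform-2*");
        request.addParameter("from", "0");   // pagination over the stored configs
        request.addParameter("size", "20");
        return request;
    }
}
```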
* [Data Frame] Refactor PUT transform (lifecycle sketched after this list) such that:
* POST _start creates the task and starts it
* GET transforms queries docs instead of tasks
* POST _stop verifies the stored config exists before trying to stop
the task
* Addressing PR comments
* Refactoring DataFrameFeatureSet#usage, decreasing the size returned by getTransformConfigurations
* fixing failing usage test
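The resulting lifecycle, sketched with the low-level REST client (endpoints are assumptions based on the description above):

```java
import org.elasticsearch.client.Request;

public class TransformLifecycleSketch {
    public static Request put() {
        return new Request("PUT", "/_data_frame/transforms/my_transform");          // stores the config as a document
    }
    public static Request start() {
        return new Request("POST", "/_data_frame/transforms/my_transform/_start");  // creates the task and starts it
    }
    public static Request get() {
        return new Request("GET", "/_data_frame/transforms/my_transform");          // queries stored configs, not tasks
    }
    public static Request stop() {
        return new Request("POST", "/_data_frame/transforms/my_transform/_stop");   // verifies the stored config exists first
    }
}
```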
This change adds two new cluster privileges:
* manage_data_frame_transforms
* monitor_data_frame_transforms
And two new built-in roles:
* data_frame_transforms_admin
* data_frame_transforms_user
These permit access to the data frame transform endpoints.
(Index privileges are also required on the source and
destination indices for each data frame transform, but
since these indices are configurable it is not
appropriate to grant them via built-in roles.)
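A sketch of how a custom role might combine the new cluster privilege with the per-transform index privileges (role name, index names, and the exact index privileges are illustrative):

```java
import org.elasticsearch.client.Request;

public class TransformRoleSketch {
    public static Request createRole() {
        Request request = new Request("PUT", "/_security/role/my_transform_admin");
        request.setJsonEntity(
            "{\n" +
            "  \"cluster\": [ \"manage_data_frame_transforms\" ],\n" +
            "  \"indices\": [\n" +
            "    { \"names\": [ \"source_index\" ], \"privileges\": [ \"read\" ] },\n" +
            "    { \"names\": [ \"dest_index\" ], \"privileges\": [ \"create_index\", \"index\", \"read\" ] }\n" +
            "  ]\n" +
            "}");
        return request;
    }
}
```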
Fix a couple of odd behaviors of the data frame transforms REST APIs:
- check that the id from the body and the id from the URL match if both are specified (sketched below)
- do not allow a body for delete
- allow get and stats without specifying an id
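A hypothetical version of the id consistency check (all names are illustrative, not the actual handler code):

```java
public final class TransformIdCheck {
    public static String resolveId(String idFromUrl, String idFromBody) {
        if (idFromUrl != null && idFromBody != null && idFromUrl.equals(idFromBody) == false) {
            throw new IllegalArgumentException("transform id from the URL [" + idFromUrl
                + "] does not match the id from the body [" + idFromBody + "]");
        }
        // When no id is given at all, GET and _stats fall back to matching all transforms.
        return idFromUrl != null ? idFromUrl : idFromBody;
    }
}
```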
The SchedulerEngine is used in several places in our code and not all
of these usages properly stopped the SchedulerEngine, which could lead
to test failures due to leaked threads from the SchedulerEngine. This
change adds stopping to these usages in order to avoid the thread leaks
that cause CI failures and noise.
Closes #38875
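A minimal sketch of the pattern applied in the affected usages, assuming only the `SchedulerEngine` stop call implied above; the surrounding scaffolding is illustrative:

```java
import org.elasticsearch.xpack.core.scheduler.SchedulerEngine;

public class SchedulerEngineUsageSketch {
    private final SchedulerEngine schedulerEngine;

    public SchedulerEngineUsageSketch(SchedulerEngine schedulerEngine) {
        this.schedulerEngine = schedulerEngine;
    }

    public void runAndStop(Runnable work) {
        try {
            work.run();               // exercise the component that schedules jobs
        } finally {
            schedulerEngine.stop();   // always stop so the scheduler's threads do not leak
        }
    }
}
```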
The data frame plugin allows users to create feature indexes by pivoting a source index. In a
nutshell, this can be understood as reindex with support for aggregations, or as similar to so-called
entity-centric indexing.
Full history is provided in: feature/data-frame-transforms
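As a loose analogy (not the plugin's actual code), pivoting amounts to running a composite aggregation like the one below over the source index and indexing each resulting bucket as a document in the feature index:

```java
import java.util.ArrayList;
import java.util.List;

import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.composite.CompositeAggregationBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.CompositeValuesSourceBuilder;
import org.elasticsearch.search.aggregations.bucket.composite.TermsValuesSourceBuilder;

public class PivotAnalogy {
    public static CompositeAggregationBuilder pivotLikeAggregation() {
        List<CompositeValuesSourceBuilder<?>> groupBy = new ArrayList<>();
        groupBy.add(new TermsValuesSourceBuilder("user").field("user_id"));
        // One composite bucket per user_id; each bucket becomes one document in the destination index.
        return new CompositeAggregationBuilder("pivot", groupBy)
            .subAggregation(AggregationBuilders.sum("total_spend").field("price"));
    }
}
```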