67 Commits

Author SHA1 Message Date
Ryan Ernst
6fd8924c5a Switch run task to use real distro (#41590)
The run task is supposed to run elasticsearch with the given plugin or
module. However, for modules, this is most realistic if using the full
distribution. This commit changes the run setup to use the default or
oss as appropriate.
2019-05-06 12:34:07 -07:00
Benjamin Trent
50fc27e9a0
[ML] addresses preview bug, and adds check to PUT (#41803) (#41850) 2019-05-06 10:56:26 -05:00
Hendrik Muhs
d54a921032 remove unused import 2019-05-06 10:14:35 +02:00
Hendrik Muhs
0c03707704 [ML-DataFrame] reset/clear the position after indexer is done (#41736)
reset/clear the position after indexer is done
2019-05-06 09:41:51 +02:00
Benjamin Trent
b69e28177b
[ML] rewriting stats gathering to use callbacks instead of a latch (#41793) (#41804) 2019-05-03 18:18:27 -05:00
Hendrik Muhs
00af42fefe move checkpoints into x-pack core and introduce base classes for data frame tests (#41783)
move checkpoints into x-pack core and introduce base classes for data frame tests
2019-05-03 14:16:25 +02:00
Hendrik Muhs
befe2a45b9 [ML-DataFrame] refactor pivot to only take the pivot config (#41763)
refactor pivot class to only take the config at construction, other parameters are passed in as part of
method that require them
2019-05-03 13:37:51 +02:00
Benjamin Trent
33b4032fab
[ML] Correct indexer state on task re-allocation (#41724) (#41751) 2019-05-02 12:01:59 -05:00
Benjamin Trent
a70f796edd
[ML] fix array oob in IDGenerator and adjust format for mapping (#41703) (#41717)
* [ML] fix array oob in IDGenerator and adjust format for mapping

* Update DataFramePivotRestIT.java
2019-05-02 11:09:42 -05:00
Hendrik Muhs
be7ec5a47a simplify indexer by moving members to base class (#41741)
simplify indexer by moving members to base class
2019-05-02 16:08:08 +02:00
Benjamin Trent
92a820bc1a
[ML] Add bucket_script agg support to data frames (#41594) (#41639) 2019-04-29 10:14:17 -05:00
Benjamin Trent
a0990ca239
[ML] cleanup + adding description field to transforms (#41554) (#41605)
* [ML] cleanup + adding description field to transforms

* making description length have a max of 1k
2019-04-26 16:50:59 -05:00
Benjamin Trent
3ccb48e516
[ML] data frame, verify primary shards are active for configs index before task start (#41551) (#41580) 2019-04-26 10:23:43 -05:00
Benjamin Trent
4836ff7bcd
[ML] add multi node integ tests for data frames (#41508) (#41552)
* [ML] adding native-multi-node-integTests for data frames'

* addressing streaming issues

* formatting fixes

* Addressing PR comments
2019-04-26 07:18:49 -05:00
Benjamin Trent
08843ba62b
[ML] Adds progress reporting for transforms (#41278) (#41529)
* [ML] Adds progress reporting for transforms

* fixing after master merge

* Addressing PR comments

* removing unused imports

* Adjusting afterKey handling and percentage to be 100*

* Making sure it is a linked hashmap for serialization

* removing unused import

* addressing PR comments

* removing unused import

* simplifying code, only storing total docs and decrementing

* adjusting for rewrite

* removing initial progress gathering from executor
2019-04-25 11:23:12 -05:00
Hendrik Muhs
13d720f4a7 [ML-DataFrame] Replace Streamable w/ Writeable (#41477)
replace usages of Streamable with Writeable

Relates to #36176
2019-04-24 17:17:34 +02:00
Benjamin Trent
e2f8ffdde8
[ML][Data Frame] Moving destination creation to _start (#41416) (#41433)
* [ML][Data Frame] Moving destination creation to _start

* slight refactor of DataFrameAuditor constructor
2019-04-23 09:32:57 -05:00
David Kyle
1d2365f5b6 [ML-DataFrame] Refactorings and tidying (#41248)
Remove unnecessary generic params from SingleGroupSource
and unused code from the HLRC
2019-04-17 14:58:26 +01:00
David Kyle
711d2545aa [ML-DataFrame] Resolve random test failure using deterministic name (#41262) 2019-04-17 09:04:20 +01:00
Hendrik Muhs
02247cc7df [ML-DataFrame] adapt page size on circuit breaker responses (#41149)
handle circuit breaker response and adapt page size to reduce memory pressure, reduce preview buckets to 100, initial page size to 500
2019-04-16 19:49:43 +02:00
David Kyle
2b539f8347 [ML DataFrame] Data Frame stop all (#41156)
Wild card support for the data frame stop API
2019-04-15 15:04:28 +01:00
Benjamin Trent
9e32e36799
[ML] fixing test related to #40963 (#41074) (#41116) 2019-04-11 11:19:56 -05:00
Christoph Büscher
d00d3f4afa Mute DataFrameTransformCheckpointTests#testGetBehind 2019-04-10 15:55:09 +02:00
Hendrik Muhs
f9018ab11b [ML-DataFrame] create checkpoints on every new run (#40725)
Use the checkpoint service to create a checkpoint on every new run. Expose checkpoints stats on _stats endpoint.
2019-04-10 09:14:11 +02:00
Julie Tibshirani
0702c72151 Mute DataFrameGetAndGetStatsIT#testGetPersistedStatsWithoutTask.
Tracked in #40963.
2019-04-09 16:39:16 -07:00
Hendrik Muhs
d5fcbf2f4a refactor onStart and onFinish to take runnables and executed them guarded by state (#40855)
refactor onStart and onFinish to take action listeners and execute them when indexer is in indexing state.
2019-04-07 21:46:37 +02:00
Benjamin Trent
a8dbb07546
[ML] Changes default destination index field mapping and adds scripted_metric agg (#40750) (#40846)
* [ML] Allowing destination index mappings to have dynamic types, adds script_metric agg

* Making dynamic|source mapping explicit
2019-04-05 11:34:20 -05:00
Benjamin Trent
665f0d81aa
[ML] refactoring start task a bit, removing unused code (#40798) (#40845) 2019-04-05 09:01:01 -05:00
Hendrik Muhs
31e79a73d7 add HLRC protocol tests for transform state and stats (#40766)
adds HLRC protocol tests for state and stats hrlc clients
2019-04-03 12:51:15 +02:00
Benjamin Trent
945e7ca01e
[ML] Periodically persist data-frame running statistics to internal index (#40650) (#40729)
* [ML] Add mappings, serialization, and hooks to persist stats

* Adding tests for transforms without tasks having stats persisted

* intermittent commit

* Adjusting usage stats to account for stored stats docs

* Adding tests for id expander

* Addressing PR comments

* removing unused import

* adding shard failures to the task response
2019-04-02 14:16:55 -05:00
Benjamin Trent
4842d7fb7d
[ML] addressing test failure (#40701) (#40728)
* [ML] Fixing test

* adjusting line lengths

* marking valid seqno as final
2019-04-02 12:33:51 -05:00
Benjamin Trent
29180cefac
[ML] fix test check as randomness allows for different hours (#40536) (#40727)
* [ML] fix test check as randomness allows for different hours

* Re-enabling test
2019-04-02 12:33:35 -05:00
Benjamin Trent
655e3d8f75
[ML] fix test, should account for async nature of audit (#40637) (#40683) 2019-04-01 10:00:32 -05:00
Luca Cavanna
873c5638e6 Mute DataFrameAuditorIT#testAuditorWritesAudits
Relates to #40594
2019-03-28 16:53:54 +01:00
David Kyle
6ef657c5ad
[ML] Data Frame minor tidy ups (#40580)
Remove Xlint-rawtypes option and remove unused request builders.
Not all requests need to implement ToXContent.
2019-03-28 12:27:46 +00:00
David Kyle
13d4d73ce3 Mute DataFrameTaskFailedStateIT.testFailureStateInteraction (#40544) 2019-03-27 17:39:44 +00:00
Benjamin Trent
95a0c524a1
Muting test #40368 (#40542) 2019-03-27 12:11:10 -05:00
Benjamin Trent
be67752c34
Muting test related to #40537 (#40539) 2019-03-27 11:47:09 -05:00
David Kyle
61845dd38b [ML] Fix serialisation of Start Data Frame request (#40483) 2019-03-27 12:56:34 +00:00
Benjamin Trent
12943c5d2c
[ML] Add data frame task state object and field (#40169) (#40490)
* [ML] Add data frame task state object and field

* A new state item is added so that the overall task state can be
accoutned for
* A new FAILED state and reason have been added as well so that failures
can be shown to the user for optional correction

* Addressing PR comments

* adjusting after master merge

* addressing pr comment

* Adjusting auditor usage with failure state

* Refactor, renamed state items to task_state and indexer_state

* Adding todo and removing redundant auditor call

* Address HLRC changes and PR comment

* adjusting hlrc IT test
2019-03-27 06:53:58 -05:00
Hendrik Muhs
f4e56118c2 [ML] generate unique doc ids for data frame (#40382)
create and use unique, deterministic document ids based on the grouping values.

This is a pre-requisite for updating documents as well as preventing duplicates after a hard failure during indexing.
2019-03-27 08:27:05 +01:00
Benjamin Trent
7b4f964708
[ML] make source and dest objects in the transform config (#40337) (#40396)
* [ML] make source and dest objects in the transform config

* addressing PR comments

* Fixing compilation post merge

* adding comment for Arrays.hashCode

* addressing changes for moving dest to object

* fixing data_frame yml tests

* fixing API test
2019-03-25 07:16:41 -05:00
Hendrik Muhs
38afc9f27d refresh audit index before searching (#40401)
refresh the audit index before searching
2019-03-25 11:57:57 +01:00
Benjamin Trent
a30bf27b2f
[ML] add auditor to data frame plugin (#40012) (#40394)
* [Data Frame] add auditor

* Adjusting Level, Auditor, and message to address pr comments

* Addressing PR comments
2019-03-23 18:56:44 -05:00
Benjamin Trent
2dd879abac
[ML] adds support for non-numeric mapped types (#40220) (#40380)
* [ML] adds support for non-numeric mapped types and mapping overrides

* correcting hlrc compilation issues after merge

* removing mapping_override option

* clearing up unnecessary changes
2019-03-23 14:04:14 -05:00
Benjamin Trent
88f510ffc2
[ML] making test more determinate (#40374) (#40381)
* [ML] making test more determinate

* unmuting test
2019-03-23 12:15:37 -05:00
Benjamin Trent
05460cca58
Muting test testExtractIndexCheckpointsInconsistentGlobalCheckpoints (#40370) 2019-03-22 13:25:48 -05:00
Hendrik Muhs
5a0c32833e Add a checkpoint service for data frame transforms (#39836)
Add a checkpoint service for data frame transforms, which allows to ask for a checkpoint of the
source. In future these checkpoints will be stored in the internal index to

 - detect upstream changes
 - updating the data frame without a full re-run
 - allow data frame clients to checkpoint themselves
2019-03-22 10:25:30 +01:00
David Kyle
a4cb92a300
[ML] Data Frame HLRC Preview API (#40258) 2019-03-21 09:38:27 +00:00
Benjamin Trent
5ae43855fc
[ML] Refactor GET Transforms API (#40015) (#40269)
* [Data Frame] Refactor GET Transforms API:

* Add pagination
* comma delimited list expression support GET transforms
* Flag troublesome internal code for future refactor

* Removing `allow_no_transforms` param, ratcheting down pageparam option

* Changing  DataFrameFeatureSet#usage to not get all configs

* Intermediate commit

* Writing test for batch data gatherer

* Removing unused import

* removing bad println used for debugging

* Updating BatchedDataIterator comments and query

* addressing pr comments

* disallow null scrollId to cause stackoverflow
2019-03-20 19:14:50 -05:00