Commit Graph

283 Commits

Author SHA1 Message Date
Andrei Dan 7dcdaeae49
Default to @timestamp in composable template datastream definition (#59317) (#59516)
This makes the data_stream timestamp field specification optional when
defining a composable template.
When there isn't one specified it will default to `@timestamp`.

(cherry picked from commit 5609353c5d164e15a636c22019c9c17fa98aac30)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-07-14 12:36:54 +01:00
David Kyle 054d5236d4 Mute RegressionIT failure (#59414)
For #59413
2020-07-13 14:12:19 +01:00
Dimitris Athanasiou d07b11b86b
[7.x][ML] Perform test inference on java (#58877) (#59298)
Since we are able to load the inference model
and perform inference in java, we no longer need
to rely on the analytics process to be performing
test inference on the docs that were not used for
training. The benefit is that we do not need to
send test docs and fit them in memory of the c++
process.

Backport of #58877

Co-authored-by: Dimitris Athanasiou <dimitris@elastic.co>

Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2020-07-09 16:30:49 +03:00
Martijn van Groningen 17bd559253
Fix the timestamp field of a data stream to @timestamp (#59210)
Backport of #59076 to 7.x branch.

The commit makes the following changes:
* The timestamp field of a data stream definition in a composable
  index template can only be set to '@timestamp'.
* Removed custom data stream timestamp field validation and reuse the validation from `TimestampFieldMapper` and
  instead only check that the _timestamp field mapping has been defined on a backing index of a data stream.
* Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template58956(...)` method
  to `MetadataIndexTemplateService#collectMappings(...)` method.
* Fixed a bug (#58956) that cases timestamp field validation to be performed
  for each template and instead of the final mappings that is created.
* only apply _timestamp meta field if index is created as part of a data stream or data stream rollover,
this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition.

Relates to #58642
Relates to #53100
Closes #58956
Closes #58583
2020-07-08 17:30:46 +02:00
Benjamin Trent e343e066fc
[7.x] [ML] prefer secondary auth headers on evaluate (#59167) (#59183)
* [ML] prefer secondary auth headers on evaluate (#59167)

We should prefer the secondary auth headers when evaluating a data frame
2020-07-07 15:34:47 -04:00
David Roberts e217f9a1e8
[ML] Wait for shards to initialize after creating ML internal indices (#59087)
There have been a few test failures that are likely caused by tests
performing actions that use ML indices immediately after the actions
that create those ML indices.  Currently this can result in attempts
to search the newly created index before its shards have initialized.

This change makes the method that creates the internal ML indices
that have been affected by this problem (state and stats) wait for
the shards to be initialized before returning.

Backport of #59027
2020-07-07 10:52:10 +01:00
Martijn van Groningen f0dd9b4ace
Add data stream timestamp validation via metadata field mapper (#59002)
Backport of #58582 to 7.x branch.

This commit adds a new metadata field mapper that validates,
that a document has exactly a single timestamp value in the data stream timestamp field and
that the timestamp field mapping only has `type`, `meta` or `format` attributes configured.
Other attributes can affect the guarantee that an index with this meta field mapper has a
useable timestamp field.

The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever
a new backing index of a data stream is created.

Relates to #53100
2020-07-06 11:32:33 +02:00
Przemysław Witek 751e84e4c8
Rename regression evaluation metrics to make the names consistent with loss functions (#58887) (#58927) 2020-07-02 17:35:55 +02:00
Przemysław Witek 8e074c4495
Rename "error" field to "value" for consistency between metrics (#58726) (#58870) 2020-07-02 09:08:56 +02:00
Benjamin Trent c64e283dbf
[7.x] [ML] handles compressed model stream from native process (#58009) (#58836)
* [ML] handles compressed model stream from native process (#58009)

This moves model storage from handling the fully parsed JSON string to handling two separate types of documents.

1. ModelSizeInfo which contains model size information 
2. TrainedModelDefinitionChunk which contains a particular chunk of the compressed model definition string.

`model_size_info` is assumed to be handled first. This will generate the model_id and store the initial trained model config object. Then each chunk is assumed to be in correct order for concatenating the chunks to get a compressed definition.


Native side change: https://github.com/elastic/ml-cpp/pull/1349
2020-07-01 15:14:31 -04:00
Przemysław Witek 909649dd15
[7.x] Implement pseudo Huber loss (PseudoHuber) evaluation metric for regression analysis (#58734) (#58825) 2020-07-01 14:52:06 +02:00
Julie Tibshirani ab65a57d70
Merge mappings for composable index templates (#58709)
This PR implements recursive mapping merging for composable index templates.

When creating an index, we perform the following:
* Add each component template mapping in order, merging each one in after the
last.
* Merge in the index template mappings (if present).
* Merge in the mappings on the index request itself (if present).

Some principles:
* All 'structural' changes are disallowed (but everything else is fine). An
object mapper can never be changed between `type: object` and `type: nested`. A
field mapper can never be changed to an object mapper, and vice versa.
* Generally, each section is merged recursively. This includes `object`
mappings, as well as root options like `dynamic_templates` and `meta`. Once we
reach 'leaf components' like field definitions, they always overwrite an
existing one instead of being merged.

Relates to #53101.
2020-06-30 08:01:37 -07:00
David Roberts d9e0e0bf95
[ML] Pass through the stop-on-warn setting for categorization jobs (#58738)
When per_partition_categorization.stop_on_warn is set for an analysis
config it is now passed through to the autodetect C++ process.

Also adds some end-to-end tests that exercise the functionality
added in elastic/ml-cpp#1356

Backport of #58632
2020-06-30 15:17:04 +01:00
Rene Groeschke d952b101e6
Replace compile configuration usage with api (7.x backport) (#58721)
* Replace compile configuration usage with api (#58451)

- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make test runtime classpath input for testing convention
  - required as java library will by default not have build jar file
  - jar file is now explicit input of the task and gradle will ensure its properly build

* Fix compile usages in 7.x branch
2020-06-30 15:57:41 +02:00
Przemysław Witek 9ea9b7bd3b
[7.x] Implement MSLE (MeanSquaredLogarithmicError) evaluation metric for regression analysis (#58684) (#58731) 2020-06-30 14:09:11 +02:00
Przemysław Witek 3f7c45472e
[7.x] Introduce DataFrameAnalyticsConfig update API (#58302) (#58648) 2020-06-29 10:56:11 +02:00
Benjamin Trent 7a202b149e
Muting analytics tests (#58617) (#58618) 2020-06-26 16:50:59 -04:00
Benjamin Trent add8ff1ad3
[ML] assume data streams are enabled in data stream tests (#58502) (#58508) 2020-06-24 14:14:48 -04:00
Przemysław Witek 551b8bcd73
[7.x] Use static methods (rather than constants) to obtain .ml-meta and .ml-config index names (#58484) (#58490) 2020-06-24 15:52:45 +02:00
Luca Cavanna dbbf2772d8 Mute newly added ml data streams tests (#58492)
Relates to #58491
2020-06-24 15:11:40 +02:00
Benjamin Trent a9b868b7a9
[7.x] [ML] allow data streams to be expanded for analytics and transforms (#58280) (#58455)
This commits allows data streams to be a valid source for analytics and transforms.

Data streams are fairly transparent and our `_search` and `_reindex` actions work without error.

For `_transforms` the check-pointing works as desired as well. Data streams are effectively treated as an `alias` and the backing index values are stored within checkpointing information.
2020-06-23 14:40:35 -04:00
David Roberts 0d6bfd0ac3
[7.x][ML] Fix wire serialization for flush acknowledgements (#58443)
There was a discrepancy in the implementation of flush
acknowledgements: most of the class was designed on the
basis that the "last finalized bucket time" could be null
but the wire serialization assumed that it was never
null.  This works because, the C++ sends zero "last
finalized bucket time" when it is not known or not
relevant.  But then the Java code will print that to
XContent as it is assuming null represents not known or
not relevant.

This change corrects the discrepancies.  Internally within
the class null represents not known or not relevant, but
this is translated from/to 0 for communications from the
C++ and old nodes that have the bug.

Additionally I switched from Date to Instant for this
class and made the member variables final to modernise it
a bit.

Backport of #58413
2020-06-23 16:42:06 +01:00
Benjamin Trent bf8641aa15
[7.x] [ML] calculate cache misses for inference and return in stats (#58252) (#58363)
When a local model is constructed, the cache hit miss count is incremented.

When a user calls _stats, we will include the sum cache hit miss count across ALL nodes. This statistic is important to in comparing against the inference_count. If the cache hit miss count is near the inference_count it indicates that the cache is overburdened, or inappropriately configured.
2020-06-19 09:46:51 -04:00
Przemysław Witek 9dd3d5aa48
[7.x] Delete auto-generated annotations when model snapshot is reverted (#58240) (#58335) 2020-06-18 17:59:52 +02:00
Przemysław Witek b22e91cefc
[7.x] Delete auto-generated annotations when job is deleted. (#58169) (#58219) 2020-06-17 09:17:20 +02:00
Rene Groeschke 01e9126588
Remove deprecated usage of testCompile configuration (#57921) (#58083)
* Remove usage of deprecated testCompile configuration
* Replace testCompile usage by testImplementation
* Make testImplementation non transitive by default (as we did for testCompile)
* Update CONTRIBUTING about using testImplementation for test dependencies
* Fail on testCompile configuration usage
2020-06-14 22:30:44 +02:00
Valeriy Khakhutskyy c0f368bbf3
[7.x][ML] Adjust assertion for job case memory usage estimates (#57929)
Since we change the memory estimates for data frame analytics jobs from worst case to a realistic case, the strict less-than assertion in the test does not hold anymore. I replaced it with a less-or-equal-than assertion.

Backport or #57882
2020-06-10 15:17:16 +02:00
Benjamin Trent 9666a895f7
[ML] inference performance optimizations and refactor (#57674) (#57753)
This is a major refactor of the underlying inference logic.

The main refactor is now we are separating the model configuration and
the inference interfaces.

This has the following benefits:
 - we can store extra things with the model that are not
   necessary for inference (i.e. treenode split information gain)
 - we can optimize inference separate from model serialization and storage.
 - The user is oblivious to the optimizations (other than seeing the benefits).

A major part of this commit is removing all inference related methods from the
trained model configurations (ensemble, tree, etc.) and moving them to a new class.

This new class satisfies a new interface that is ONLY for inference.

The optimizations applied currently are:
- feature maps are flattened once
- feature extraction only happens once at the highest level
  (improves inference + feature importance through put)
- Only storing what we need for inference + feature importance on heap
2020-06-05 14:20:58 -04:00
Przemysław Witek 6b5f49d097
[7.x] Introduce ModelPlotConfig. annotations_enabled setting (#57539) (#57641) 2020-06-04 15:15:35 +02:00
Benjamin Trent 34f1e0b6bb
[7.x] [ML] mark forecasts for force closed/failed jobs as failed (#57143) (#57374)
* [ML] mark forecasts for force closed/failed jobs as failed (#57143)

forecasts that are still running should be marked as failed/finished in the following scenarios:

- Job is force closed
- Job is re-assigned to another node.

Forecasts are not "resilient". Their execution does not continue after a node failure. Consequently, forecasts marked as STARTED or SCHEDULED should be flagged as failed. These forecasts can then be deleted.

Additionally, force closing a job kills the native task directly. This means that if a forecast was running, it is not allowed to complete and could still have the status of `STARTED` in the index.

relates to https://github.com/elastic/elasticsearch/issues/56419
2020-05-29 14:48:10 -04:00
Benjamin Trent 35d5126cea
[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351) (#57368)
* [ML] adds new for_export flag to GET _ml/inference API (#57351)

Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API.

This flag is useful for moving models between clusters.
2020-05-29 14:01:08 -04:00
Benjamin Trent c8374dc9f3
[ML] add max_model_memory parameter to forecast request (#57254) (#57355)
This adds a max_model_memory setting to forecast requests. 
This setting can take a string value that is formatted according to byte sizes (i.e. "50mb", "150mb").

The default value is `20mb`.

There is a HARD limit at `500mb` which will throw an error if used.

If the limit is larger than 40% the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited.

related native change: https://github.com/elastic/ml-cpp/pull/1238

closes: https://github.com/elastic/elasticsearch/issues/56420
2020-05-29 11:16:08 -04:00
Przemysław Witek ea2012778e
Mute failing test (#57112) (#57113) 2020-05-25 14:06:29 +02:00
Benjamin Trent 297f864884
[ML] relax throttling on expired data cleanup (#56711) (#56895)
Throttling nightly cleanup as much as we do has been over cautious.

Night cleanup should be more lenient in its throttling. We still
keep the same batch size, but now the requests per second scale
with the number of data nodes. If we have more than 5 data nodes,
we don't throttle at all.

Additionally, the API now has `requests_per_second` and `timeout` set.
So users calling the API directly can set the throttling.

This commit also adds a new setting `xpack.ml.nightly_maintenance_requests_per_second`.
This will allow users to adjust throttling of the nightly maintenance.
2020-05-18 08:46:42 -04:00
Dimitris Athanasiou 011e995165
[7.x][ML] Unmute ClssificationIT.testDependentVariableCardinalityTooHighButWithQueryMakesItWithinRange (#56268) (#56287)
Closes #56240
2020-05-06 18:20:46 +03:00
Julie Tibshirani 49de092b38 Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet. 2020-05-05 16:25:36 -07:00
Julie Tibshirani 63062ec7bd Mute ClassificationIT.testDependentVariableCardinalityTooHighButWithQueryMakesItWithinRange. 2020-05-05 13:48:35 -07:00
Benjamin Trent e1c5ca421e
[7.x] [ML] lay ground work for handling >1 result indices (#55892) (#56192)
* [ML] lay ground work for handling >1 result indices (#55892)

This commit removes all but one reference to `getInitialResultsIndexName`. 
This is to support more than one result index for a single job.
2020-05-05 15:54:08 -04:00
David Roberts 7aa0daaabd
[7.x][ML] More advanced model snapshot retention options (#56194)
This PR implements the following changes to make ML model snapshot
retention more flexible in advance of adding a UI for the feature in
an upcoming release.

- The default for `model_snapshot_retention_days` for new jobs is now
  10 instead of 1
- There is a new job setting, `daily_model_snapshot_retention_after_days`,
  that defaults to 1 for new jobs and `model_snapshot_retention_days`
  for pre-7.8 jobs
- For days that are older than `model_snapshot_retention_days`, all
  model snapshots are deleted as before
- For days that are in between `daily_model_snapshot_retention_after_days`
  and `model_snapshot_retention_days` all but the first model snapshot
  for that day are deleted
- The `retain` setting of model snapshots is still respected to allow
  selected model snapshots to be retained indefinitely

Backport of #56125
2020-05-05 14:31:58 +01:00
Dimitris Athanasiou 75dadb7a6d
[7.x][ML] Add loss_function to regression (#56118) (#56187)
Adds parameters `loss_function` and `loss_function_parameter`
to regression.

Backport of #56118
2020-05-05 14:59:51 +03:00
Dimitris Athanasiou 17b904def5
[7.x][ML] Decouple DFA progress testing from analyses phases (#55925) (#56024)
This refactors native integ tests to assert progress without
expecting explicit phases for analyses. We can test those with
yaml tests in a single place.

Backport of #55925
2020-04-30 17:05:47 +03:00
Przemysław Witek c89917c799
Register DFA jobs on putAnalytics rather than via a separate method (#55458) (#55708) 2020-04-24 10:59:32 +02:00
Dimitris Athanasiou b8379872a7
[7.x][ML] Logs error when DFA task is set to failed (#55545) (#55668)
Also unmutes the integ test that stops and restarts
an outlier detection job with the hope of learning more
of the failure in #55068.

Backport of #55545

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-04-24 11:06:07 +03:00
David Roberts da5aeb8be7
[ML] Return assigned node in start/open job/datafeed response (#55570)
Adds a "node" field to the response from the following endpoints:

1. Open anomaly detection job
2. Start datafeed
3. Start data frame analytics job

If the job or datafeed is assigned to a node immediately then
this field will return the ID of that node.

In the case where a job or datafeed is opened or started lazily
the node field will contain an empty string.  Clients that want
to test whether a job or datafeed was opened or started lazily
can therefore check for this.

Backport of #55473
2020-04-22 12:06:53 +01:00
Stuart Tettemer 93a2e9b0f9
Test: MockScoreScript can be cacheable. (#55499)
Backport: 0ed1eb5
2020-04-20 17:09:58 -06:00
Benjamin Trent cabff65aec
[ML] Fixing inference stats race condition (#55163) (#55486)
`updateAndGet` could actually call the internal method more than once on contention.
If I read the JavaDocs, it says:
```* @param updateFunction a side-effect-free function```
So, it could be getting multiple updates on contention, thus having a race condition where stats are double counted.

To fix, I am going to use a `ReadWriteLock`. The `LongAdder` objects allows fast thread safe writes in high contention environments. These can be protected by the `ReadWriteLock::readLock`.

When stats are persisted, I need to call reset on all these adders. This is NOT thread safe if additions are taking place concurrently. So, I am going to protect with `ReadWriteLock::writeLock`.

This should prevent race conditions while allowing high (ish) throughput in the highly contention paths in inference.

I did some simple throughput tests and this change is not significantly slower and is simpler to grok (IMO).

closes  https://github.com/elastic/elasticsearch/issues/54786
2020-04-20 16:21:18 -04:00
Benjamin Trent fa0373a19f
[7.x] [ML] Fix log spam and disable ILM/SLM history for native ML tests (#55475)
* [ML] fix native ML test log spam (#55459)

This adds a dependency to ingest common. This removes the log spam resulting from basic plugins being enabled that require the common ingest processors.

* removing unnecessary changes

* removing unused imports

* removing unnecessary java setting
2020-04-20 15:41:30 -04:00
William Brafford 7817948926 Disable monitoring in ML multinode tests (#55461)
Removing the deprecated "xpack.monitoring.enabled" setting introduced
log spam and potentially some failures in ML tests. It's possible to use
a different, non-deprecated setting to disable monitoring, so we do that
here.
2020-04-20 10:51:16 -04:00
Przemysław Witek 7d5f74e964
Fix and unmute testSetUpgradeMode_ExistingTaskGetsUnassigned (#55368) (#55452) 2020-04-20 13:29:29 +02:00
William Brafford 49e30b15a2
Deprecate disabling basic-license features (#54816) (#55405)
We believe there's no longer a need to be able to disable basic-license
features completely using the "xpack.*.enabled" settings. If users don't
want to use those features, they simply don't need to use them. Having
such features always available lets us build more complex features that
assume basic-license features are present.

This commit deprecates settings of the form "xpack.*.enabled" for
basic-license features, excluding "security", which is a special case.
It also removes deprecated settings from integration tests and unit
tests where they're not directly relevant; e.g. monitoring and ILM are
no longer disabled in many integration tests.
2020-04-17 15:04:17 -04:00