Commit Graph

651 Commits

Author SHA1 Message Date
Przemysław Witek 7cd997df84
[ML] Make ml internal indices hidden (#52423) (#52509) 2020-02-19 14:02:32 +01:00
David Roberts 9c49868bc5 [TEST] Use busy asserts in ML distributed failure test (#52461)
When changing a job state using a mechanism that doesn't
wait for the desired state to be reached within the production
code the test code needs to loop until the cluster state has
been updated.

Closes #52451
2020-02-18 11:17:37 +00:00
David Roberts 48ccf36db9 [ML] Increase assertBusy timeout in ML node failure tests (#52425)
Following the change to store cluster state in Lucene indices
(#50907) it can take longer for all the cluster state updates
associated with node failure scenarios to be processed during
internal cluster tests where several nodes all run in the same
JVM.
2020-02-17 17:04:18 +00:00
Dimitris Athanasiou ad56802ac6
[7.x][ML] Refactor ML mappings and templates into JSON resources (#51… (#52353)
ML mappings and index templates have so far been created
programmatically. While this had its merits due to static typing,
there is consensus it would be clear to maintain those in json files.
In addition, we are going to adding ILM policies to these indices
and the component for a plugin to register ILM policies is
`IndexTemplateRegistry`. It expects the templates to be in resource
json files.

For the above reasons this commit refactors ML mappings and index
templates into json resource files that are registered via
`MlIndexTemplateRegistry`.

Backport of #51765
2020-02-14 17:16:06 +02:00
Przemysław Witek 0da3af7581
[7.x] [ML] Add _cat/ml/data_frame/analytics API (#52260) (#52312) 2020-02-13 16:55:47 +01:00
Jay Modi 5bcc6fce5c
Remove DeprecationLogger from route objects (#52285)
This commit removes the need for DeprecatedRoute and ReplacedRoute to
have an instance of a DeprecationLogger. Instead the RestController now
has a DeprecationLogger that will be used for all deprecated and
replaced route messages.

Relates #51950
Backport of #52278
2020-02-12 15:05:41 -07:00
Benjamin Trent 2a968f4f2b
[ML] job results provider refactoring (#52012) (#52238)
During a bug hunt, I caught a handful of things (unrelated to the bug) that could be potential issues:

1. Needlessly wrapping in exception handling (minor cleanup)
2. Potential of notifying listeners of a failure multiple times + even trying to notify of a success after a failure notification
2020-02-11 17:54:44 -05:00
David Roberts d1d9c40e71 [ML] Switch poor categorization audit warning to use status field (#52195)
In #51146 a rudimentary check for poor categorization was added to
7.6.

This change replaces that warning based on a Java-side check with
a new one based on the categorization_status field that the ML C++
sets.  categorization_status was added in 7.7 and above by #51879,
so this new warning based on more advanced conditions will also be
in 7.7 and above.

Closes #50749
2020-02-11 15:33:27 +00:00
David Roberts 473468d763 [ML] Better error when persistent task assignment disabled (#52014)
Changes the misleading error message when attempting to open
a job while the "cluster.persistent_tasks.allocation.enable"
setting is set to "none" to a clearer message that names the
setting.

Closes #51956
2020-02-11 15:23:21 +00:00
Dimitris Athanasiou 6086fadf00
[7.x][ML] Prepare to hold additional stats in DF Analytics task (#52134) (#52187)
Refactors `DataFrameAnalyticsTask` to hold a `StatsHolder` object.
That just has a `ProgressTracker` for now but this is paving the
way to add additional stats like memory usage, analysis stats, etc.

Backport #52134
2020-02-11 11:18:45 +02:00
Dimitris Athanasiou cbebc26f50
[7.x][ML] Retry persisting DF Analytics results (#52048) (#52160)
Employs `ResultsPersisterService` from `DataFrameRowsJoiner` in order
to add retries when a data frame analytics job is persisting the results
to the destination data frame.

Backport of #52048
2020-02-11 09:55:00 +02:00
Przemysław Witek c7cc383d33
[7.x] Update persistent state document in the index the document belongs to (#51751) (#52145) 2020-02-10 16:32:34 +01:00
David Roberts 1cefafdd14 [ML] Add new categorization stats to model_size_stats (#52009)
This change adds support for the following new model_size_stats
fields:

- categorized_doc_count
- total_category_count
- frequent_category_count
- rare_category_count
- dead_category_count
- categorization_status

Backport of #51879
2020-02-10 09:10:50 +00:00
Jay Modi 3edadfefd0 RestHandlers declare handled routes (#52123)
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.

This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.

Closes #51622

Co-authored-by: Jason Tedor <jason@tedor.me>

Backport of #51950
2020-02-09 22:48:32 -07:00
Benjamin Trent dffcd021df
[7.x] [ML] Add bwc serialization unit test scaffold (#51889) (#52061)
* [ML] Add bwc serialization unit test scaffold (#51889)

Adds new `AbstractBWCSerializationTestCase` which provides easy scaffolding for BWC serialization unit tests.

These are no replacement for true BWC tests (which execute actual old code). These tests do provide some good coverage for the current code when serializing to/from old versions.

* removing unnecessary override for 7.series branch

* adding necessary import

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-07 17:17:11 -05:00
Benjamin Trent 846f87a26e
[ML] allow close/stop for jobs/datafeeds with missing configs (#51888) (#51997)
If the configs are removed (by some horrific means), we should still allow tasks to be cleaned up easily.

Datafeeds and jobs with missing configs are now visible in their respective _stats calls and can be stopped/closed.
2020-02-06 12:10:18 -05:00
Benjamin Trent 79f143907a
[7.x] [ML] add _cat/ml/trained_models API (#51529) (#51936)
* [ML] add _cat/ml/trained_models API (#51529)

This adds _cat/ml/trained_models.
2020-02-05 08:26:44 -05:00
Julie Tibshirani 38ce428831
Create a class to hold field capabilities for one index. (#51844)
Currently, the same class `FieldCapabilities` is used both to represent the
capabilities for one index, and also the merged capabilities across indices. To
help clarify the logic, this PR proposes to create a separate class
`IndexFieldCapabilities` for the capabilities in one index. The refactor will
also help when adding `source_path` information in #49264, since the merged
source path field will have a different structure from the field for a single index.

Individual changes:
* Add a new class IndexFieldCapabilities.
* Remove extra constructor from FieldCapabilities.
* Combine the add and merge methods in FieldCapabilities.Builder.
2020-02-04 11:24:57 -08:00
David Roberts 9d55c45b5a [ML] Improve multiline_start_pattern for CSV in find_file_structure (#51737)
The work to switch file upload over to treating delimited files
like semi-structured text and using the ingest pipeline for CSV
parsing makes the multi-line start pattern used for delimited
files much more critical than it used to be.

Previously it was always based on the time field, even if that
was towards the end of the columns, and no multi-line pattern
was created if no timestamp was detected.

This change improves the multi-line start pattern by:

1. Never creating a multi-line pattern if the sample contained
   only single line records.  This improves the import
   efficiency in a common case.
2. Choosing the leftmost field that has a well-defined pattern,
   whether that be the time field or a boolean/numeric field.
   This reduces the risk of a field with newlines occurring
   earlier, and also means the algorithm doesn't automatically
   fail for data without a timestamp.
2020-02-04 12:37:48 +00:00
Benjamin Trent d293980a09
[7.x] [ML] add GET _cat/ml/datafeeds (#51500) (#51829)
* [ML] add GET _cat/ml/datafeeds (#51500)

This adds GET _cat/ml/datafeeds && _cat/ml/datafeeds/{datafeed_id}

* fixing for java8 compilation
2020-02-03 17:16:33 -05:00
David Roberts d5d8fb26fa [TEST] Remove obsolete test trace logging from NetworkDisruptionIT (#51746)
The issue this logging was added to fix (#49908) was closed in
December and the problem has not recurred so this logging is no
longer needed.
2020-02-03 11:25:53 +00:00
Dimitris Athanasiou 55b5c8f703
[7.x][ML] Remove index.unassigned.node_left.delayed_timeout setting from M… (#51740) (#51764)
This setting was introduced with the purpose of reducing the time took by
tests that shut nodes down. Tests like `MlDistributedFailureIT` and
`NetworkDisruptionIT`. However, it is unfortunate to have to set the value
to an explicit value in production. In addition, and most important, the dynamically
choosing the value for this setting makes it impossible to adopt static index template configs
that we register via `IndexTemplateRegistry`, which we need to use in order to start
registering ILM policies for the ML indices.

This commit removes this setting from our templates. I run the tests a few times and could
not see execution time differing significantly.

Backport of #51740
2020-01-31 20:28:29 +02:00
Benjamin Trent e372854d43
[ML][Inference] Fix model pagination with models as resources (#51573) (#51736)
This adds logic to handle paging problems when the ID pattern + tags reference models stored as resources. 

Most of the complexity comes from the issue where a model stored as a resource could be at the start, or the end of a page or when we are on the last page.
2020-01-31 07:52:19 -05:00
Gordon Brown 10c8179351
Use exclusions list instead of fake system indices (#51586)
This commit switches the strategy for managing dot-prefixed indices that
should be hidden indices from using "fake" system indices to an explicit
exclusions list that must be updated when those indices are converted to
hidden indices.
2020-01-30 16:31:27 -07:00
Benjamin Trent 1380dd439a
[7.x] [ML][Inference] Fix weighted mode definition (#51648) (#51695)
* [ML][Inference] Fix weighted mode definition (#51648)

Weighted mode inaccurately assumed that the "max value" of the input values would be the maximum class value. This does not make sense. 

Weighted Mode should know how many classes there are. Hence the new parameter `num_classes`. This indicates what the maximum class value to be expected.
2020-01-30 15:33:25 -05:00
Henning Andersen 149b68d850 [ML] Fix possible race condition starting datafeed (#51646)
Datafeeds being closed while starting could result in and NPE. This was
handled as any other failure, masking out the NPE. However, this
conflicts with the changes in #50886.

Related to #50886 and #51302
2020-01-30 08:23:45 +01:00
Przemysław Witek 683170b007
Increase the number of indexed documents to increase a chance that there are at least 2 training rows. (#51607) (#51615) 2020-01-29 17:17:19 +01:00
Gordon Brown 89c2834b24
Deprecate creation of dot-prefixed index names except for hidden and system indices (#49959)
This commit deprecates the creation of dot-prefixed index names (e.g.
.watches) unless they are either 1) a hidden index, or 2) registered by
a plugin that extends SystemIndexPlugin. This is the first step
towards more thorough protections for system indices.

This commit also modifies several plugins which use dot-prefixed indices
to register indices they own as system indices, and adds a plugin to
register .tasks as a system index.
2020-01-28 10:01:16 -07:00
David Roberts 550254ec7f [ML] Use CSV ingest processor in find_file_structure ingest pipeline (#51492)
Changes the find_file_structure response to include a CSV
ingest processor in the ingest pipeline it suggests.

Previously the Kibana file upload functionality parsed CSV
in the browser, but by parsing CSV in the ingest pipeline
it makes the Kibana file upload functionality more easily
interchangable with Filebeat such that the configurations
it creates can more easily be used to import data with the
same structure repeatedly in production.
2020-01-28 14:38:43 +00:00
David Roberts 3c223ceea1 [ML] Fix 2 digit year regex in find_file_structure (#51469)
The DATE and DATESTAMP Grok patterns match 2 digit years
as well as 4 digit years.  The pattern determination in
find_file_structure worked correctly in this case, but
the regex used to create a multi-line start pattern was
assuming a 4 digit year.  Also, the quick rule-out
patterns did not always correctly consider 2 digit years,
meaning that detection was inconsistent.

This change fixes both problems, and also extends the
tests for DATE and DATESTAMP to check both 2 and 4 digit
years.
2020-01-27 17:23:18 +00:00
Przemysław Witek dd3e2f1e18
[7.x] Update quantiles document in the index the document belongs to (#51135) (#51415) 2020-01-27 10:13:02 +01:00
Benjamin Trent bf53ca3380
[7.x] [ML] Add _cat/ml/anomaly_detectors API (#51364) (#51408)
[ML] Add _cat/ml/anomaly_detectors API (#51364)
2020-01-24 11:54:22 -05:00
Benjamin Trent fc994d9ce1
[ML][Inference] Adds validations for model PUT (#51376) (#51409)
Adds validations making sure that

* `input.field_names` is not empty
* `ensemble.trained_models` is not empty
* `tree.feature_names` is not empty

closes https://github.com/elastic/elasticsearch/issues/51354
2020-01-24 09:29:12 -05:00
Benjamin Trent 76660a5a4f
[7.x] [ML][Inference] add tags url param to GET (#51330) (#51404)
* [ML][Inference] add tags url param to GET (#51330)

Adds a new URL parameter, `tags` to the GET _ml/inference/<model_id> endpoint.

This parameter allows the list of models to be further reduced to those who contain all the provided tags.
2020-01-24 08:26:58 -05:00
Dimitris Athanasiou 3443d69883
[7.x][ML] Rename DataFrameAnalyticsIndex to DestinationIndex (#51353) (#51356)
As we prepare to introduce a new index for storing additional
information about data frame analytics jobs (e.g. intrumentation),
renaming this class to `DestinationIndex` better captures what it does
and leaves its prior name available for a more suitable use.

Backport of #51353

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-01-24 09:51:48 +02:00
David Kyle 0ac03ac5e7
[ML] Add parsers for inference configuration classes (#51300) 2020-01-22 17:03:01 +00:00
David Kyle ca4b90a001
[ML] Calculate results and snapshot retention using latest bucket timestamps (#51061) (#51301)
The retention period is calculated relative to the last bucket result or snapshot
time rather than wall clock
2020-01-22 14:52:33 +00:00
Dimitris Athanasiou 59687a9384
[7.x][ML] Validate classification dependent_variable cardinality is at lea… (#51232) (#51309)
Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.

Backport of #51232
2020-01-22 16:51:16 +02:00
Benjamin Trent 2a73e849d6
[ML][Inference] fixing ingest IT tests (#51267) (#51311)
Converts InferenceIngestIT into a `ESRestTestCase`.

closes #51201
2020-01-22 09:50:17 -05:00
David Roberts 932c63297f [ML] Fix possible race condition when starting datafeed (#51302)
The ID of the datafeed's associated job was being obtained
frequently by looking up the datafeed task in a map that
was being modified in other threads.  This could lead to
NPEs if the datafeed stopped running at an unexpected time.

This change reduces the number of places where a datafeed's
associated job ID is looked up to avoid the possibility of
failures when the datafeed's task is removed from the map
of running tasks during multi-step operations in other
threads.

Fixes #51285
2020-01-22 11:40:39 +00:00
Przemysław Witek bfcfcdee33
[7.x] Do not copy mapping from dependent variable to prediction field in regression analysis (#51227) (#51288) 2020-01-22 12:36:24 +01:00
Benjamin Trent a9b2bc525e
[ML] address two edge cases for categorization.GrokPatternCreator#findBestGrokMatchFromExamples (#51168) (#51255)
There are two edge cases that can be ran into when example input is matched in a weird way.

1. Recursion depth could continue many many times, resulting in a HUGE runtime cost. I put a limit of 10 recursions (could be adjusted I suppose). 
2. If there are no "fixed regex bits", exploring the grok space would result in a fence-post error during runtime (with assertions turned off)
2020-01-21 10:29:29 -05:00
David Roberts 0fa7db9a95 [ML] Make datafeeds work with nanosecond time fields (#51180)
Allows ML datafeeds to work with time fields that have
the "date_nanos" type _and make use of the extra precision_.
(Previously datafeeds only worked with time fields that were
exact multiples of milliseconds.  So datafeeds would work
with "date_nanos" only if the extra precision over "date" was
not used.)

Relates #49889
2020-01-21 09:59:50 +00:00
Jay Modi 107989df3e
Introduce hidden indices (#51164)
This change introduces a new feature for indices so that they can be
hidden from wildcard expansion. The feature is referred to as hidden
indices. An index can be marked hidden through the use of an index
setting, `index.hidden`, at creation time. One primary use case for
this feature is to have a construct that fits indices that are created
by the stack that contain data used for display to the user and/or
intended for querying by the user. The desire to keep them hidden is
to avoid confusing users when searching all of the data they have
indexed and getting results returned from indices created by the
system.

Hidden indices have the following properties:
* API calls for all indices (empty indices array, _all, or *) will not
  return hidden indices by default.
* Wildcard expansion will not return hidden indices by default unless
  the wildcard pattern begins with a `.`. This behavior is similar to
  shell expansion of wildcards.
* REST API calls can enable the expansion of wildcards to hidden
  indices with the `expand_wildcards` parameter. To expand wildcards
  to hidden indices, use the value `hidden` in conjunction with `open`
  and/or `closed`.
* Creation of a hidden index will ignore global index templates. A
  global index template is one with a match-all pattern.
* Index templates can make an index hidden, with the exception of a
  global index template.
* Accessing a hidden index directly requires no additional parameters.

Backport of #50452
2020-01-17 10:09:01 -07:00
David Roberts 295665b1ea [ML] Add audit warning for 1000 categories found early in job (#51146)
If 1000 different category definitions are created for a job in
the first 100 buckets it processes then an audit warning will now
be created.  (This will cause a yellow warning triangle in the
ML UI's jobs list.)

Such a large number of categories suggests that the field that
categorization is working on is not well suited to the ML
categorization functionality.
2020-01-17 16:28:45 +00:00
Przemysław Witek da73c9104e
[ML] Fix tests randomly failing on CI (#51142) (#51150) 2020-01-17 14:58:58 +01:00
Dimitris Athanasiou b70ebdeb96
[7.x][ML] DF Analytics _explain API should skip object fields (#51115) (#51147)
Object fields cannot be used as features. At the moment _explain
API includes them and even worse it allows it does not error when
an object field is excluded. This creates the expectation to the
user that all children fields will also be excluded while it's not
the case.

This commit omits object fields from the _explain API and also
adds an error if an object field is included or excluded.

Backport of #51115
2020-01-17 14:02:59 +02:00
Przemysław Witek b1a526d5e9
[7.x] [ML] Update DFA progress document in the index the document belongs to (#51111) (#51117) 2020-01-17 08:12:54 +01:00
Tom Veasey 32ec934b15
[7.x][ML] Assert top classes are ordered by score (#51028)
Backport #51003.
2020-01-16 12:23:15 +00:00
David Roberts 1536c3e622 [TEST] Increase ML distributed test job open timeout (#50998)
There have been occasional failures, presumably due to
too many tests running in parallel, caused by jobs taking
around 15 seconds to open.  (You can see the job open
successfully during the cleanup phase shortly after the
failure of the test in these cases.)  This change increases
the wait time from 10 seconds to 20 seconds to reduce the
risk of this happening.
2020-01-15 08:58:55 +00:00