Commit Graph

27 Commits

Author SHA1 Message Date
Dimitris Athanasiou c6a5a6d924
[ML] Put ML filter API response should contain the filter (#31362) 2018-06-15 21:15:35 +01:00
Tanguy Leroux 992c7889ee
Uncouple persistent task state and status (#31031)
This pull request removes the relationship between the state 
of persistent task (as stored in the cluster state) and the status 
of the task (as reported by the Task APIs and used in various 
places) that have been confusing for some time (#29608).

In order to do that, a new PersistentTaskState interface is added. 
This interface represents the persisted state of a persistent task. 
The methods used to update the state of persistent tasks are 
renamed: updatePersistentStatus() becomes updatePersistentTaskState() 
and now takes a PersistentTaskState as a parameter. The 
Task.Status type as been changed to PersistentTaskState in all 
places were it make sense (in persistent task customs in cluster 
state and all other methods that deal with the state of an allocated 
persistent task).
2018-06-15 09:26:47 +02:00
Dimitris Athanasiou 9b293275af
[ML] Add description to ML filters (#31330)
This adds a `description` to ML filters in order
to allow users to describe their filters in a human
readable form which is also editable (filter updates
to be added shortly).
2018-06-14 16:52:32 +01:00
Tom Veasey 66f7dd2c4d
[ML] Update test thresholds to account for changes to memory control (#31289)
To avoid temporary failures, this also disables these tests until elastic/ml-cpp#122 is committed.
2018-06-13 13:12:53 +01:00
Dimitris Athanasiou 5c77ebe89d
[ML] Implement new rules design (#31110)
Rules allow users to supply a detector with domain
knowledge that can improve the quality of the results.
The model detects statistically anomalous results but it
has no knowledge of the meaning of the values being modelled.

For example, a detector that performs a population analysis
over IP addresses could benefit from a list of IP addresses
that the user knows to be safe. Then anomalous results for
those IP addresses will not be created and will not affect
the quantiles either.

Another example would be a detector looking for anomalies
in the median value of CPU utilization. A user might want
to inform the detector that any results where the actual
value is less than 5 is not interesting.

This commit introduces a `custom_rules` field to the `Detector`.
A detector may have multiple rules which are combined with `or`.

A rule has 3 fields: `actions`, `scope` and `conditions`.

Actions is a list of what should happen when the rule applies.
The current options include `skip_result` and `skip_model_update`.
The default value for `actions` is the `skip_result` action.

Scope is optional and allows for applying filters on any of the
partition/over/by field. When not defined the rule applies to
all series. The `filter_id` needs to be specified to match the id
of the filter to be used. Optionally, the `filter_type` can be specified
as either `include` (default) or `exclude`. When set to `include`
the rule applies to entities that are in the filter. When set to
`exclude` the rule only applies to entities not in the filter.

There may be zero or more conditions. A condition requires `applies_to`,
`operator` and `value` to be specified. The `applies_to` value can be
either `actual`, `typical` or `diff_from_typical` and it specifies
the numerical value to which the condition applies. The `operator`
(`lt`, `lte`, `gt`, `gte`) and `value` complete the definition.
Conditions are combined with `and` and allow to specify numerical
conditions for when a rule applies.

A rule must either have a scope or one or more conditions. Finally,
a rule with scope and conditions applies when all of them apply.
2018-06-13 11:20:38 +01:00
Jason Tedor 0bfd18cc8b
Revert upgrade to Netty 4.1.25.Final (#31282)
This reverts upgrading to Netty 4.1.25.Final until we have a cleaner
solution to dealing with the object cleaner thread.
2018-06-12 19:26:18 -04:00
Dimitris Athanasiou 5f84e18c72
[ML][TEST] Mute tests using rules (#31204)
This is in preparation of pushing the new
rules design in the `ml-cpp` side. These
tests will be switched on again after merging
in the new rules implementation.
2018-06-12 11:36:26 +01:00
Jason Tedor 563141c6c9
Upgrade to Netty 4.1.25.Final (#31232)
This commit upgrades us to Netty 4.1.25. This upgrade is more
challenging than past upgrades, all because of a new object cleaner
thread that they have added. This thread requires an additional security
permission (set context class loader, needed to avoid leaks in certain
scenarios). Additionally, there is not a clean way to shutdown this
thread which means that the thread can fail thread leak control during
tests. As such, we have to filter this thread from thread leak control.
2018-06-11 16:55:07 -04:00
Tanguy Leroux bf58660482
Remove all unused imports and fix CRLF (#31207)
The X-Pack opening and the recent other refactorings left a lot of 
unused imports in the codebase. This commit removes them all.
2018-06-11 15:12:12 +02:00
Hendrik Muhs 253b998681
flush job to ensure all results have been written (#31187)
flush ml job to ensure all results have been written

fixes #31173
2018-06-08 07:51:45 +02:00
Hendrik Muhs 5e48ba7cbd
run overflow forecast a 2nd time as regression test for elastic/ml-cpp#110 (#30969)
Improve test to run overflow forecast a 2nd time as regression test for elastic/ml-cpp#110
2018-06-05 08:52:06 +02:00
Dimitris Athanasiou 9141108334
[ML][TEST] Fix bucket count assertion in all tests in ModelPlotsIT (#31026)
This fixes the last remaining test that was missed in #30717.

Closes #30715
2018-06-01 10:51:12 +01:00
David Roberts aafcd85f50
Move persistent task registrations to core (#30755)
Persistent tasks was moved from X-Pack to core in #28455.
However, registration of the named writables and named
X-content was left in X-Pack.

This change moves the registration of the named writables
and named X-content into core.  Additionally, the persistent
task actions are no longer registered in the X-Pack client
plugin, as they are already registered in ActionModule.
2018-05-24 09:17:17 +01:00
David Roberts 2b72adc8ac
[TEST] Reduce forecast overflow to disk test memory limit (#30727)
By default ML native processes are only allowed to use
30% of RAM, so the previous 2GB setting prevented the
test passing on VMs with only 4GB RAM.  This change
reduces the limit to 1200MB, which means it can now
pass on VMs with 4GB RAM.
2018-05-18 19:01:43 +01:00
Dimitris Athanasiou 6bb2a1da22
[ML][TEST] Fix bucket count assertion in ModelPlotsIT (#30717)
As the first record is random, there's a chance it will
be aligned on a bucket start. Thus we need to check the
bucket count is in [23, 24].

Closes #30715
2018-05-18 17:59:01 +03:00
Dimitris Athanasiou 1484a31be5
[ML][TEST] Make AutodetectMemoryLimitIT less fragile (#30716)
These tests aim to check the set model memory limit is
respected. Additionally, it was asserting counts of
partition, by, over fields in an attempt to check that
the used memory is spent meaningfully. However, this
made the tests fragile, as changes in the ml-cpp could
lead to CI failures.

This commit removes those assertions. We are working on
adding tests in ml-cpp that will compensate.
2018-05-18 17:57:20 +03:00
Hendrik Muhs 6c313a9871 This implementation lazily (on 1st forecast request) checks for available
diskspace and creates a subfolder for storing data outside of Lucene
indexes, but as part of the ES data paths.

Details:
 - tmp storage is managed and does not allow allocation if disk space is
   below a threshold (5GB at the moment)
 - tmp storage is supposed to be managed by the native component but in
   case this fails cleanup is provided:
    - on job close
    - on process crash
    - after node crash, on restart
 - available space is re-checked for every forecast call (the native
   component has to check again before writing)

Note: The 1st path that has enough space is chosen on job open (job
close/reopen triggers a new search)
2018-05-18 14:04:09 +02:00
Dimitris Athanasiou 75665a2d3e
[ML] Clean left behind model state docs (#30659)
It is possible for state documents to be
left behind in the state index. This may be
because of bugs or uncontrollable scenarios.
In any case, those documents may take up quite
some disk space when they add up. This commit
adds a step in the expired data deletion that
is part of the daily maintenance service. The
new step searches for state documents that
do not belong to any of the current jobs and
deletes them.

Closes #30551
2018-05-17 17:51:26 +03:00
David Roberts ef0daee850
[TEST] Account for increase in ML C++ memory usage (#30675)
Recent changes to the ML C++ have resulted in higher memory usage,
so fewer "by" fields can be analyzed in a given amount of model
memory.
2018-05-17 12:59:20 +01:00
David Roberts 6a8aa99e3f [TEST] Mute ML test that needs updating to following ml-cpp changes
Relates #30399
2018-05-14 12:49:37 +01:00
Dimitris Athanasiou 2751790aea [ML][TEST] Clean up jobs in ModelPlotIT
Closes #30377
2018-05-04 10:46:19 +01:00
Dimitris Athanasiou a1e23feba2
[ML] Add integration test for model plots (#30359)
Relates #30004
2018-05-03 17:02:45 +01:00
Dimitris Athanasiou 3b260dcfc1
[ML] Account for gaps in data counts after job is reopened (#30294)
This commit fixes an issue with the data diagnostics were
empty buckets are not reported even though they should. Once
a job is reopened, the diagnostics do not get initialized from
the current data counts (especially the latest record timestamp).
The result is that if the data that is sent have a time gap compared
to the previous ones, that gap is not accounted for in the empty bucket
count.

This commit fixes that by initializing the diagnostics with the current
data counts.

Closes #30080
2018-05-03 15:08:24 +01:00
David Kyle cfc66a1fd5 [ML] Wait for updates to established memory usage
Tests need to wait for changes to the job's established memory usage to
propagate and an over enthusiastic optimisation meant jobs were updated
from stale state causing recent change to be lost.
2018-04-24 13:46:58 -04:00
Jason Tedor c7c0e330b8 Rename users
This commit renames users to elasticsearch-users.
2018-04-20 15:34:01 -07:00
Ryan Ernst fab5e21e7d Build: Split distributions into oss and default
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
2018-04-20 15:33:57 -07:00
Ryan Ernst 2efd22454a Migrate x-pack-elasticsearch source to elasticsearch 2018-04-20 15:29:54 -07:00