189 Commits

Author SHA1 Message Date
Zachary Tong 9cc33f4e29 [Rollup] Select best jobs then execute msearch-per-job (elastic/x-pack-elasticsearch#4152)
If there are multiple jobs that are all the "best" (e.g. share the
best interval) we have no way of knowing which is actually the best.
Unfortunately, we cannot just filter for all the jobs in a single
search because their doc_counts can potentially overlap.

To solve this, we execute an msearch-per-job so that the results
stay isolated.  When rewriting the response, we iteratively
unroll and reduce the independent msearch responses into a single
"working tree".  This allows us to intervene if there are
overlapping buckets and manually choose a doc_count.

Jobs are selected by recursively descending through the aggregation
tree and independently pruning the list of valid job caps in each branch.
When a leaf node is reached in the branch, the remaining jobs are
sorted by "best'ness" (see comparator in RollupJobIdentifierUtils for the
implementation) and added to a global set of "best jobs". Once
all branches have been evaluated, the final set is returned to the
calling code.

Briefly, the "best" job(s) are those that have
 - The largest compatible date interval
 - Fewer and larger interval histograms
 - Fewer terms groups

Note: the final set of "best" jobs is not guaranteed to be minimal;
there may be redundant effort due to independent branches choosing
jobs that are subsets of other branches.
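
The ordering above can be sketched as a sort key. This is only an illustration, not the comparator in RollupJobIdentifierUtils: the `JobCaps` class and its fields are hypothetical, the candidates are assumed to be already pruned to compatible jobs, and the way "fewer and larger" histograms are weighed against each other is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class JobCaps:
    """Hypothetical summary of one rollup job's capabilities."""
    job_id: str
    date_interval_ms: int                      # date_histogram interval
    histo_intervals: List[int] = field(default_factory=list)
    terms_fields: List[str] = field(default_factory=list)


def bestness_key(job: JobCaps):
    # Smaller tuples sort first, so "larger is better" values are negated.
    return (
        -job.date_interval_ms,       # largest compatible date interval
        len(job.histo_intervals),    # fewer histograms ...
        -sum(job.histo_intervals),   # ... with larger intervals (assumed weighting)
        len(job.terms_fields),       # fewer terms groups
    )


def best_jobs(candidates: List[JobCaps]) -> List[JobCaps]:
    """Return every candidate that ties for the best key (may be several)."""
    if not candidates:
        return []
    ranked = sorted(candidates, key=bestness_key)
    top = bestness_key(ranked[0])
    return [job for job in ranked if bestness_key(job) == top]
```

Because ties are kept rather than broken, several jobs can come back as "best", which is the situation the msearch-per-job execution is meant to handle.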

Related changes:
- We have to include the job's ID in the rollup doc's
hash, so that different jobs don't overwrite the same summary
document.
- Now that we iteratively reduce the agg tree, the agg framework
injects empty buckets while we're working.  In most cases this
is harmless, but for `avg` aggs the empty bucket is a SumAgg while
any unrolled versions are converted into AvgAggs... causing a cast
exception.  To get around this, avg's are renamed to
`{source_name}.value` to prevent a conflict.
- The job filtering has been pushed up into a query filter, since it
applies to the entire msearch rather than just individual agg components.
- We no longer add a filter agg clause about the date_histo's interval, because
that is handled by the job validation and pruning.
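
The first related change above (including the job's ID in the rollup doc's hash) can be illustrated with a small sketch. The hash function and input layout used by the rollup indexer are not stated here, so SHA-1, the length-prefixing, and the `rollup_doc_id` name are all assumptions made for illustration.

```python
import hashlib
from typing import Sequence


def rollup_doc_id(job_id: str, bucket_keys: Sequence[str]) -> str:
    """Derive a deterministic document ID from the job ID and bucket keys."""
    digest = hashlib.sha1()
    # Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
    for part in (job_id, *bucket_keys):
        encoded = part.encode("utf-8")
        digest.update(len(encoded).to_bytes(4, "big"))
        digest.update(encoded)
    return digest.hexdigest()
```

Two jobs that roll up the same bucket now produce different IDs, so neither overwrites the other's summary document, while re-running one job still maps to the same ID.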

Original commit: elastic/x-pack-elasticsearch@995be2a039
2018-03-27 10:33:59 -07:00
Dimitris Athanasiou 5f219bd70f [ML][DOCS] Remove empty rules from docs
Original commit: elastic/x-pack-elasticsearch@dee88e1161
2018-03-23 12:31:36 +00:00
Dimitris Athanasiou c10b2ea631 [ML] Ensure job is not assigned to node that cannot read model_snapshot (elastic/x-pack-elasticsearch#4091)
This adds a minimum compatible version to the model snapshot.
Nodes with a version earlier than that version cannot read
that model snapshot. Thus, such jobs are not assigned to
incompatible nodes.
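
A minimal sketch of that assignment filter, assuming plain `major.minor.patch` version strings; the function names and the dictionary of node versions are hypothetical and do not mirror the actual node selection code.

```python
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse "major.minor.patch" into a tuple that compares numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def eligible_nodes(node_versions: Dict[str, str],
                   snapshot_min_version: Optional[str]) -> List[str]:
    """Return the nodes that are able to read the job's model snapshot."""
    if snapshot_min_version is None:
        # Assumption: no recorded minimum version means no restriction.
        return sorted(node_versions)
    minimum = parse_version(snapshot_min_version)
    return sorted(
        node for node, version in node_versions.items()
        if parse_version(version) >= minimum
    )
```

Comparing parsed tuples rather than raw strings matters for versions such as `6.10.0`, which sorts before `6.3.0` as text.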

relates elastic/x-pack-elasticsearch#4077

Original commit: elastic/x-pack-elasticsearch@2ffa6adce0
2018-03-15 17:38:52 +00:00
Tim Brooks 7f7ac08447 Add api to start basic license (elastic/x-pack-elasticsearch#4083)
This is related to elastic/x-pack-elasticsearch#3877. This commit adds a route /start_basic that
will self-generate a basic license. The only validation that is
performed is to check that you do not already have a basic license
installed. Additionally, if you lose features from switching to a basic
license, you must acknowledge the changes.

Original commit: elastic/x-pack-elasticsearch@7b8eeb50b1
2018-03-12 14:39:58 -06:00
Dimitris Athanasiou 1ed31af2c6 [ML] Allow model_memory_limit to be reduced (elastic/x-pack-elasticsearch#3998)
Up to now a job update that reduces the model memory limit
was not allowed. However, there could definitely be cases
where reducing the limit is necessary and reasonable.

This commit makes it possible to decrease the limit as long
as it does not go below the current memory usage. We obtain
the latter from the model size stats.

The conditions under which updating the model_memory_limit
is not allowed are now:

 - when the job is open
 - new value < latest model_size_stats.model_bytes
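
The rule that the limit may be lowered only as long as it stays at or above the current memory usage can be sketched as follows. The function name is hypothetical, both sizes are taken in bytes for simplicity, and treating missing model size stats as "nothing to compare against" is an assumption.

```python
from typing import Optional


def limit_update_rejection(job_is_open: bool,
                           new_limit_bytes: int,
                           latest_model_bytes: Optional[int]) -> Optional[str]:
    """Return a reason if the update must be rejected, or None if it is allowed."""
    if job_is_open:
        return "model_memory_limit cannot be updated while the job is open"
    # Assumption: without model size stats there is no usage to check.
    if latest_model_bytes is not None and new_limit_bytes < latest_model_bytes:
        return "new model_memory_limit is below the current model memory usage"
    return None
```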

relates elastic/x-pack-elasticsearch#2461

Original commit: elastic/x-pack-elasticsearch@5b35923590
2018-03-08 06:14:18 -08:00
Lisa Cawley 095d6d466c [DOCS] Update types in datafeed resource (elastic/x-pack-elasticsearch#4011)
Original commit: elastic/x-pack-elasticsearch@6692b97c5e
2018-03-07 13:53:49 -08:00
Lisa Cawley 6e87d95f9b [DOCS] Added licensing APIs (elastic/x-pack-elasticsearch#4026)
Original commit: elastic/x-pack-elasticsearch@0e50cc0d64
2018-03-06 09:47:04 -08:00
Dimitris Athanasiou 79d46d1d17 [ML] Set explicit defaults to AnalysisLimits (elastic/x-pack-elasticsearch#4015)
Analysis limits contain settings that affect the resources
used by ML jobs. Those limits always apply. However,
explicitly setting them is not required as they have reasonable
defaults. For a long time those defaults lived on the C++ side.
The job could just not have any explicit limits and that meant
defaults would be used at the C++ side. This has the disadvantage
that it is not obvious to the users what these settings are set to.
Additionally, users might not be aware of the settings' existence.

On top of that, in 6.1 the default model_memory_limit was lowered
from 4GB to 1GB. For BWC, this meant that for jobs where model_memory_limit
is null, the default of 4GB applies. Jobs that were created from 6.1
onwards contain an explicit setting for model_memory_limit, which is
1GB unless the user sets it differently. This adds further confusion.

This commit makes analysis limits an always explicit setting on the job.
Regardless of whether the user sets custom limits or not, the job object
(and response) will contain the full analysis limits values.

The possibilities for interpretation of missing values are:

  - the entire analysis_limits is null: this may only happen for jobs
  created prior to 6.1. Thus we set the model_memory_limit to 4GB.
  - analysis_limits is non-null but model_memory_limit is null: this also
  may only happen for jobs prior to 6.1. Again, we set memory limit to
  4GB.
  - model_memory_limit is non-null: this either means the user set an
  explicit value or the job was created from 6.1 onwards and it has
  the explicit default of 1GB. We simply keep the given value.

For categorization_examples_limit the default has always been 4, so
we fill that in when it's missing.
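
The three cases above, plus the categorization_examples_limit default, collapse into one small resolution step. This is a sketch under assumptions: limits are modelled as a plain dictionary, model_memory_limit is expressed as an integer number of MB, and `resolve_analysis_limits` is a hypothetical name.

```python
from typing import Dict, Optional

PRE_6_1_MODEL_MEMORY_LIMIT_MB = 4096        # 4GB, the pre-6.1 default
DEFAULT_CATEGORIZATION_EXAMPLES_LIMIT = 4   # the default has always been 4


def resolve_analysis_limits(stored: Optional[Dict[str, Optional[int]]]) -> Dict[str, int]:
    """Return fully explicit limits for a job's stored (possibly partial) limits."""
    limits = dict(stored or {})
    # Covers both "analysis_limits is null" and "model_memory_limit is null":
    # either way the job predates 6.1, so the old 4GB default applies.
    if limits.get("model_memory_limit") is None:
        limits["model_memory_limit"] = PRE_6_1_MODEL_MEMORY_LIMIT_MB
    if limits.get("categorization_examples_limit") is None:
        limits["categorization_examples_limit"] = DEFAULT_CATEGORIZATION_EXAMPLES_LIMIT
    return limits
```

A non-null model_memory_limit, whether user-supplied or the explicit 1GB default written from 6.1 onwards, passes through unchanged.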

Finally, note that we still need to handle potential null values
for the situation of a mixed cluster.

Original commit: elastic/x-pack-elasticsearch@5b6994ef75
2018-02-27 17:49:05 +00:00
Zachary Tong e3543b06ba [Docs] Remove bad cross-book link
Temporary to keep the build green, will figure this out during
the next round of rollup docs work.

Original commit: elastic/x-pack-elasticsearch@7657938ffb
2018-02-23 23:23:51 +00:00
Zachary Tong bf1550a0b2 Rollups for Elasticsearch (elastic/x-pack-elasticsearch#4002)
This adds a new Rollup module to XPack, which allows users to configure periodic "rollup jobs" to pre-aggregate data.  That data is then available later for search through a special RollupSearch API, which mimics the DSL and functionality of regular search.

Rollups are used to drastically reduce the on-disk footprint of metric-based data (e.g. timestamped documents with numeric and keyword fields).  They can also be used to speed up aggregations over large datasets, since the rolled-up data will be considerably smaller, with fewer documents to search.

The PR adds seven new endpoints to interact with Rollups: create/get/delete job, start/stop job, a capabilities API similar to field-caps, and a Rollup-enabled search.

Original commit: elastic/x-pack-elasticsearch@dcde91aacf
2018-02-23 17:10:37 -05:00
David Kyle 26516c507e [ML][DOCS] Fix substitution in ML docs code snippet (elastic/x-pack-elasticsearch#4006)
* Revert "Mute failing ML datafeed docs code snippet"

* Fix substitution

Original commit: elastic/x-pack-elasticsearch@8442863480
2018-02-22 09:32:52 +00:00
Luca Cavanna 79bc6d9a53 Remove AcknowledgedRestListener in favour of RestToXContentListener (elastic/x-pack-elasticsearch#3985)
Adapt to AcknowledgedRestListener removal 

Original commit: elastic/x-pack-elasticsearch@74c08fcf02
2018-02-22 09:13:58 +01:00
Lisa Cawley 1eca36bda9 [DOCS] Clarified model snapshot retention (elastic/x-pack-elasticsearch#4000)
Original commit: elastic/x-pack-elasticsearch@f1bdf5454d
2018-02-21 08:58:17 -08:00
David Kyle 92a9fc8b48 Mute failing ML datafeed docs code snippet
Original commit: elastic/x-pack-elasticsearch@9cfea037bc
2018-02-21 10:03:20 +00:00
lcawley 0e2a39603e [DOCS] Fixed ml.machine_memory example testing
Original commit: elastic/x-pack-elasticsearch@d0fa44ab20
2018-02-20 18:36:56 -08:00
Lisa Cawley e9b4a2d063 [DOCS] Enabled code snippet testing for more ML APIs (elastic/x-pack-elasticsearch#3990)
Original commit: elastic/x-pack-elasticsearch@1b631adff6
2018-02-20 11:08:37 -08:00
Alexander Reelsen c9d77d20fd Watcher: Never return credentials after watch creation... (elastic/x-pack-elasticsearch#3581)
... yet support updates. This commit introduces a few changes to how
watches are put.

The GET Watch API will now never return credentials like basic auth
passwords, but a placeholder instead. If Watcher is configured to
encrypt sensitive settings, then the original encrypted value is
returned; otherwise a "::es_redacted::" placeholder is returned.

There have been several Put Watch API changes.

The API now internally uses the Update API and versioning. This has
several implications. First, if no version is supplied, we assume an
initial creation. This works as before; however, if a credential is
marked as redacted we will reject storing the watch, so users do not
accidentally store the wrong watch.

The watch xcontent parser now has an additional method to tell the
caller if redacted passwords have been found. Based on this information
an error can be thrown.

If the user now wants to store a watch that contains a password marked
as redacted, this password will not be part of the toXContent
representation of the watch and, in combination with the update request,
the existing password will be merged in. If the encrypted password is
supplied, this one will be stored.
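
The put behaviour described above can be summarised as one decision per credential. This is an illustrative sketch only: `password_to_store` and its parameters are hypothetical, and rejecting an update that has no existing password to merge is an assumption.

```python
from typing import Optional

REDACTED = "::es_redacted::"


def password_to_store(submitted: str,
                      existing: Optional[str],
                      version: Optional[int]) -> str:
    """Decide which password ends up in the stored watch."""
    if submitted != REDACTED:
        # A real (plain or encrypted) value was supplied: store that one.
        return submitted
    if version is None:
        # No version means initial creation; storing a placeholder would
        # silently save the wrong watch, so the request is refused.
        raise ValueError("redacted password supplied without a watch version")
    if existing is None:
        # Assumption: an update with nothing to merge in is also rejected.
        raise ValueError("no existing password to merge in")
    return existing
```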

The serialization for GetWatchResponse/PutWatchRequest has changed.
The version checks for this will be put into the 6.x branch.

The Watcher UI now needs to specify the version when it wants to store a
watch. This also prevents last-write-wins scenarios and is the reason
why the put/get watch response now contains the internal version.

relates elastic/x-pack-elasticsearch#3089

Original commit: elastic/x-pack-elasticsearch@bb63be9f79
2018-02-20 10:09:27 +01:00
Lisa Cawley 64653e525a [DOCS] Identify informational ML properties (elastic/x-pack-elasticsearch#3773)
Original commit: elastic/x-pack-elasticsearch@cb310b360d
2018-02-19 11:48:09 -08:00
Lisa Cawley 530b709948 [DOCS] Add skip_time to Flush Jobs API (elastic/x-pack-elasticsearch#1955)
Original commit: elastic/x-pack-elasticsearch@352bd336d8
2018-02-19 11:04:12 -08:00
Lisa Cawley 3890875a88 [DOCS] Role Mapping API improvements (elastic/x-pack-elasticsearch#3951)
Original commit: elastic/x-pack-elasticsearch@d300c96c7a
2018-02-16 09:29:19 -08:00
Lisa Cawley 4e0c1d1b60 [DOCS] Enabled more ML code snippet testing (elastic/x-pack-elasticsearch#3764)
Original commit: elastic/x-pack-elasticsearch@518dce3ddd
2018-02-09 09:16:24 -08:00
David Kyle 8e73085047 [ML] Enable adding multiple jobs to a calendar (elastic/x-pack-elasticsearch#3786)
Original commit: elastic/x-pack-elasticsearch@56a70a4580
2018-02-08 11:44:16 +00:00
Lisa Cawley b6c9e96304 [DOCS] Added job ID requirements (elastic/x-pack-elasticsearch#3727)
Original commit: elastic/x-pack-elasticsearch@71f0073708
2018-01-25 09:23:56 -08:00
Lisa Cawley e7c78e05f8 [DOCS] Added ML add and delete calendar event APIs (elastic/x-pack-elasticsearch#3394)
Original commit: elastic/x-pack-elasticsearch@3283af2215
2018-01-24 08:14:23 -08:00
Dimitris Athanasiou 21f692c02b [ML] Further validate calendar_id and add calendar description (elastic/x-pack-elasticsearch#3624)
relates elastic/x-pack-elasticsearch#3595

Original commit: elastic/x-pack-elasticsearch@fade977361
2018-01-19 10:44:39 +00:00
Lisa Cawley 9f6064f9ac [DOCS] Edited documentation for ML categorization_analyzer (elastic/x-pack-elasticsearch#3587)
Original commit: elastic/x-pack-elasticsearch@6dd179107a
2018-01-17 13:11:36 -08:00
Jay Modi 60d4b7e53e Add the ability to refresh tokens obtained via the API (elastic/x-pack-elasticsearch#3468)
This commit adds the ability to refresh tokens that have been obtained by the API using a refresh
token. Refresh tokens are one-time-use tokens that are valid for 24 hours. The tokens may be used
to get a new access and refresh token if the refresh token has not been invalidated or
already refreshed.
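
The refresh rules (single use, 24-hour validity, no refresh after invalidation) can be modelled with a toy in-memory store. Everything here is hypothetical, including the `TokenStore` class and how tokens are generated; it does not reflect how the token service persists or validates tokens.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional, Tuple

REFRESH_TOKEN_LIFETIME = timedelta(hours=24)


@dataclass
class RefreshRecord:
    issued_at: datetime
    refreshed: bool = False
    invalidated: bool = False


class TokenStore:
    def __init__(self) -> None:
        self._records: Dict[str, RefreshRecord] = {}

    def issue(self, now: datetime) -> Tuple[str, str]:
        """Create a new (access_token, refresh_token) pair."""
        access, refresh = secrets.token_urlsafe(16), secrets.token_urlsafe(16)
        self._records[refresh] = RefreshRecord(issued_at=now)
        return access, refresh

    def refresh(self, refresh_token: str, now: datetime) -> Optional[Tuple[str, str]]:
        """Exchange a refresh token for a new pair, or return None if refused."""
        record = self._records.get(refresh_token)
        if record is None or record.refreshed or record.invalidated:
            return None
        if now - record.issued_at > REFRESH_TOKEN_LIFETIME:
            return None
        record.refreshed = True   # one-time use: mark before issuing the new pair
        return self.issue(now)
```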

relates elastic/x-pack-elasticsearch#2595

Original commit: elastic/x-pack-elasticsearch@23435eb815
2018-01-17 12:18:44 -07:00
Lisa Cawley a4fad02d9a [DOCS] Added SSL certificates API (elastic/x-pack-elasticsearch#3136)
Original commit: elastic/x-pack-elasticsearch@62cb574fcf
2018-01-17 08:14:02 -08:00
lcawley c243d6eb21 [DOCS] Fixed short descriptions for ML APIs
Original commit: elastic/x-pack-elasticsearch@75937c0da1
2018-01-15 08:44:08 -08:00
lcawley 86f612ae3a [DOCS] Fixed link to Analyze API
Original commit: elastic/x-pack-elasticsearch@e203d839c2
2018-01-15 08:29:08 -08:00
David Roberts e9dafbd78d [DOCS] Add documentation for ML categorization_analyzer (elastic/x-pack-elasticsearch#3554)
This is the documentation for the changes made in elastic/x-pack-elasticsearch#3372.

Relates elastic/machine-learning-cpp#491

Original commit: elastic/x-pack-elasticsearch@7d67e9d894
2018-01-15 15:47:19 +00:00
lcawley 648a4d9cd1 [DOCS] Fixed forecasting links
Original commit: elastic/x-pack-elasticsearch@42c326c3ce
2017-12-21 08:39:49 -08:00
Lisa Cawley b35f1909cc [DOCS] Add forecasting overview (elastic/x-pack-elasticsearch#3263)
* [DOCS] Restructure ML overview

* [DOCS] Added forecasting limitations

* [DOCS] Merged changes to ML overview

* [DOCS] Added forecasting screenshot

* [DOCS] Removed incorrect results info from forecast API

* [DOCS] Addressed feedback about forecasts

* [DOCS] Clarified default forecast duration

Original commit: elastic/x-pack-elasticsearch@1403f2cd2e
2017-12-21 08:14:52 -08:00
lcawley 1cc73a0307 [DOCS] Fixed calendar API titles
Original commit: elastic/x-pack-elasticsearch@77fcbe7b37
2017-12-20 16:37:19 -08:00
lcawley 121bbd5689 [DOCS] Fixed typo in calendar API
Original commit: elastic/x-pack-elasticsearch@8b3989cd45
2017-12-20 15:22:38 -08:00
Lisa Cawley 15e738b184 [DOCS] Added ML calendar APIs (elastic/x-pack-elasticsearch#3328)
* [DOCS] Added ML calendar APIs

* [DOCS] Updated calendar job API titles

* [DOCS] Added more calendar APIs

Original commit: elastic/x-pack-elasticsearch@0910da09eb
2017-12-20 13:52:58 -08:00
Dimitrios Athanasiou f977632a17 [DOCS] Change `detector_rules` to `rules` in ML docs
Original commit: elastic/x-pack-elasticsearch@49699286d3
2017-12-20 14:38:00 +00:00
David Roberts b81c90d6fc [DOCS] Explain ML datafeed run-as integration/limitations (elastic/x-pack-elasticsearch#3311)
Docs for elastic/x-pack-elasticsearch#3254

Original commit: elastic/x-pack-elasticsearch@eec3c7ccce
2017-12-15 14:41:10 +00:00
lcawley 4f1866db69 [DOCS] Updated titles of ML APIs
Original commit: elastic/x-pack-elasticsearch@3b3d856a89
2017-12-14 10:52:49 -08:00
Dimitris Athanasiou a9535c0b5a [ML][DOCS] Correct get-overall-buckets API example (elastic/x-pack-elasticsearch#3269)
Also fixes the score filters explanation for the results APIs.

Original commit: elastic/x-pack-elasticsearch@18cb31ab56
2017-12-08 16:03:51 +00:00
Alexander Reelsen c641a30bc5 Docs: Explain watcher security integration/limitations (elastic/x-pack-elasticsearch#3106)
Original commit: elastic/x-pack-elasticsearch@991e1de267
2017-11-29 14:48:06 +01:00
Lisa Cawley b6f322e72e [DOCS] Enable code snippet testing in open job API (elastic/x-pack-elasticsearch#3053)
* [DOCS] Enable code snippet testing in open job API

* [DOCS] Fixed open job API example

Original commit: elastic/x-pack-elasticsearch@f789041c2a
2017-11-28 08:26:58 -08:00
Dimitrios Athanasiou 84694fa4b4 [ML][DOCS] Fix doc error for forecast API
Original commit: elastic/x-pack-elasticsearch@999045d510
2017-11-28 14:15:40 +00:00
Dimitris Athanasiou e396c61afc [ML] Remove forecast end param (elastic/x-pack-elasticsearch#3121)
The forecast API provides a `duration` parameter
which is the most convenient way of specifying
the span of the forecast. End time is now unnecessary
and possibly confusing.

Relates elastic/machine-learning-cpp#443

Original commit: elastic/x-pack-elasticsearch@04eb0408e7
2017-11-28 10:49:15 +00:00
Lisa Cawley a7456cd87d [DOCS] Enabled code snippet testing for start datafeed API (elastic/x-pack-elasticsearch#3055)
* [DOCS] Enabled code snippet testing for start datafeed API

* [DOCS] Added datafeed creation to build.gradle

Original commit: elastic/x-pack-elasticsearch@1acb452cf0
2017-11-27 10:57:37 -08:00
Lisa Cawley b5d42c40e4 [DOCS] Enabled code snippet testing in stop datafeed API (elastic/x-pack-elasticsearch#3127)
Original commit: elastic/x-pack-elasticsearch@282eb587d5
2017-11-27 10:15:46 -08:00
Russ Cam e4e8870b13 Add opening state to Job states (elastic/x-pack-elasticsearch#2317)
Also updated open state to opened.

Original commit: elastic/x-pack-elasticsearch@663d95db1a
2017-11-24 11:35:51 +00:00
lcawley 4d24748170 [DOCS] Added requirement to forecast API
Original commit: elastic/x-pack-elasticsearch@3f1360ca2b
2017-11-23 15:54:04 -08:00
Lisa Cawley a9b3cd747f [DOCS] Added ML forecast API (elastic/x-pack-elasticsearch#2745)
* [DOCS] Added ML forecast API

* [DOCS] Added forecast API to build.gradle

* [DOCS] Added forecast API example

* [DOCS] Fixed forecast API intro

* [DOCS] Addressed feedback on forecast API

* [DOCS] Added duration to forecast API

* [DOCS] Removed end time from forecast API

* [DOCS] Fixed gradle errors for forecast API

Original commit: elastic/x-pack-elasticsearch@db79e3d5bb
2017-11-23 11:52:37 -08:00
lcawley 32d0c1b0c7 [DOCS] Re-enabled code snippet testing
Original commit: elastic/x-pack-elasticsearch@31fd4c3668
2017-11-17 13:40:46 -08:00