Up to now a job update that reduces the model memory limit
was not allowed. However, there could definitely be cases
where reducing the limit is necessary and reasonable.
This commit makes it possible to decrease the limit as long
as it does not go below the current memory usage. We obtain
the latter from the model size stats.
The conditions under which updating the model_memory_limit
is not allowed are now:
- when the job is open
- latest model_size_stats.model_bytes < new value
relates elastic/x-pack-elasticsearch#2461
Original commit: elastic/x-pack-elasticsearch@5b35923590
Analysis limits contain settings that affect the resources
used by ML jobs. Those limits always take place. However,
explictly setting them is not required as they have reasonable
defaults. For a long time those defaults lived on the c++ side.
The job could just not have any explicit limits and that meant
defaults would be used at the c++ side. This has the disadvantage
that it is not obvious to the users what these settings are set to.
Additionally, users might not be aware of the settings existence.
On top of that, since 6.1, the default model_memory_limit was lowered
from 4GB to 1GB. For BWC, this meant that jobs where model_memory_limit
is null, the default of 4GB applies. Jobs that were created from 6.1
onwards, contain an explicit setting for model_memory_limit, which is
1GB unless the user sets it differently. This adds additional confusion.
This commit makes analysis limits an always explicit setting on the job.
Regardless of whether the user sets custom limits or not, the job object
(and response) will contain the full analysis limits values.
The possibilities for interpretation of missing values are:
- the entire analysis_limits is null: this may only happen for jobs
created prior to 6.1. Thus we set the model_memory_limit to 4GB.
- analysis_limits are non-null but model_memory_limit is: this also
may only happen for jobs prior to 6.1. Again, we set memory limit to
4GB.
- model_memory_limit is non-null: this either means the user set an
explicit value or the job was created from 6.1 onwards and it has
the explicit default of 1GB. We simply keep the given value.
For categorization_examples_limit the default has always been 4, so
we fill that in when it's missing.
Finally, note that we still need to handle potential null values
for the situation of a mixed cluster.
Original commit: elastic/x-pack-elasticsearch@5b6994ef75
* [DOCS] Enabled code snippet testing for start datafeed API
* [DOCS] Added datafeed creation to build.gradle
Original commit: elastic/x-pack-elasticsearch@1acb452cf0
* [DOCS] Added ML forecast API
* [DOCS] Added forecast API to build.gradle
* [DOCS] Added forecast API example
* [DOCS] Fixed forecast API intro
* [DOCS] Addressed feedback on forecast API
* [DOCS] Added duration to forecast API
* [DOCS] Removed end time from forecast API
* [DOCS] Fixed gradle errors for forecast API
Original commit: elastic/x-pack-elasticsearch@db79e3d5bb
* [DOCS] Enabled code snippet testing for put datafeed API
* [DOCS] Addressed gradle errors in put datafeed API
* [DOCS] Added job creation test to build.gradle
Original commit: elastic/x-pack-elasticsearch@3548d920c7
For the purpose of getting this API consumed by our UI, returning
overall buckets that match the job's largest `bucket_span` can
result in too much data. The UI only ever displays a few buckets
in the swimlane. Their span depends on the time range selected and
the screen resolution, but it will only ever be a relatively
low number.
This PR adds the ability to aggregate overall buckets in a user
specified `bucket_span`. That `bucket_span` may be equal or
greater to the largest job's `bucket_span`. The `overall_score`
of the result overall buckets is the max score of the
corresponding overall buckets with a span equal to the job's
largest `bucket_span`.
The implementation is now chunking the bucket requests
as otherwise the aggregation would fail when too many buckets
are matching.
Original commit: elastic/x-pack-elasticsearch@981f7a40e5
Adds the GET overall_buckets API.
The REST end point is: GET
/_xpack/ml/anomaly_detectors/job_id/results/overall_buckets
The API returns overall bucket results. An overall bucket
is a summarized bucket result over multiple jobs.
It has the `bucket_span` of the longest job's `bucket_span`.
It also has an `overall_score` that is the `top_n` average of the
max anomaly scores per job.
relates elastic/x-pack-elasticsearch#2693
Original commit: elastic/x-pack-elasticsearch@ba6061482d