[role="xpack"]
[testenv="platinum"]
[[ml-put-job]]
=== Create Jobs API
++++
<titleabbrev>Create Jobs</titleabbrev>
++++

Instantiates a job.

==== Request

`PUT _ml/anomaly_detectors/<job_id>`

//===== Description

==== Path Parameters

`job_id` (required)::
  (string) Identifier for the job. This identifier can contain lowercase
  alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must
  start and end with alphanumeric characters.

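The naming rule for `job_id` can be checked on the client before a request is sent. The following is a minimal sketch, assuming the rule above is the complete constraint; the `JOB_ID_PATTERN` and `is_valid_job_id` names are hypothetical and are not part of any Elasticsearch client. The server may enforce additional limits (for example on length) that this sketch does not cover.

```python
import re

# Assumed pattern derived from the rule above: lowercase a-z, 0-9, hyphens and
# underscores, starting and ending with an alphanumeric character.
# This is an illustration, not the server-side validation logic.
JOB_ID_PATTERN = re.compile(r"[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_job_id(job_id: str) -> bool:
    """Return True if job_id satisfies the documented character rules."""
    return JOB_ID_PATTERN.fullmatch(job_id) is not None


print(is_valid_job_id("total-requests"))  # accepted by the rule above
print(is_valid_job_id("_total"))          # rejected: starts with an underscore
```

A check like this only catches naming mistakes early; the API response remains the authoritative result.
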
==== Request Body

`analysis_config`::
  (object) The analysis configuration, which specifies how to analyze the data.
  See <<ml-analysisconfig, analysis configuration objects>>.

`analysis_limits`::
  (object) Specifies runtime limits for the job. See
  <<ml-apilimits,analysis limits>>.

`background_persist_interval`::
  (time units) Advanced configuration option. The time between each periodic
  persistence of the model. See <<ml-job-resource>>.

`custom_settings`::
  (object) Advanced configuration option. Contains custom metadata about the
  job. See <<ml-job-resource>>.

`data_description` (required)::
  (object) Describes the format of the input data. This object is required, but
  it can be empty (`{}`). See <<ml-datadescription,data description objects>>.

`description`::
  (string) A description of the job.

`groups`::
  (array of strings) A list of job groups. See <<ml-job-resource>>.

`model_plot_config`::
  (object) Advanced configuration option. Specifies that model information is
  stored along with the results. Storing this information adds overhead to the
  performance of the system and is not feasible for jobs with many entities.
  See <<ml-apimodelplotconfig>>.

`model_snapshot_retention_days`::
  (long) The time in days that model snapshots are retained for the job.
  Older snapshots are deleted. The default value is `1`, which means snapshots
  are retained for one day (twenty-four hours).

`renormalization_window_days`::
  (long) Advanced configuration option. The period over which adjustments to the
  score are applied as new data is seen. See <<ml-job-resource>>.

`results_index_name`::
  (string) A text string that affects the name of the {ml} results index. The
  default value is `shared`, which generates an index named `.ml-anomalies-shared`.

`results_retention_days`::
  (long) Advanced configuration option. The number of days for which job results
  are retained. See <<ml-job-resource>>.

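The request body fields listed above can be assembled programmatically. The following sketch is one possible way to do so; `build_job_body` and `OPTIONAL_FIELDS` are hypothetical names introduced for illustration. It treats `analysis_config` as mandatory, which is an assumption of this sketch, and it supplies an empty `data_description` when none is given, because that object is required but can be empty.

```python
import json

# Optional request body fields, copied from the list above.
OPTIONAL_FIELDS = {
    "analysis_limits", "background_persist_interval", "custom_settings",
    "description", "groups", "model_plot_config",
    "model_snapshot_retention_days", "renormalization_window_days",
    "results_index_name", "results_retention_days",
}


def build_job_body(analysis_config, data_description=None, **optional):
    """Assemble a request body (hypothetical helper, not part of any client)."""
    unknown = set(optional) - OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown job fields: {sorted(unknown)}")
    body = {
        "analysis_config": analysis_config,
        # data_description is required but may be an empty object.
        "data_description": {} if data_description is None else data_description,
    }
    body.update(optional)
    return body


body = build_job_body(
    {"bucket_span": "10m", "detectors": [{"function": "sum", "field_name": "total"}]},
    description="Total sum of requests",
)
print(json.dumps(body, indent=2))
```

Rejecting unknown keys on the client is a design choice of this sketch; it surfaces typos in field names before the request reaches the cluster.
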
==== Authorization

You must have `manage_ml` or `manage` cluster privileges to use this API.
For more information, see
{xpack-ref}/security-privileges.html[Security Privileges].

==== Examples

The following example creates the `total-requests` job:

[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/total-requests
{
  "description" : "Total sum of requests",
  "analysis_config" : {
    "bucket_span":"10m",
    "detectors": [
      {
        "detector_description": "Sum of total",
        "function": "sum",
        "field_name": "total"
      }
    ]
  },
  "data_description" : {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:need-licence]

When the job is created, you receive the following results:
[source,js]
----
{
  "job_id": "total-requests",
  "job_type": "anomaly_detector",
  "job_version": "7.0.0-alpha1",
  "description": "Total sum of requests",
  "create_time": 1517011406091,
  "analysis_config": {
    "bucket_span": "10m",
    "detectors": [
      {
        "detector_description": "Sum of total",
        "function": "sum",
        "field_name": "total",
        "detector_index": 0
      }
    ],
    "influencers": []
  },
  "analysis_limits": {
    "model_memory_limit": "1024mb",
    "categorization_examples_limit": 4
  },
  "data_description": {
    "time_field": "timestamp",
    "time_format": "epoch_ms"
  },
  "model_snapshot_retention_days": 1,
  "results_index_name": "shared"
}
----
// TESTRESPONSE[s/"job_version": "7.0.0-alpha1"/"job_version": $body.job_version/]
// TESTRESPONSE[s/"create_time": 1517011406091/"create_time": $body.create_time/]
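
Outside the console, the same request can be issued from any HTTP client. The following sketch uses only the Python standard library to build the `PUT` request for the `total-requests` job. The `BASE_URL` value and the `build_put_job_request` helper are assumptions made for illustration: the sketch assumes an unsecured node at `localhost:9200`, and a cluster with security enabled also needs credentials for a user that has the privileges described in the Authorization section.

```python
import json
import urllib.request

# Assumption: an unsecured node on localhost:9200; adjust for your cluster.
BASE_URL = "http://localhost:9200"


def build_put_job_request(job_id, body):
    """Build (but do not send) the PUT request for the create jobs API."""
    return urllib.request.Request(
        url=f"{BASE_URL}/_ml/anomaly_detectors/{job_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Same body as the console example above.
job_body = {
    "description": "Total sum of requests",
    "analysis_config": {
        "bucket_span": "10m",
        "detectors": [
            {
                "detector_description": "Sum of total",
                "function": "sum",
                "field_name": "total",
            }
        ],
    },
    "data_description": {"time_field": "timestamp", "time_format": "epoch_ms"},
}

request = build_put_job_request("total-requests", job_body)
print(request.get_method(), request.full_url)

# To send the request against a running cluster, uncomment:
# with urllib.request.urlopen(request) as response:
#     job = json.load(response)
#     print(job["analysis_limits"]["model_memory_limit"])
```

Building the request separately from sending it keeps the sketch runnable without a cluster; the commented lines show where the response, in the format shown above, would be read.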