[role="xpack"]
[testenv="platinum"]
[[ml-put-job]]
=== Create Jobs API
++++
<titleabbrev>Create Jobs</titleabbrev>
++++
Instantiates a job.

==== Request

`PUT _ml/anomaly_detectors/<job_id>`

//===== Description

==== Path Parameters
`job_id` (required)::
(string) Identifier for the job. This identifier can contain lowercase
alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must
start and end with alphanumeric characters.

==== Request Body

`analysis_config`::
(object) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>.
`analysis_limits`::
(object) Specifies runtime limits for the job. See
<<ml-apilimits,analysis limits>>.
`background_persist_interval`::
(time units) Advanced configuration option. The time between each periodic
persistence of the model. See <<ml-job-resource>>.
`custom_settings`::
(object) Advanced configuration option. Contains custom metadata about the
job. See <<ml-job-resource>>.
`data_description` (required)::
(object) Describes the format of the input data. This object is required, but
it can be empty (`{}`). See <<ml-datadescription,data description objects>>.
`description`::
(string) A description of the job.
`groups`::
(array of strings) A list of job groups. See <<ml-job-resource>>.
`model_plot_config`::
(object) Advanced configuration option. Specifies that model information is
stored along with the results. Storing it adds performance overhead and is not
feasible for jobs with many entities. See <<ml-apimodelplotconfig>>.
`model_snapshot_retention_days`::
(long) The time in days that model snapshots are retained for the job.
Older snapshots are deleted. The default value is `1`, which means snapshots
are retained for one day (twenty-four hours).
`renormalization_window_days`::
(long) Advanced configuration option. The period over which adjustments to the
score are applied, as new data is seen. See <<ml-job-resource>>.
`results_index_name`::
(string) A text string that affects the name of the {ml} results index. The
default value is `shared`, which generates an index named `.ml-anomalies-shared`.
`results_retention_days`::
(long) Advanced configuration option. The number of days for which job results
are retained. See <<ml-job-resource>>.

==== Authorization

You must have `manage_ml` or `manage` cluster privileges to use this API.
For more information, see
{xpack-ref}/security-privileges.html[Security Privileges].

==== Examples

The following example creates the `total-requests` job:

[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/total-requests
{
  "description" : "Total sum of requests",
  "analysis_config" : {
    "bucket_span":"10m",
    "detectors": [
      {
        "detector_description": "Sum of total",
        "function": "sum",
        "field_name": "total"
      }
    ]
  },
  "data_description" : {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}
--------------------------------------------------
// CONSOLE
// TEST[skip:need-licence]

When the job is created, you receive the following results:

[source,js]
----
{
  "job_id": "total-requests",
  "job_type": "anomaly_detector",
  "job_version": "7.0.0-alpha1",
  "description": "Total sum of requests",
  "create_time": 1517011406091,
  "analysis_config": {
    "bucket_span": "10m",
    "detectors": [
      {
        "detector_description": "Sum of total",
        "function": "sum",
        "field_name": "total",
        "detector_index": 0
      }
    ],
    "influencers": []
  },
  "analysis_limits": {
    "model_memory_limit": "1024mb",
    "categorization_examples_limit": 4
  },
  "data_description": {
    "time_field": "timestamp",
    "time_format": "epoch_ms"
  },
  "model_snapshot_retention_days": 1,
  "results_index_name": "shared"
}
----
// TESTRESPONSE[s/"job_version": "7.0.0-alpha1"/"job_version": $body.job_version/]
// TESTRESPONSE[s/"create_time": 1517011406091/"create_time": $body.create_time/]
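
The optional request body parameters can be combined in a single request. The
following sketch is illustrative only: the job ID `total-requests-tuned`, the
group name `web-traffic`, and every limit and retention value are assumptions
chosen for the example, not recommended settings. It reuses the detector from
the `total-requests` example and overrides the default
`model_snapshot_retention_days` of `1`:

[source,js]
--------------------------------------------------
PUT _ml/anomaly_detectors/total-requests-tuned
{
  "description" : "Hypothetical example: total requests with explicit limits and retention",
  "groups" : [ "web-traffic" ],
  "analysis_config" : {
    "bucket_span": "10m",
    "detectors": [
      {
        "detector_description": "Sum of total",
        "function": "sum",
        "field_name": "total"
      }
    ]
  },
  "analysis_limits" : {
    "model_memory_limit": "512mb"
  },
  "data_description" : {
    "time_field": "timestamp",
    "time_format": "epoch_ms"
  },
  "model_snapshot_retention_days" : 7,
  "results_retention_days" : 30
}
--------------------------------------------------
// CONSOLE
// TEST[skip:need-licence]

Any parameter omitted from the request takes its default value, and the
response is expected to echo the explicit values in the same way that the
`total-requests` response above reports `model_memory_limit` and
`model_snapshot_retention_days`.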