[role="xpack"]
[testenv="platinum"]
[[ml-post-data]]
=== Post data to jobs API
++++
<titleabbrev>Post data to jobs</titleabbrev>
++++

Sends data to an anomaly detection job for analysis.

[[ml-post-data-request]]
==== {api-request-title}

`POST _ml/anomaly_detectors/<job_id>/_data`

[[ml-post-data-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have `manage_ml` or
`manage` cluster privileges to use this API. See
{stack-ov}/security-privileges.html[Security privileges].

[[ml-post-data-desc]]
==== {api-description-title}

The job must have a state of `open` to receive and process the data.

The data that you send to the job must use the JSON format. You can send
multiple JSON documents, either adjacent with no separator between them or
separated by whitespace. Newline-delimited JSON (NDJSON) is one such
whitespace-separated format; when you use it, set the `Content-Type` header to
`application/x-ndjson`.
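
As an illustration of the NDJSON option, the following minimal Python sketch serializes a list of records as one JSON document per line. The helper name `to_ndjson` and the sample field names are hypothetical and are not part of the API.

```python
import json


def to_ndjson(records):
    """Serialize an iterable of dicts as newline-delimited JSON."""
    # json.dumps escapes newlines inside string values, so each document
    # occupies exactly one line of the payload.
    return "".join(json.dumps(r, separators=(",", ":")) + "\n" for r in records)


# Hypothetical sample records; the field names are for illustration only.
records = [
    {"time": 1454020569000, "kpi_indicator": "online_purchases", "events_per_min": 4},
    {"time": 1454020629000, "kpi_indicator": "online_purchases", "events_per_min": 7},
]

payload = to_ndjson(records)

# Header to send with an NDJSON body, as described above.
headers = {"Content-Type": "application/x-ndjson"}
```

The resulting `payload` string can be used as the request body with any HTTP client.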

Upload sizes are limited to the Elasticsearch HTTP receive buffer size
(default 100 MB). If your data is larger, split it into multiple chunks
and upload each one separately in sequential time order. When running in
real time, it is generally recommended that you perform many small uploads
rather than queueing data to upload larger files.
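
One way to split a large upload is sketched below in Python. It assumes the input is a list of NDJSON lines that are already sorted by time and that each line ends with a newline. The function name `chunk_ndjson` and the `max_bytes` parameter are hypothetical; choose a limit below the receive buffer size of your cluster.

```python
def chunk_ndjson(lines, max_bytes):
    """Group NDJSON lines into byte chunks of at most max_bytes, keeping order."""
    chunks, current, size = [], [], 0
    for line in lines:
        encoded = line.encode("utf-8")
        if len(encoded) > max_bytes:
            raise ValueError("a single document exceeds max_bytes")
        # Start a new chunk when adding this document would exceed the limit.
        if current and size + len(encoded) > max_bytes:
            chunks.append(b"".join(current))
            current, size = [], 0
        current.append(encoded)
        size += len(encoded)
    if current:
        chunks.append(b"".join(current))
    return chunks
```

Because the input order is preserved, posting the chunks one after another keeps the records in sequential time order.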

When uploading data, check the <<ml-datacounts,job data counts>> for progress.
The following records will not be processed:

* Records not in chronological order and outside the latency window
* Records with an invalid timestamp

//TBD link to Working with Out of Order timeseries concept doc
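
The two rejection rules can be approximated on the client side before uploading. The following Python sketch is an assumption-laden illustration, not the server's implementation: it treats a record as out of order when its timestamp is older than the latest timestamp seen so far minus the latency, and it treats any non-numeric timestamp as invalid. The function name, the `time` field, and the millisecond unit are hypothetical.

```python
def partition_records(records, latency_ms, time_field="time"):
    """Split records into (accepted, out_of_order, invalid) lists."""
    accepted, out_of_order, invalid = [], [], []
    latest = None
    for rec in records:
        ts = rec.get(time_field)
        # Assumption: only numeric epoch timestamps count as valid here.
        if isinstance(ts, bool) or not isinstance(ts, (int, float)):
            invalid.append(rec)
            continue
        # Assumption: a record is outside the latency window when it is
        # older than the latest accepted timestamp minus the latency.
        if latest is not None and ts < latest - latency_ms:
            out_of_order.append(rec)
            continue
        accepted.append(rec)
        latest = ts if latest is None else max(latest, ts)
    return accepted, out_of_order, invalid
```

A pre-check like this only estimates what the job will skip; the data counts that the job reports remain the authoritative source.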

IMPORTANT: For each job, data can only be accepted from a single connection at
a time. It is not currently possible to post data to multiple jobs using wildcards
or a comma-separated list.

[[ml-post-data-path-parms]]
==== {api-path-parms-title}

`<job_id>`::
(Required, string) Identifier for the job.

[[ml-post-data-query-parms]]
==== {api-query-parms-title}

`reset_start`::
(Optional, string) Specifies the start of the bucket resetting range.

`reset_end`::
(Optional, string) Specifies the end of the bucket resetting range.
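
To show how the path and query parameters fit together, the following Python sketch builds the request URL. The helper name `post_data_url` and the base URL are hypothetical, and the values passed for `reset_start` and `reset_end` are placeholders, because this page does not state which time formats are accepted.

```python
from urllib.parse import quote, urlencode


def post_data_url(base_url, job_id, reset_start=None, reset_end=None):
    """Build the post data URL, adding the optional reset range if given."""
    params = {}
    if reset_start is not None:
        params["reset_start"] = reset_start
    if reset_end is not None:
        params["reset_end"] = reset_end
    # Percent-encode the job ID so it is safe to use as a path segment.
    path = f"/_ml/anomaly_detectors/{quote(job_id, safe='')}/_data"
    url = base_url.rstrip("/") + path
    return f"{url}?{urlencode(params)}" if params else url


# Placeholder values for illustration only.
url = post_data_url(
    "http://localhost:9200",
    "it_ops_new_kpi",
    reset_start="1454020569000",
    reset_end="1454024169000",
)
```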

[[ml-post-data-request-body]]
==== {api-request-body-title}

A sequence of one or more JSON documents containing the data to be analyzed.
Only whitespace characters are permitted in between the documents.

[[ml-post-data-example]]
==== {api-examples-title}

The following example posts data from the `it_ops_new_kpi.json` file to the
`it_ops_new_kpi` job:

[source,js]
--------------------------------------------------
$ curl -s -H "Content-type: application/json" \
  -X POST http:\/\/localhost:9200/_ml/anomaly_detectors/it_ops_new_kpi/_data \
  --data-binary @it_ops_new_kpi.json
--------------------------------------------------

When the data is sent, you receive information about the operational progress of
the job. For example:

[source,js]
----
{
  "job_id":"it_ops_new_kpi",
  "processed_record_count":21435,
  "processed_field_count":64305,
  "input_bytes":2589063,
  "input_field_count":85740,
  "invalid_date_count":0,
  "missing_field_count":0,
  "out_of_order_timestamp_count":0,
  "empty_bucket_count":16,
  "sparse_bucket_count":0,
  "bucket_count":2165,
  "earliest_record_timestamp":1454020569000,
  "latest_record_timestamp":1455318669000,
  "last_data_time":1491952300658,
  "latest_empty_bucket_timestamp":1454541600000,
  "input_record_count":21435
}
----

For more information about these properties, see <<ml-jobstats,Job Stats>>.
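
A response like the one above can be checked programmatically. The following Python sketch is one possible interpretation, not an official client: it assumes that the difference between `input_record_count` and `processed_record_count` indicates records that were received but not processed. The function name `summarize_counts` is hypothetical.

```python
import json


def summarize_counts(response_text):
    """Extract a few health indicators from a post data response body."""
    counts = json.loads(response_text)
    # Assumption: received minus processed approximates skipped records.
    unprocessed = counts["input_record_count"] - counts["processed_record_count"]
    invalid = counts["invalid_date_count"]
    out_of_order = counts["out_of_order_timestamp_count"]
    return {
        "job_id": counts["job_id"],
        "unprocessed_records": unprocessed,
        "invalid_date_count": invalid,
        "out_of_order_timestamp_count": out_of_order,
        "ok": unprocessed == 0 and invalid == 0 and out_of_order == 0,
    }


# Subset of the example response shown above.
example = json.dumps({
    "job_id": "it_ops_new_kpi",
    "processed_record_count": 21435,
    "invalid_date_count": 0,
    "out_of_order_timestamp_count": 0,
    "input_record_count": 21435,
})
summary = summarize_counts(example)
```

For the example response, every record was processed, so the summary reports no skipped, invalid, or out-of-order records.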