[role="xpack"]
[testenv="platinum"]
[[ml-post-data]]
=== Post data to jobs API
++++
<titleabbrev>Post data to jobs</titleabbrev>
++++

Sends data to an anomaly detection job for analysis.

[[ml-post-data-request]]
==== {api-request-title}

`POST _ml/anomaly_detectors/<job_id>/_data`

[[ml-post-data-prereqs]]
==== {api-prereq-title}

* If the {es} {security-features} are enabled, you must have `manage_ml` or
`manage` cluster privileges to use this API. See
{stack-ov}/security-privileges.html[Security privileges].

[[ml-post-data-desc]]
==== {api-description-title}

The job must have a state of `open` to receive and process the data.

The data that you send to the job must use the JSON format. You can send
multiple JSON documents, either adjacent with no separator between them or
separated by whitespace. Newline-delimited JSON (NDJSON) is one such
whitespace-separated format; when you use it, set the `Content-Type` header to
`application/x-ndjson`.

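As a minimal sketch of the NDJSON format described above, the following Python
snippet serializes a list of records into a newline-delimited payload. The
record field names (`timestamp`, `responsetime`) and the helper name
`to_ndjson` are hypothetical and chosen only for illustration; use the fields
that your job's data description expects.

```python
import json


def to_ndjson(records):
    """Serialize an iterable of dicts as newline-delimited JSON bytes."""
    lines = (json.dumps(rec, separators=(",", ":")) + "\n" for rec in records)
    return "".join(lines).encode("utf-8")


# Hypothetical records; the field names are assumptions for this example.
records = [
    {"timestamp": 1454020569000, "responsetime": 132.2},
    {"timestamp": 1454020629000, "responsetime": 140.7},
]

payload = to_ndjson(records)

# Header to send alongside an NDJSON payload.
headers = {"Content-Type": "application/x-ndjson"}
```

Each document ends with a newline, so the documents are separated only by
whitespace, as the request body requires.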

Upload sizes are limited to the Elasticsearch HTTP receive buffer size
(default 100 MB). If your data is larger, split it into multiple chunks
and upload each one separately in sequential time order. When running in
real time, it is generally recommended that you perform many small uploads,
rather than queueing data to upload larger files.

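The chunking advice above could be sketched as follows. This is an
illustrative helper, not part of any client library: the function name
`chunk_ndjson`, the `time_field` parameter, and the `timestamp` field are
assumptions. It sorts records by time, then groups them into NDJSON chunks
that each stay within a caller-supplied byte limit.

```python
import json


def chunk_ndjson(records, max_bytes, time_field="timestamp"):
    """Yield NDJSON byte chunks of at most max_bytes, in time order."""
    ordered = sorted(records, key=lambda rec: rec[time_field])
    chunk, size = [], 0
    for rec in ordered:
        line = (json.dumps(rec) + "\n").encode("utf-8")
        if len(line) > max_bytes:
            raise ValueError("a single record exceeds max_bytes")
        # Start a new chunk when adding this line would exceed the limit.
        if chunk and size + len(line) > max_bytes:
            yield b"".join(chunk)
            chunk, size = [], 0
        chunk.append(line)
        size += len(line)
    if chunk:
        yield b"".join(chunk)


# Hypothetical, deliberately unordered records and a tiny limit for demonstration.
sample = [{"timestamp": 3}, {"timestamp": 1}, {"timestamp": 2}]
chunks = list(chunk_ndjson(sample, max_bytes=40))
```

Each chunk would then be posted in the order it is yielded, so that the job
receives the data in sequential time order.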

When uploading data, check the <<ml-datacounts,job data counts>> for progress.
The following records are not processed:

* Records not in chronological order and outside the latency window
* Records with an invalid timestamp

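One way to monitor skipped records is to inspect the data counts after each
upload. The sketch below assumes that the `invalid_date_count` and
`out_of_order_timestamp_count` fields, as they appear in this API's response,
correspond to the two categories listed above. The helper name
`skipped_record_summary` and the sample values are hypothetical.

```python
def skipped_record_summary(counts):
    """Summarize records the job did not process, based on its data counts."""
    invalid = counts.get("invalid_date_count", 0)
    out_of_order = counts.get("out_of_order_timestamp_count", 0)
    return {
        "invalid_date_count": invalid,
        "out_of_order_timestamp_count": out_of_order,
        "skipped_total": invalid + out_of_order,
    }


# Hypothetical counts, for illustration only.
sample_counts = {
    "input_record_count": 1000,
    "invalid_date_count": 3,
    "out_of_order_timestamp_count": 7,
}
summary = skipped_record_summary(sample_counts)
```

A non-zero `skipped_total` suggests checking the timestamp format of the input
or the job's latency setting.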

//TBD link to Working with Out of Order timeseries concept doc

IMPORTANT: For each job, data can only be accepted from a single connection at
a time. It is not currently possible to post data to multiple jobs using wildcards
or a comma-separated list.

[[ml-post-data-path-parms]]
==== {api-path-parms-title}

`<job_id>`::
(Required, string) Identifier for the job.

[[ml-post-data-query-parms]]
==== {api-query-parms-title}

`reset_start`::
(Optional, string) Specifies the start of the bucket resetting range.

`reset_end`::
(Optional, string) Specifies the end of the bucket resetting range.

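To illustrate how the path and query parameters fit together, the following
sketch builds the request URL. The helper name `post_data_url` and the base URL
are assumptions, and the `reset_start` and `reset_end` values in the usage
lines are placeholders only; this page does not specify the accepted time
format.

```python
from urllib.parse import quote, urlencode


def post_data_url(base_url, job_id, reset_start=None, reset_end=None):
    """Build the post data URL, adding only the query parameters that are set."""
    path = f"{base_url.rstrip('/')}/_ml/anomaly_detectors/{quote(job_id, safe='')}/_data"
    params = {
        name: value
        for name, value in (("reset_start", reset_start), ("reset_end", reset_end))
        if value is not None
    }
    return f"{path}?{urlencode(params)}" if params else path


plain_url = post_data_url("http://localhost:9200", "it_ops_new_kpi")

# Placeholder values; the accepted time format is an assumption here.
reset_url = post_data_url(
    "http://localhost:9200",
    "it_ops_new_kpi",
    reset_start="1454020569000",
    reset_end="1454024169000",
)
```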

[[ml-post-data-request-body]]
==== {api-request-body-title}

A sequence of one or more JSON documents containing the data to be analyzed.
Only whitespace characters are permitted in between the documents.

[[ml-post-data-example]]
==== {api-examples-title}

The following example posts data from the `it_ops_new_kpi.json` file to the
`it_ops_new_kpi` job:

[source,js]
--------------------------------------------------
$ curl -s -H "Content-type: application/json" \
  -X POST http://localhost:9200/_ml/anomaly_detectors/it_ops_new_kpi/_data \
  --data-binary @it_ops_new_kpi.json
--------------------------------------------------

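For readers who prefer a script over `curl`, a roughly equivalent request can
be prepared with the Python standard library. This is a sketch under the same
assumptions as the `curl` example (an unsecured cluster on `localhost:9200`).
The helper name `build_post_data_request` and the inline payload are
hypothetical, and the request is only constructed here, not sent.

```python
import urllib.request


def build_post_data_request(job_id, payload, base_url="http://localhost:9200"):
    """Prepare (but do not send) a POST request carrying JSON data for a job."""
    url = f"{base_url}/_ml/anomaly_detectors/{job_id}/_data"
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical payload; in practice, read the bytes from it_ops_new_kpi.json.
req = build_post_data_request("it_ops_new_kpi", b'{"timestamp": 1454020569000}\n')

# To send it against a running cluster:
#     with urllib.request.urlopen(req) as resp:
#         body = resp.read()
```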

When the data is sent, you receive information about the operational progress of
the job. For example:

[source,js]
----
{
  "job_id":"it_ops_new_kpi",
  "processed_record_count":21435,
  "processed_field_count":64305,
  "input_bytes":2589063,
  "input_field_count":85740,
  "invalid_date_count":0,
  "missing_field_count":0,
  "out_of_order_timestamp_count":0,
  "empty_bucket_count":16,
  "sparse_bucket_count":0,
  "bucket_count":2165,
  "earliest_record_timestamp":1454020569000,
  "latest_record_timestamp":1455318669000,
  "last_data_time":1491952300658,
  "latest_empty_bucket_timestamp":1454541600000,
  "input_record_count":21435
}
----

For more information about these properties, see <<ml-jobstats,Job Stats>>.
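
As a small illustration of reading the response shown above, the following
sketch converts the record timestamps to UTC datetimes and checks whether every
input record was processed. It assumes the timestamp values are milliseconds
since the Unix epoch, which their magnitude suggests; the helper name
`ms_to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def ms_to_utc(ms):
    """Convert epoch milliseconds (assumed unit) to a timezone-aware datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Subset of the example response above.
response = {
    "processed_record_count": 21435,
    "input_record_count": 21435,
    "earliest_record_timestamp": 1454020569000,
    "latest_record_timestamp": 1455318669000,
}

earliest = ms_to_utc(response["earliest_record_timestamp"])
latest = ms_to_utc(response["latest_record_timestamp"])
covered = latest - earliest  # time span covered by the uploaded records

all_processed = response["processed_record_count"] == response["input_record_count"]
```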