[DOCS] Added examples for all ML job APIs (elastic/x-pack-elasticsearch#980)

Original commit: elastic/x-pack-elasticsearch@7911b53af9
This commit is contained in:
Lisa Cawley 2017-04-06 07:56:46 -07:00 committed by lcawley
parent 5585283216
commit e339cf82df
12 changed files with 139 additions and 59 deletions


@ -19,16 +19,16 @@ The main {ml} resources can be accessed with a variety of endpoints:
[[ml-api-jobs]]
=== /anomaly_detectors/
* <<ml-put-job,POST /anomaly_detectors>>: Create job
* <<ml-put-job,POST /anomaly_detectors>>: Create a job
* <<ml-open-job,POST /anomaly_detectors/<job_id>/_open>>: Open a job
* <<ml-post-data,POST anomaly_detectors/<job_id+++>+++>>: Send data to a job
* <<ml-post-data,POST anomaly_detectors/<job_id>/_data>>: Send data to a job
* <<ml-get-job,GET /anomaly_detectors>>: List jobs
* <<ml-get-job,GET /anomaly_detectors/<job_id+++>+++>>: Get job details
* <<ml-get-job-stats,GET /anomaly_detectors/<job_id>/_stats>>: Get job statistics
* <<ml-update-job,POST /anomaly_detectors/<job_id>/_update>>: Update certain properties of the job configuration
* <<ml-flush-job,POST anomaly_detectors/<job_id>/_flush>>: Force a job to analyze buffered data
* <<ml-close-job,POST /anomaly_detectors/<job_id>/_close>>: Close a job
* <<ml-delete-job,DELETE /anomaly_detectors/<job_id+++>+++>>: Delete job
* <<ml-delete-job,DELETE /anomaly_detectors/<job_id+++>+++>>: Delete a job
[float]
[[ml-api-datafeeds]]


@ -24,6 +24,20 @@ include::ml/update-datafeed.asciidoc[]
[[ml-api-job-endpoint]]
=== Jobs
You can use APIs to perform the following activities:
* <<ml-close-job,Close jobs>>
* <<ml-put-job,Create jobs>>
* <<ml-delete-job,Delete jobs>>
* <<ml-flush-job,Flush jobs>>
* <<ml-get-job,Get job details>>
* <<ml-get-job-stats,Get job statistics>>
* <<ml-open-job,Open jobs>>
* <<ml-post-data,Post data to jobs>>
* <<ml-update-job,Update jobs>>
* <<ml-valid-detector,Validate detectors>>
* <<ml-valid-job,Validate jobs>>
include::ml/close-job.asciidoc[]
include::ml/put-job.asciidoc[]
include::ml/delete-job.asciidoc[]
@ -33,8 +47,8 @@ include::ml/flush-job.asciidoc[]
include::ml/open-job.asciidoc[]
include::ml/post-data.asciidoc[]
include::ml/update-job.asciidoc[]
include::ml/validate-job.asciidoc[]
include::ml/validate-detector.asciidoc[]
include::ml/validate-job.asciidoc[]
[[ml-api-snapshot-endpoint]]
=== Model Snapshots


@ -32,7 +32,8 @@ data and analysis operations, however you can still explore and navigate results
===== Query Parameters
`close_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has closed
(+time+) Controls the time to wait until a job has closed.
The default value is 30 minutes.
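For example, the default timeout can be overridden on the request; this is a sketch in which the `farequote` job ID and the two-minute value are illustrative:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/farequote/_close?close_timeout=2m
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]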
////
===== Responses


@ -47,3 +47,23 @@ A close operation additionally prunes and persists the model state to disk and t
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
===== Examples
The following example flushes the `farequote` job:
[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/farequote/_flush
{
"calc_interim": true
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]
When the operation succeeds, you receive the following results:
----
{
"flushed": true
}
----


@ -5,9 +5,9 @@ The get data feed statistics API allows you to retrieve usage information for da
===== Request
`GET _xpack/datafeeds/_stats` +
`GET _xpack/ml/datafeeds/_stats` +
`GET _xpack/datafeeds/<feed_id>/_stats`
`GET _xpack/ml/datafeeds/<feed_id>/_stats`
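For example, to retrieve usage information for a single datafeed, based on the request forms above (the `datafeed-farequote` ID is illustrative):

[source,js]
--------------------------------------------------
GET _xpack/ml/datafeeds/datafeed-farequote/_stats
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]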
////
===== Description


@ -5,9 +5,9 @@ The get jobs API allows you to retrieve usage information for jobs.
===== Request
`GET _xpack/anomaly_detectors/_stats` +
`GET _xpack/ml/anomaly_detectors/_stats` +
`GET _xpack/anomaly_detectors/<job_id>/_stats`
`GET _xpack/ml/anomaly_detectors/<job_id>/_stats`
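For example, the first request form above retrieves usage information for all jobs:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/_stats
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]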
////
===== Description
@ -38,12 +38,15 @@ The API returns the following usage information:
`state`::
(+string+) The status of the job, which can be one of the following values:
running:: The job is actively receiving and processing data.
closed:: The job finished successfully with its model state persisted.
The job is still available to accept further data. NOTE: If you send data in a periodic cycle
and close the job at the end of each transaction, the job is marked as closed in the intervals
between when data is sent. For example, if data is sent every minute and it takes 1 second to process,
the job has a closed state for 59 seconds.
The job is still available to accept further data.
NOTE: If you send data in a periodic cycle and close the job at the end of each transaction,
the job is marked as closed in the intervals between when data is sent.
For example, if data is sent every minute and it takes 1 second to process, the job has a closed state for 59 seconds.
failed:: The job did not finish successfully due to an error. NOTE: This can occur due to invalid input data.
In this case, sending corrected data to a failed job re-opens the job and resets it to a running state.


@ -214,7 +214,7 @@ A data description object has the following properties:
The value `epoch_ms` indicates that time is measured in milliseconds since the epoch.
The `epoch` and `epoch_ms` time formats accept either integer or real values. +
NOTE: Custom patterns must conform to the Java `DateTimeFormatter` class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: `yyyy-MM-ddTHH:mm:ssX`. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.
NOTE: Custom patterns must conform to the Java `DateTimeFormatter` class. When you use date-time formatting patterns, it is recommended that you provide the full date, time and time zone. For example: `yyyy-MM-dd'T'HH:mm:ssX`. If the pattern that you specify is not sufficient to produce a complete timestamp, job creation fails.
`quotecharacter`::
TBD
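As a sketch of the custom-pattern guidance above, a `data_description` object that supplies the full date, time, and time zone might look like this (the `time` field name is illustrative):

[source,js]
--------------------------------------------------
  "data_description" : {
    "time_field" : "time",
    "time_format" : "yyyy-MM-dd'T'HH:mm:ssX"
  }
--------------------------------------------------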


@ -25,10 +25,11 @@ The job is ready to resume its analysis from where it left off, once new data is
===== Request Body
`open_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has opened
(+time+) Controls the time to wait until a job has opened.
The default value is 30 minutes.
`ignore_downtime`::
(+boolean+; default: ++true++) If true (default), any gap in data since it was
(+boolean+) If true (default), any gap in data since it was
last closed is treated as a maintenance window. That is to say, it is not an anomaly
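The body properties above can be combined into a request like the following sketch, in which the `farequote` job ID and the timeout value are illustrative:

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/farequote/_open
{
  "open_timeout": "35m",
  "ignore_downtime": false
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]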
////


@ -6,7 +6,7 @@ The job must have been opened prior to sending data.
===== Request
`POST _xpack/ml/anomaly_detectors/<job_id> --data-binary @{data-file.json}`
`POST _xpack/ml/anomaly_detectors/<job_id>/_data --data-binary @<data-file.json>`
===== Description
@ -50,7 +50,36 @@ IMPORTANT: Data can only be accepted from a single connection.
////
===== Examples
The following example posts data from the farequote.json file to the `farequote` job:
[source,js]
--------------------------------------------------
$ curl -s -XPOST localhost:9200/_xpack/ml/anomaly_detectors/my_analysis --data-binary @data-file.json
$ curl -s -H "Content-Type: application/json" \
-X POST http://localhost:9200/_xpack/ml/anomaly_detectors/farequote/_data --data-binary @farequote.json
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]
When the data is sent, you receive information about the operational progress of the job.
For example:
----
{
"job_id":"farequote",
"processed_record_count":86275,
"processed_field_count":172550,
"input_bytes":8678202,
"input_field_count":258825,
"invalid_date_count":0,
"missing_field_count":0,
"out_of_order_timestamp_count":0,
"empty_bucket_count":0,
"sparse_bucket_count":0,
"bucket_count":1440,
"earliest_record_timestamp":1454803200000,
"latest_record_timestamp":1455235196000,
"last_data_time":1491436182038,
"input_record_count":86275
}
----
For more information about these properties, see <<ml-jobcounts,Job Counts>>.


@ -27,13 +27,12 @@ The following properties can be updated after the job is created:
See <<ml-analysisconfig, analysis configuration objects>>. In particular, the following properties can be updated: `categorization_filters`, `detector_description`, TBD.
`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.
[NOTE]
* You can update the `analysis_limits` only while the job is closed.
* The `model_memory_limit` property value cannot be decreased.
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
increasing the `model_memory_limit` is not recommended.
(+object+) Specifies runtime limits for the job.
See <<ml-apilimits,analysis limits>>.
+
[NOTE]
====
* You can update the `analysis_limits` only while the job is closed.
* The `model_memory_limit` property value cannot be decreased.
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
increasing the `model_memory_limit` is not recommended.
====
`description`::
(+string+) An optional description of the job.
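For example, a sketch of an update that changes only the job description (the job ID and description text are illustrative):

[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/farequote/_update
{
  "description": "Unusual response times by airlines"
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]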


@ -1,7 +1,7 @@
[[ml-valid-detector]]
==== Validate Detectors
TBD
The validate detectors API validates detector configuration information.
===== Request
@ -16,19 +16,14 @@ TBD
`job_id` (required)::
(+string+) Identifier for the job
////
===== Request Body
TBD
For a list of the properties that you can specify in the body of this API,
see <<ml-detectorconfig,detector configuration objects>>.
////
`open_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has opened
`ignore_downtime`::
(+boolean+; default: ++true++) If true (default), any gap in data since it was
last closed is treated as a maintenance window. That is to say, it is not an anomaly
===== Responses
200
@ -37,25 +32,26 @@ TBD
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
===== Examples
The following example opens the `event_rate` job:
The following example validates detector configuration information:
[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_open
POST _xpack/ml/anomaly_detectors/_validate/detector
{
"ignore_downtime":false
"function":"metric",
"field_name":"responsetime",
"by_field_name":"airline"
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]
When the job opens, you receive the following results:
When the validation completes, you receive the following results:
----
{
"opened": true
"acknowledged": true
}
----
////


@ -1,7 +1,7 @@
[[ml-valid-job]]
==== Validate Jobs
TBD
The validate jobs API validates job configuration information.
===== Request
@ -19,16 +19,21 @@ TBD
////
===== Request Body
TBD
`description`::
(+string+) An optional description of the job.
`analysis_config`::
(+object+) The analysis configuration, which specifies how to analyze the data.
See <<ml-analysisconfig, analysis configuration objects>>.
`data_description`::
(+object+) Describes the format of the input data.
See <<ml-datadescription,data description objects>>.
`analysis_limits`::
Optionally specifies runtime limits for the job. See <<ml-apilimits,analysis limits>>.
////
`open_timeout`::
(+time+; default: ++30 min++) Controls the time to wait until a job has opened
`ignore_downtime`::
(+boolean+; default: ++true++) If true (default), any gap in data since it was
last closed is treated as a maintenance window. That is to say, it is not an anomaly
===== Responses
200
@ -37,25 +42,37 @@ TBD
(BasicFailedReply) The cluster specified by {cluster_id} cannot be found (code: clusters.cluster_not_found)
412
(BasicFailedReply) The Elasticsearch cluster has not been shutdown yet (code: clusters.cluster_plan_state_error)
////
===== Examples
The following example opens the `event_rate` job:
The following example validates job configuration information:
[source,js]
--------------------------------------------------
POST _xpack/ml/anomaly_detectors/event_rate/_open
POST _xpack/ml/anomaly_detectors/_validate
{
"ignore_downtime":false
"description" : "Unusual response times by airlines",
"analysis_config" : {
"bucket_span": "300S",
"detectors" :[
{
"function":"metric",
"field_name":"responsetime",
"by_field_name":"airline"}],
"influencers" : [ "airline" ]
},
"data_description" : {
"time_field":"time",
"time_format":"yyyy-MM-dd'T'HH:mm:ssX"
}
}
--------------------------------------------------
// CONSOLE
// TEST[skip:todo]
When the job opens, you receive the following results:
When the validation is complete, you receive the following results:
----
{
"opened": true
"acknowledged": true
}
----
////