[DOCS] ML API docs review (elastic/x-pack-elasticsearch#1169)
* [DOCS] Fix for prelertcategory
* [DOCS] _preview returns a page of data
* [DOCS] Added adv options e.g. background_persist_interval
* [DOCS] Clarify meanings of model_snapshot params
* [DOCS] Format fixes
* [DOCS] Include _all keyword
* [DOCS] Explain retain.
* [DOCS] Further explanations for model size limits
* [DOCS] Format fixes in quick ref
* [DOCS] Update for exclude_interim
* [DOCS] Update for exclude_interim
* [DOCS] Update for exclude_interim

Original commit: elastic/x-pack-elasticsearch@cdd2fcefdd
parent
2c2261881d
commit
528ac3d902
|
@ -13,7 +13,7 @@ The main {ml} resources can be accessed with a variety of endpoints:
|
|||
* <<ml-api-jobs,+/anomaly_detectors/+>>: Create and manage {ml} jobs.
|
||||
* <<ml-api-datafeeds,+/datafeeds/+>>: Update data to be analyzed.
|
||||
* <<ml-api-results,+/results/+>>: Access the results of a {ml} job.
|
||||
* <<ml-api-snapshots,+/modelsnapshots/+>>: Manage model snapshots.
|
||||
* <<ml-api-snapshots,+/model_snapshots/+>>: Manage model snapshots.
|
||||
* <<ml-api-validate,+/validate/+>>: Validate subsections of job configurations.
|
||||
|
||||
[float]
|
||||
|
@ -22,7 +22,7 @@ The main {ml} resources can be accessed with a variety of endpoints:
|
|||
|
||||
* <<ml-put-job,POST /anomaly_detectors>>: Create a job
|
||||
* <<ml-open-job,POST /anomaly_detectors/<job_id>/_open>>: Open a job
|
||||
* <<ml-post-data,POST anomaly_detectors/<job_id>/_data>>: Send data to a job
|
||||
* <<ml-post-data,POST /anomaly_detectors/<job_id>/_data>>: Send data to a job
|
||||
* <<ml-get-job,GET /anomaly_detectors>>: List jobs
|
||||
* <<ml-get-job,GET /anomaly_detectors/<job_id+++>+++>>: Get job details
|
||||
* <<ml-get-job-stats,GET /anomaly_detectors/<job_id>/_stats>>: Get job statistics
|
||||
|
@ -35,15 +35,15 @@ The main {ml} resources can be accessed with a variety of endpoints:
|
|||
[[ml-api-datafeeds]]
|
||||
=== /datafeeds/
|
||||
|
||||
* <<ml-put-datafeed,PUT /datafeeds/<datafeedID+++>+++>>: Create a data feed
|
||||
* <<ml-start-datafeed,POST /datafeeds/<feed_id>/_start>>: Start a data feed
|
||||
* <<ml-put-datafeed,PUT /datafeeds/<datafeed_id+++>+++>>: Create a data feed
|
||||
* <<ml-start-datafeed,POST /datafeeds/<datafeed_id>/_start>>: Start a data feed
|
||||
* <<ml-get-datafeed,GET /datafeeds>>: List data feeds
|
||||
* <<ml-get-datafeed,GET /datafeeds/<feed_id+++>+++>>: Get data feed details
|
||||
* <<ml-get-datafeed-stats,GET /datafeeds/<feed_id>/_stats>>: Get statistical information for data feeds
|
||||
* <<ml-preview-datafeed,GET /datafeeds/<feed_id>/_preview>>: Get a preview of a data feed
|
||||
* <<ml-update-datafeed,POST /datafeeds/<feedid>/_update>>: Update certain settings for a data feed
|
||||
* <<ml-stop-datafeed,POST /datafeeds/<feed_id>/_stop>>: Stop a data feed
|
||||
* <<ml-delete-datafeed,DELETE /datafeeds/<feed_id+++>+++>>: Delete data feed
|
||||
* <<ml-get-datafeed,GET /datafeeds/<datafeed_id+++>+++>>: Get data feed details
|
||||
* <<ml-get-datafeed-stats,GET /datafeeds/<datafeed_id>/_stats>>: Get statistical information for data feeds
|
||||
* <<ml-preview-datafeed,GET /datafeeds/<datafeed_id>/_preview>>: Get a preview of a data feed
|
||||
* <<ml-update-datafeed,POST /datafeeds/<datafeedid>/_update>>: Update certain settings for a data feed
|
||||
* <<ml-stop-datafeed,POST /datafeeds/<datafeed_id>/_stop>>: Stop a data feed
|
||||
* <<ml-delete-datafeed,DELETE /datafeeds/<datafeed_id+++>+++>>: Delete data feed
|
||||
|
||||
[float]
|
||||
[[ml-api-results]]
|
||||
|
|
|
@ -64,11 +64,11 @@ progress of a data feed. For example:
|
|||
The node that is running the query?
|
||||
`id`::: TBD. For example, "0-o0tOoRTwKFZifatTWKNw".
|
||||
`name`::: TBD. For example, "0-o0tOo".
|
||||
`ephemeral_id::: TBD. For example, "DOZltLxLS_SzYpW6hQ9hyg".
|
||||
`transport_address::: TBD. For example, "127.0.0.1:9300".
|
||||
`ephemeral_id`::: TBD. For example, "DOZltLxLS_SzYpW6hQ9hyg".
|
||||
`transport_address`::: TBD. For example, "127.0.0.1:9300".
|
||||
`attributes`::: TBD. For example, {"max_running_jobs": "10"}.
|
||||
|
||||
`state`::
|
||||
(string) The status of the data feed, which can be one of the following values: +
|
||||
started::: The data feed is actively receiving data.
|
||||
stopped::: The data feed is stopped and will not receive data until it is re-started.
|
||||
`started`::: The data feed is actively receiving data.
|
||||
`stopped`::: The data feed is stopped and will not receive data until it is re-started.
|
||||
|
|
|
@ -45,8 +45,8 @@ roles provide these privileges. For more information, see
|
|||
`from`::
|
||||
(integer) Skips the specified number of buckets.
|
||||
|
||||
`include_interim`::
|
||||
(boolean) If true, the output includes interim results.
|
||||
`exclude_interim`::
|
||||
(boolean) If true, the output excludes interim results. These are included by default.
|
||||
|
||||
`size`::
|
||||
(integer) Specifies the maximum number of buckets to obtain.
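The paging and interim-result parameters described above can be sketched as a small helper that assembles request parameters. This is only an illustration: the function name `build_results_params` and its default values are assumptions, not part of the API, and the sketch does not send a request.

```python
def build_results_params(skip=0, size=100, exclude_interim=False):
    """Assemble parameters for a results request (hypothetical helper).

    `skip` maps to the documented `from` parameter; the default values
    used here are illustrative assumptions, not documented defaults.
    """
    if skip < 0 or size < 0:
        raise ValueError("from and size must be non-negative")
    params = {"from": skip, "size": size}
    # Interim results are included by default, so the flag is only
    # added when the caller wants them excluded.
    if exclude_interim:
        params["exclude_interim"] = True
    return params


if __name__ == "__main__":
    print(build_results_params(skip=10, size=50, exclude_interim=True))
```

Leaving `exclude_interim` out of the parameters when it is false mirrors the documented behavior that interim results are included unless explicitly excluded.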
|
||||
|
|
|
@ -23,8 +23,7 @@ privileges to use this API. For more information, see <<privileges-list-cluster>
|
|||
|
||||
`feed_id`::
|
||||
(string) Identifier for the data feed.
|
||||
If you do not specify this optional parameter, the API returns information
|
||||
about all data feeds.
|
||||
Wildcards are not supported; however, you can specify `_all` to get information about all data feeds.
|
||||
|
||||
===== Results
|
||||
|
||||
|
|
|
@ -22,8 +22,7 @@ privileges to use this API. For more information, see <<privileges-list-cluster>
|
|||
|
||||
`feed_id`::
|
||||
(string) Identifier for the data feed.
|
||||
If you do not specify this optional parameter, the API returns information
|
||||
about all data feeds.
|
||||
Wildcards are not supported; however, you can specify `_all` or leave this parameter blank to get information about all data feeds.
|
||||
|
||||
===== Results
|
||||
|
||||
|
|
|
@ -34,8 +34,8 @@ roles provide these privileges. For more information, see
|
|||
`from`::
|
||||
(integer) Skips the specified number of influencers.
|
||||
|
||||
`include_interim`::
|
||||
(boolean) If true, the output includes interim results.
|
||||
`exclude_interim`::
|
||||
(boolean) If true, the output excludes interim results. These are included by default.
|
||||
|
||||
`influencer_score`::
|
||||
(double) Returns influencers with anomaly scores higher than this value.
|
||||
|
|
|
@ -19,8 +19,8 @@ privileges to use this API. For more information, see <<privileges-list-cluster>
|
|||
===== Path Parameters
|
||||
|
||||
`job_id`::
|
||||
(string) Identifier for the job. If you do not specify this optional parameter,
|
||||
the API returns information about all jobs.
|
||||
(string) A required identifier for the job.
|
||||
Wildcards are not supported; however, you can specify `_all` to get information about all jobs.
|
||||
|
||||
|
||||
===== Results
|
||||
|
|
|
@ -19,8 +19,8 @@ privileges to use this API. For more information, see <<privileges-list-cluster>
|
|||
===== Path Parameters
|
||||
|
||||
`job_id`::
|
||||
(string) Identifier for the job. If you do not specify this optional parameter,
|
||||
the API returns information about all jobs.
|
||||
(string) Identifier for the job.
|
||||
Wildcards are not supported; however, you can specify `_all` or leave this parameter blank to get information about all jobs.
|
||||
|
||||
===== Results
|
||||
|
||||
|
|
|
@ -33,8 +33,8 @@ roles provide these privileges. For more information, see
|
|||
`from`::
|
||||
(integer) Skips the specified number of records.
|
||||
|
||||
`include_interim`::
|
||||
(boolean) If true, the output includes interim results.
|
||||
`exclude_interim`::
|
||||
(boolean) If true, the output excludes interim results. These are included by default.
|
||||
|
||||
`record_score`::
|
||||
(double) Returns records with anomaly scores higher than this value.
|
||||
|
|
|
@ -12,6 +12,13 @@ A job resource has the following properties:
|
|||
(object) Defines approximate limits on the memory resource requirements for the job.
|
||||
See <<ml-apilimits,analysis limits>>.
|
||||
|
||||
`background_persist_interval`::
|
||||
(time units) Advanced configuration option.
|
||||
The time between each periodic persistence of the model.
|
||||
The default value is a randomized value between 3 and 4 hours, which avoids all jobs persisting at exactly the same time.
|
||||
For very large models (several GB), persistence can take 10-20 minutes, so avoid setting this value too low.
|
||||
The smallest allowed value is 1 hour.
|
||||
|
||||
`create_time`::
|
||||
(string) The time the job was created, in ISO 8601 format.
|
||||
For example, `1491007356077`.
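The rules stated above for `background_persist_interval` (a randomized default between 3 and 4 hours, and a minimum of 1 hour) can be sketched as follows. This is a minimal illustration under stated assumptions: `resolve_persist_interval` is a hypothetical helper, and the use of a uniform distribution for the default is an assumption, since the text only says the value is randomized.

```python
import random
from datetime import timedelta

# Smallest allowed value according to the description above.
MIN_INTERVAL = timedelta(hours=1)


def resolve_persist_interval(configured=None, rng=random):
    """Return the effective persistence interval (hypothetical helper)."""
    if configured is None:
        # Assumption: a uniform draw between 3 and 4 hours, so that
        # jobs do not all persist at exactly the same time.
        return timedelta(seconds=rng.uniform(3 * 3600, 4 * 3600))
    if configured < MIN_INTERVAL:
        raise ValueError("background_persist_interval must be at least 1 hour")
    return configured


if __name__ == "__main__":
    print(resolve_persist_interval())
    print(resolve_persist_interval(timedelta(hours=2)))
```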
|
||||
|
@ -29,7 +36,7 @@ A job resource has the following properties:
|
|||
|
||||
`job_id`::
|
||||
(string) The unique identifier for the job.
|
||||
|
||||
|
||||
`job_type`::
|
||||
(string) Reserved for future use, currently set to `anomaly_detector`.
|
||||
|
||||
|
@ -45,11 +52,22 @@ A job resource has the following properties:
|
|||
(long) The time in days that model snapshots are retained for the job.
|
||||
Older snapshots are deleted. The default value is 1 day.
|
||||
|
||||
`renormalization_window_days`::
|
||||
(long) Advanced configuration option.
|
||||
The period over which adjustments to the score are applied, as new data is seen.
|
||||
The default value is the longer of 30 days or 100 `bucket_spans`.
|
||||
|
||||
`results_index_name`::
|
||||
(string) The name of the index in which to store the {ml} results.
|
||||
The default value is `shared`,
|
||||
which corresponds to the index name `.ml-anomalies-shared`.
|
||||
|
||||
`results_retention_days`::
|
||||
(long) Advanced configuration option.
|
||||
The number of days for which job results are retained.
|
||||
Once per day at 00:30 (server time), results older than this period will be deleted from Elasticsearch.
|
||||
The default value is null, which means that results are retained.
|
||||
|
||||
[[ml-analysisconfig]]
|
||||
===== Analysis Configuration Objects
|
||||
|
||||
|
@ -62,7 +80,7 @@ An analysis configuration object has the following properties:
|
|||
`categorization_field_name`::
|
||||
(string) If not null, the values of the specified field will be categorized.
|
||||
The resulting categories can be used in a detector by setting `by_field_name`,
|
||||
`over_field_name`, or `partition_field_name` to the keyword `prelertcategory`.
|
||||
`over_field_name`, or `partition_field_name` to the keyword `mlcategory`.
|
||||
|
||||
`categorization_filters`::
|
||||
(array of strings) If `categorization_field_name` is specified,
|
||||
|
|
|
@ -6,20 +6,20 @@ The preview data feed API enables you to preview a data feed.
|
|||
|
||||
===== Request
|
||||
|
||||
`GET _xpack/ml/datafeeds/<feed_id>/_preview`
|
||||
`GET _xpack/ml/datafeeds/<datafeed_id>/_preview`
|
||||
|
||||
|
||||
===== Description
|
||||
|
||||
//TBD: How much data does it return?
|
||||
The API returns example data by using the current data feed settings.
|
||||
The API returns the first "page" of results from the `search` created using the current data feed settings.
|
||||
This shows the structure of the data that will be passed to the anomaly detection engine.
|
||||
|
||||
You must have `monitor_ml`, `monitor`, `manage_ml`, or `manage` cluster
|
||||
privileges to use this API. For more information, see <<privileges-list-cluster>>.
|
||||
|
||||
===== Path Parameters
|
||||
|
||||
`feed_id` (required)::
|
||||
`datafeed_id` (required)::
|
||||
(string) Identifier for the data feed.
|
||||
|
||||
////
|
||||
|
@ -41,7 +41,7 @@ TBD
|
|||
////
|
||||
===== Examples
|
||||
|
||||
The following example obtains a previews of the `datafeed-farequote` data feed:
|
||||
The following example obtains a preview of the `datafeed-farequote` data feed:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
|
|
|
@ -2,13 +2,11 @@
|
|||
[[ml-snapshot-resource]]
|
||||
==== Model Snapshot Resources
|
||||
|
||||
////
|
||||
Model snapshots are saved to disk periodically.
|
||||
By default, this is occurs approximately every 3 hours.
|
||||
//TBD: Can you change this setting?
|
||||
By default, this occurs approximately every 3 to 4 hours and is configurable using the `background_persist_interval` setting.
|
||||
|
||||
By default, model snapshots are retained for one day. You can change this
|
||||
behavior with by updating the `model_snapshot_retention_days` for the job.
|
||||
behavior by updating the `model_snapshot_retention_days` for the job.
|
||||
When choosing a new value, consider the following:
|
||||
|
||||
* Persistence enables resilience in the event of a system failure.
|
||||
|
@ -23,30 +21,31 @@ A model snapshot resource has the following properties:
|
|||
(string) An optional description of the job.
|
||||
|
||||
`job_id`::
|
||||
(string) A numerical character string that uniquely identifies the job.
|
||||
(string) A numerical character string that uniquely identifies the job that the snapshot was created for.
|
||||
|
||||
`latest_record_time_stamp`::
|
||||
() TBD. For example: 1455232663000.
|
||||
(date) The timestamp of the latest processed record.
|
||||
|
||||
`latest_result_time_stamp`::
|
||||
() TBD. For example: 1455229800000.
|
||||
(date) The timestamp of the latest bucket result.
|
||||
|
||||
`model_size_stats`::
|
||||
(object) TBD. See <<ml-snapshot-stats,Model Size Statistics>>.
|
||||
(object) Summary information describing the model. See <<ml-snapshot-stats,Model Size Statistics>>.
|
||||
|
||||
`retain`::
|
||||
(boolean) TBD. For example: false.
|
||||
(boolean) If true, this snapshot will not be deleted during automatic cleanup of snapshots older than `model_snapshot_retention_days`.
|
||||
However, this snapshot will be deleted when the job is deleted.
|
||||
The default value is false.
|
||||
|
||||
`snapshot_id`::
|
||||
(string) A numerical character string that uniquely identifies the model
|
||||
snapshot. For example: "1491852978".
|
||||
|
||||
`snapshot_doc_count`::
|
||||
() TBD. For example: 1.
|
||||
(long) For internal use only.
|
||||
|
||||
`timestamp`::
|
||||
(date) The creation timestamp for the snapshot, specified in ISO 8601 format.
|
||||
For example: 1491852978000.
|
||||
(date) The creation timestamp for the snapshot.
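The interaction between `retain` and `model_snapshot_retention_days` described above can be sketched as a small selection function. This is an illustration only: `snapshots_to_delete` is a hypothetical helper, the snapshots are represented as plain dictionaries with `datetime` timestamps for convenience, and the actual cleanup is performed by the server.

```python
from datetime import datetime, timedelta


def snapshots_to_delete(snapshots, now, retention_days=1):
    """Return IDs of snapshots that automatic cleanup would remove.

    A snapshot is kept if it is newer than the retention period or if
    its `retain` flag is true. The default of 1 day follows the text.
    """
    cutoff = now - timedelta(days=retention_days)
    return [
        s["snapshot_id"]
        for s in snapshots
        if s["timestamp"] < cutoff and not s.get("retain", False)
    ]


if __name__ == "__main__":
    now = datetime(2017, 4, 12)
    example = [
        {"snapshot_id": "old", "timestamp": now - timedelta(days=3)},
        {"snapshot_id": "old-retained", "timestamp": now - timedelta(days=3), "retain": True},
        {"snapshot_id": "recent", "timestamp": now - timedelta(hours=2)},
    ]
    print(snapshots_to_delete(example, now))
```

Note that, as stated above, a retained snapshot is still deleted when the job itself is deleted; the sketch covers only the periodic cleanup.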
|
||||
|
||||
[float]
|
||||
[[ml-snapshot-stats]]
|
||||
|
@ -55,31 +54,37 @@ A model snapshot resource has the following properties:
|
|||
The `model_size_stats` object has the following properties:
|
||||
|
||||
`bucket_allocation_failures_count`::
|
||||
() TBD. For example: 0.
|
||||
(long) The number of buckets for which entities were not processed due to memory limit constraints.
|
||||
|
||||
`job_id`::
|
||||
(string) A numerical character string that uniquely identifies the job.
|
||||
|
||||
`log_time`::
|
||||
() TBD. For example: 1491852978000.
|
||||
(date) The time at which the `model_size_stats` were recorded, according to server time.
|
||||
|
||||
`memory_status`::
|
||||
() TBD. For example: "ok".
|
||||
(string) The status of the memory in relation to its `model_memory_limit`.
|
||||
Contains one of the following values.
|
||||
`ok`::: The internal models stayed below the configured value.
|
||||
`soft_limit`::: The internal models require more than 60% of the configured memory limit and more aggressive pruning will
|
||||
be performed in order to try to reclaim space.
|
||||
`hard_limit`::: The internal models require more space than the configured memory limit.
|
||||
Some incoming data could not be processed.
|
||||
|
||||
`model_bytes`::
|
||||
() TBD. For example: 100393.
|
||||
(long) An approximation of the memory resources required for this analysis.
|
||||
|
||||
`result_type`::
|
||||
() TBD. For example: "model_size_stats".
|
||||
(string) Internal. This value is always set to "model_size_stats".
|
||||
|
||||
`timestamp`::
|
||||
() TBD. For example: 1455229800000.
|
||||
(date) The time at which the `model_size_stats` were recorded, according to the bucket timestamp of the data.
|
||||
|
||||
`total_by_field_count`::
|
||||
() TBD. For example: 13.
|
||||
(long) The number of _by_ field values analyzed. Note that these are counted separately for each detector and partition.
|
||||
|
||||
`total_over_field_count`::
|
||||
() TBD. For example: 0.
|
||||
(long) The number of _over_ field values analyzed. Note that these are counted separately for each detector and partition.
|
||||
|
||||
`total_partition_field_count`::
|
||||
() TBD. For example: 2.
|
||||
(long) The number of _partition_ field values analyzed.
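The `memory_status` values described above can be read as thresholds on `model_bytes` relative to the configured `model_memory_limit`. The following is a rough sketch under that reading: `classify_memory_status` is a hypothetical helper, the exact boundary handling is an assumption, and the real status is reported by the analysis engine rather than computed by the client.

```python
def classify_memory_status(model_bytes, model_memory_limit_bytes):
    """Approximate the documented memory_status values (hypothetical helper)."""
    if model_memory_limit_bytes <= 0:
        raise ValueError("model_memory_limit_bytes must be positive")
    if model_bytes > model_memory_limit_bytes:
        # More space is required than the configured limit allows;
        # some incoming data could not be processed.
        return "hard_limit"
    if model_bytes > 0.6 * model_memory_limit_bytes:
        # Above 60% of the limit: more aggressive pruning is performed.
        return "soft_limit"
    return "ok"


if __name__ == "__main__":
    print(classify_memory_status(100393, 1_000_000))
    print(classify_memory_status(700_000, 1_000_000))
    print(classify_memory_status(1_100_000, 1_000_000))
```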
|
||||
|
|
|
@ -13,7 +13,7 @@ The update job API allows you to update certain properties of a job.
|
|||
|
||||
You must have `manage_ml`, or `manage` cluster privileges to use this API.
|
||||
For more information, see <<privileges-list-cluster>>.
|
||||
//TBD: Important:: Updates do not take effect until after then job is closed and new data is sent to it.
|
||||
//TBD: Important:: Updates do not take effect until after the job is closed and re-opened.
|
||||
|
||||
===== Path Parameters
|
||||
|
||||
|
@ -34,7 +34,8 @@ The following properties can be updated after the job is created:
|
|||
* You can update the `analysis_limits` only while the job is closed.
|
||||
* The `model_memory_limit` property value cannot be decreased.
|
||||
* If the `memory_status` property in the `model_size_stats` object has a value of `hard_limit`,
|
||||
increasing the `model_memory_limit` is not recommended.
|
||||
this means that it was unable to process some data. You may wish to re-run this job
|
||||
with an increased `model_memory_limit`.
|
||||
|
||||
`description`::
|
||||
(string) An optional description of the job.
|
||||
|
|
|
@ -11,10 +11,10 @@ The update model snapshot API enables you to update certain properties of a snap
|
|||
|
||||
===== Description
|
||||
|
||||
//TBD. Is the following still true?
|
||||
//TBD. Is the following still true? - not sure but close/open would be the method
|
||||
|
||||
Updates to the configuration are only applied after the job has been closed
|
||||
and new data has been sent to it.
|
||||
and re-opened.
|
||||
|
||||
You must have `manage_ml`, or `manage` cluster privileges to use this API.
|
||||
For more information, see <<privileges-list-cluster>>.
|
||||
|
@ -32,10 +32,12 @@ For more information, see <<privileges-list-cluster>>.
|
|||
The following properties can be updated after the model snapshot is created:
|
||||
|
||||
`description`::
|
||||
(string) An optional description of the model snapshot.
|
||||
(string) An optional description of the model snapshot. For example, "Before Black Friday".
|
||||
|
||||
`retain`::
|
||||
(boolean) TBD.
|
||||
(boolean) If true, this snapshot will not be deleted during automatic cleanup of snapshots older than `model_snapshot_retention_days`.
|
||||
Note that this snapshot will still be deleted when the job is deleted.
|
||||
The default value is false.
|
||||
|
||||
////
|
||||
===== Responses
|
||||
|
|