//lcawley Verified example output 2017-04-11
[[ml-jobstats]]
==== Job Stats
The get job statistics API provides information about the operational
progress of a job.
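
As a sketch, and assuming the `_xpack/ml` URL prefix used by the 5.x {es} X-Pack endpoints, the statistics for a job might be retrieved with a request such as:

[source,js]
--------------------------------------------------
GET _xpack/ml/anomaly_detectors/<job_id>/_stats
--------------------------------------------------

The response contains the properties described below.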
`assignment_explanation`::
(string) For open jobs only, contains messages relating to the selection of an executing node.
`data_counts`::
(object) An object that describes the number of records processed and any related error counts.
See <<ml-datacounts,data counts objects>>.
`job_id`::
(string) A numerical character string that uniquely identifies the job.
`model_size_stats`::
(object) An object that provides information about the size and contents of the model.
See <<ml-modelsizestats,model size stats objects>>.
`node`::
(object) For open jobs only, contains information about the executing node.
See <<ml-stats-node,node object>>.
`open_time`::
(string) For open jobs only, the elapsed time for which the job has been open.
For example, `28746386s`.
`state`::
(string) The status of the job, which can be one of the following values:
`open`::: The job is available to receive and process data.
`closed`::: The job finished successfully with its model state persisted.
The job must be opened before it can accept further data.
`closing`::: The job close action is in progress and has not yet completed.
A closing job cannot accept further data.
`failed`::: The job did not finish successfully due to an error.
This situation can occur due to invalid input data.
If the job has failed irrevocably, it must be force closed and then deleted.
If the datafeed can be corrected, the job can be closed and then re-opened.
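
The states above lend themselves to simple client-side handling. The following is an illustrative sketch only; the helper name and the suggested actions are invented examples, not part of the API:

```python
# Illustrative sketch: act on the documented job states.
# The mapping of state to action is an invented example.
def describe_state(state):
    actions = {
        "open": "job accepts and processes data",
        "closed": "reopen the job before sending further data",
        "closing": "wait; the job cannot accept data while closing",
        "failed": "inspect the error; force close and delete, or fix and reopen",
    }
    if state not in actions:
        raise ValueError("unknown job state: %r" % state)
    return actions[state]

print(describe_state("closed"))  # reopen the job before sending further data
```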
[float]
[[ml-datacounts]]
===== Data Counts Objects
The `data_counts` object describes the number of records processed
and any related error counts.
The `data_count` values are cumulative for the lifetime of a job. If a model snapshot is reverted
or old results are deleted, the job counts are not reset.
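
Because the counts are cumulative, a client can derive simple data-quality signals from them. A hedged Python sketch; the field names follow the descriptions below, and the sample values are invented:

```python
# Illustrative sketch: derive data-quality signals from a `data_counts`
# object. Field names follow this section; the values are invented.
data_counts = {
    "processed_record_count": 86000,
    "invalid_date_count": 12,
    "missing_field_count": 340,
    "out_of_order_timestamp_count": 8,
    "bucket_count": 1440,
    "empty_bucket_count": 360,
}

# Share of buckets with no data; a high ratio suggests a longer
# bucket_span or gap-tolerant functions such as non_zero_count.
empty_bucket_ratio = (data_counts["empty_bucket_count"]
                      / data_counts["bucket_count"])

# Records that were read but discarded for date or ordering problems.
discarded = (data_counts["invalid_date_count"]
             + data_counts["out_of_order_timestamp_count"])

print("empty bucket ratio: %.2f" % empty_bucket_ratio)  # 0.25
print("discarded records: %d" % discarded)              # 20
```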
`bucket_count`::
(long) The number of bucket results produced by the job.
`earliest_record_timestamp`::
(string) The timestamp of the earliest chronologically ordered record.
The datetime string is in ISO 8601 format.
`empty_bucket_count`::
(long) The number of buckets that did not contain any data. If your data contains many
empty buckets, consider increasing your `bucket_span` or using functions that are tolerant
of gaps in data, such as `mean`, `non_null_sum`, or `non_zero_count`.
`input_bytes`::
(long) The number of raw bytes read by the job.
`input_field_count`::
(long) The total number of record fields read by the job. This count includes
fields that are not used in the analysis.
`input_record_count`::
(long) The number of data records read by the job.
`invalid_date_count`::
(long) The number of records with either a missing date field or a date that could not be parsed.
`job_id`::
(string) A numerical character string that uniquely identifies the job.
`last_data_time`::
(datetime) The timestamp at which data was last analyzed, according to server time.
`latest_empty_bucket_timestamp`::
(date) The timestamp of the last bucket that did not contain any data.
`latest_record_timestamp`::
(date) The timestamp of the last processed record.
`latest_sparse_bucket_timestamp`::
(date) The timestamp of the last bucket that was considered sparse.
`missing_field_count`::
(long) The number of records that are missing a field that the job is configured to analyze.
Records with missing fields are still processed because it is possible that not all fields are missing.
The value of `processed_record_count` includes this count. +
+
--
NOTE: If you are using datafeeds or posting data to the job in JSON format, a
high `missing_field_count` is often not an indication of data issues. It is not
necessarily a cause for concern.
--
`out_of_order_timestamp_count`::
(long) The number of records that are out of time sequence and outside of the latency window.
This is only applicable when using the `_data` endpoint.
These records are discarded, since jobs require time series data to be in ascending chronological order.
`processed_field_count`::
(long) The total number of fields in all the records that have been processed by the job.
Only fields that are specified in the detector configuration object contribute to this count.
The timestamp is not included in this count.
`processed_record_count`::
(long) The number of records that have been processed by the job.
This value includes records with missing fields, since they are nonetheless analyzed.
+
When using datafeeds, the `processed_record_count` will differ from the `input_record_count`
if you are using aggregations in your search query.
+
When posting to the `/_data` endpoint, the following records are not processed:
* Records not in chronological order and outside the latency window
* Records with an invalid timestamp
`sparse_bucket_count`::
(long) The number of buckets which contained few data points compared to the expected number
of data points. If your data contains many sparse buckets, consider using a longer `bucket_span`.
[float]
[[ml-modelsizestats]]
===== Model Size Stats Objects
The `model_size_stats` object has the following properties:
`bucket_allocation_failures_count`::
(long) The number of buckets for which new entities in incoming data were not processed because of
insufficient model memory, as signified by a `hard_limit` `memory_status`.
`job_id`::
(string) A numerical character string that uniquely identifies the job.
`log_time`::
(date) The timestamp of the `model_size_stats` according to server time.
`memory_status`::
(string) The status of the mathematical models. This property can have one of the following values:
`ok`::: The models stayed below the configured memory limit.
`soft_limit`::: The models used more than 60% of the configured memory limit; older unused models will be pruned to free up space.
`hard_limit`::: The models used more space than the configured memory limit. As a result, not all incoming data was processed.
`model_bytes`::
(long) The number of bytes of memory used by the models. This is the maximum value since the
last time the model was persisted. If the job is closed, this value indicates the latest size.
`result_type`::
(string) For internal use. The type of result.
`total_by_field_count`::
(long) The number of `by` field values that were analyzed by the models.
+
--
NOTE: The `by` field values are counted separately for each detector and partition.
--
`total_over_field_count`::
(long) The number of `over` field values that were analyzed by the models.
+
--
NOTE: The `over` field values are counted separately for each detector and partition.
--
`total_partition_field_count`::
(long) The number of `partition` field values that were analyzed by the models.
`timestamp`::
(date) The timestamp of the `model_size_stats` according to the timestamp of the data.
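
The `memory_status` values above map naturally onto monitoring severities. A hedged sketch; the severity labels and helper name are invented examples, not part of the API:

```python
# Illustrative sketch: translate the documented `memory_status` values
# into alert severities. The severity names are invented examples.
SEVERITY = {
    "ok": "info",
    "soft_limit": "warning",   # older unused models are being pruned
    "hard_limit": "critical",  # some incoming data was not processed
}

def memory_severity(memory_status):
    try:
        return SEVERITY[memory_status]
    except KeyError:
        raise ValueError("unknown memory_status: %r" % memory_status)

print(memory_severity("hard_limit"))  # critical
```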
[float]
[[ml-stats-node]]
===== Node Objects
The `node` object contains properties of the executing node and is available only for open jobs.
`id`::
(string) The unique identifier of the executing node.
`name`::
(string) The node's name.
`ephemeral_id`::
(string) The ephemeral ID of the node.

`transport_address`::
(string) Host and port where transport HTTP connections are accepted.
`attributes`::
(object) {ml} attributes.
`max_running_jobs`::: The maximum number of concurrently open jobs allowed per node.
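
Reading these node properties out of a stats document might look like the following sketch. The sample values are invented, and `node` is absent for closed jobs:

```python
# Illustrative sketch: read the executing node's details from job stats.
# Sample values are invented; `node` is present only for open jobs.
job_stats = {
    "job_id": "it-ops",
    "state": "open",
    "node": {
        "id": "node-1-id",
        "name": "node-1",
        "transport_address": "127.0.0.1:9300",
        "attributes": {"max_running_jobs": "10"},
    },
}

node = job_stats.get("node")
if node is not None:
    summary = "%s (%s)" % (node["name"], node["transport_address"])
else:
    summary = "job is not open; no executing node"

print(summary)  # node-1 (127.0.0.1:9300)
```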