//lcawley Verified example output 2017-04-11
[[ml-jobcounts]]
==== Job Counts

The get job statistics API provides information about the operational
progress of a job.

NOTE: Job count values are cumulative for the lifetime of a job. If a model snapshot is reverted
or old results are deleted, the job counts are not reset.

`data_counts`::
(object) An object that describes the number of records processed and any related error counts.
See <<ml-datacounts,data counts objects>>.

`job_id`::
(string) A string that uniquely identifies the job.

`model_size_stats`::
(object) An object that provides information about the size and contents of the model.
See <<ml-modelsizestats,model size stats objects>>.

`state`::
(string) The status of the job, which can be one of the following values:
`open`::: The job is actively receiving and processing data.
`closed`::: The job finished successfully with its model state persisted.
The job is still available to accept further data.
`closing`::: TBD
`failed`::: The job did not finish successfully due to an error.
This situation can occur due to invalid input data. In this case,
sending corrected data to a failed job re-opens the job and
resets it to an open state.

NOTE: If you send data in a periodic cycle and close the job at the end of
each transaction, the job is marked as closed in the intervals between
data submissions. For example, if data is sent every minute and takes
1 second to process, the job has a closed state for 59 seconds of each minute.
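
As a rough illustration of how the `state` values and the periodic-close arithmetic above might be used by a client, consider the following Python sketch. The dictionary layout, the helper names (`describe_job`, `closed_seconds_per_cycle`), and the sample values are assumptions made for this example; they are not part of the API.

```python
# Hypothetical helpers; descriptions are paraphrased from the state list above.
STATE_DESCRIPTIONS = {
    "open": "actively receiving and processing data",
    "closed": "finished successfully; model state persisted",
    "closing": "not yet documented (TBD)",
    "failed": "did not finish successfully due to an error",
}


def describe_job(job_stats):
    """Return a one-line summary for a single job statistics object."""
    state = job_stats["state"]
    description = STATE_DESCRIPTIONS.get(state, "unknown state")
    return f"{job_stats['job_id']}: {state} ({description})"


def closed_seconds_per_cycle(cycle_seconds, processing_seconds):
    """Seconds per cycle that a periodically closed job spends closed."""
    return max(cycle_seconds - processing_seconds, 0)


# Sample values invented for illustration only.
sample = {"job_id": "example-job", "state": "closed"}
summary = describe_job(sample)
idle = closed_seconds_per_cycle(60, 1)  # data every minute, 1 second to process
print(summary)
print(idle)
```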

[float]
[[ml-datacounts]]
===== Data Counts Objects

The `data_counts` object describes the number of records processed
and any related error counts. It has the following properties:

`bucket_count`::
(long) The number of bucket results produced by the job.

`earliest_record_timestamp`::
(string) The timestamp of the earliest chronologically ordered record.
The datetime string is in ISO 8601 format.

`empty_bucket_count`::
() TBD

`input_bytes`::
(long) The number of raw bytes read by the job.

`input_field_count`::
(long) The total number of record fields read by the job. This count includes
fields that are not used in the analysis.

`input_record_count`::
(long) The number of data records read by the job.

`invalid_date_count`::
(long) The number of records with either a missing date field or a date that could not be parsed.

`job_id`::
(string) A string that uniquely identifies the job.

`last_data_time`::
() TBD

`latest_record_timestamp`::
(string) The timestamp of the last chronologically ordered record.
If the records are not in strict chronological order, this value might not be
the same as the timestamp of the last record.
The datetime string is in ISO 8601 format.

`latest_sparse_bucket_timestamp`::
() TBD

`missing_field_count`::
(long) The number of records that are missing a field that the job is configured to analyze.
Records with missing fields are still processed because it is possible that not all fields are missing.
The value of `processed_record_count` includes this count. +
+
--
NOTE: If you are using data feeds or posting data to the job in JSON format, a
high `missing_field_count` does not necessarily indicate a data problem and is
often no cause for concern.

--

`out_of_order_timestamp_count`::
(long) The number of records that are out of time sequence and outside of the latency window.
These records are discarded because jobs require time series data to be in ascending chronological order.

`processed_field_count`::
(long) The total number of fields in all the records that have been processed by the job.
Only fields that are specified in the detector configuration object contribute to this count.
The timestamp is not included in this count.

`processed_record_count`::
(long) The number of records that have been processed by the job.
This value includes records with missing fields, since they are nonetheless analyzed.
+
The following records are not processed:
* Records not in chronological order and outside the latency window
* Records with an invalid timestamp
* Records filtered by an exclude transform

`sparse_bucket_count`::
() TBD
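
The relationships between the counters described above can be sketched in Python as follows. This is only an illustration: it assumes that the difference between `input_record_count` and `processed_record_count` equals the number of rejected records, which the descriptions suggest but do not state outright. The helper name `summarize_data_counts` and the sample numbers are invented for the example.

```python
def summarize_data_counts(data_counts):
    """Derive a few health indicators from a data_counts object."""
    read = data_counts["input_record_count"]
    processed = data_counts["processed_record_count"]

    # Assumption: every record that was read but not processed was rejected.
    not_processed = read - processed

    # Rejections that dedicated counters account for directly.
    rejected_by_time = (
        data_counts["invalid_date_count"]
        + data_counts["out_of_order_timestamp_count"]
    )

    # Share of processed records that lacked at least one analyzed field.
    missing_ratio = (
        data_counts["missing_field_count"] / processed if processed else 0.0
    )

    return {
        "not_processed": not_processed,
        "rejected_by_time": rejected_by_time,
        # Remainder, for example records filtered by an exclude transform.
        "other_not_processed": not_processed - rejected_by_time,
        "missing_field_ratio": missing_ratio,
    }


# Sample values invented for illustration only.
sample_counts = {
    "input_record_count": 1000,
    "processed_record_count": 990,
    "invalid_date_count": 4,
    "out_of_order_timestamp_count": 5,
    "missing_field_count": 99,
}
report = summarize_data_counts(sample_counts)
print(report)
```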

[float]
[[ml-modelsizestats]]
===== Model Size Stats Objects

The `model_size_stats` object has the following properties:

`bucket_allocation_failures_count`::
() TBD

`job_id`::
(string) A string that uniquely identifies the job.

`log_time`::
() TBD

`memory_status`::
(string) The status of the mathematical models. This property can have one of the following values:
`ok`::: The models stayed below the configured memory limit.
`soft_limit`::: The models used more than 60% of the configured memory limit, and older unused models will be pruned to free up space.
`hard_limit`::: The models used more space than the configured memory limit. As a result, not all incoming data was processed.

`model_bytes`::
(long) The number of bytes of memory used by the models. This is the maximum value since the
last time the model was persisted. If the job is closed, this value indicates the latest size.

`result_type`::
TBD

`total_by_field_count`::
(long) The number of `by` field values that were analyzed by the models.

NOTE: The `by` field values are counted separately for each detector and partition.

`total_over_field_count`::
(long) The number of `over` field values that were analyzed by the models.

NOTE: The `over` field values are counted separately for each detector and partition.

`total_partition_field_count`::
(long) The number of `partition` field values that were analyzed by the models.

`timestamp`::
TBD
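
One possible way to act on `memory_status` is sketched below in Python. The function name `check_memory_status`, the returned tuple shape, and the sample object are assumptions made for this example; only the three status values and their meanings come from the property list above.

```python
def check_memory_status(model_size_stats):
    """Return (needs_attention, message) for a model_size_stats object."""
    status = model_size_stats["memory_status"]
    model_bytes = model_size_stats["model_bytes"]

    if status == "ok":
        return False, f"models use {model_bytes} bytes, below the configured limit"
    if status == "soft_limit":
        # More than 60% of the limit is in use; older unused models get pruned.
        return True, f"models use {model_bytes} bytes, pruning of unused models expected"
    if status == "hard_limit":
        # The limit was exceeded, so some incoming data was not processed.
        return True, f"models use {model_bytes} bytes, not all incoming data was processed"
    raise ValueError(f"unexpected memory_status: {status!r}")


# Sample values invented for illustration only.
sample_stats = {
    "job_id": "example-job",
    "memory_status": "hard_limit",
    "model_bytes": 1048576,
}
needs_attention, message = check_memory_status(sample_stats)
print(needs_attention, message)
```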